Near-optimal ground state preparation

Preparing the ground state of a given Hamiltonian and estimating its ground energy are important but computationally hard tasks. However, given some additional information, these problems can be solved efficiently on a quantum computer. We assume that an initial state with non-trivial overlap with the ground state can be efficiently prepared, and that the spectral gap between the ground energy and the first excited energy is bounded from below. With these assumptions we design an algorithm that prepares the ground state when an upper bound on the ground energy is known, whose runtime has a logarithmic dependence on the inverse error. When such an upper bound is not known, we propose a hybrid quantum-classical algorithm to estimate the ground energy, where the dependence of the number of queries to the initial state on the desired precision is exponentially improved compared to the current state-of-the-art algorithm proposed in [Ge et al. 2019]. These two algorithms can then be combined to prepare a ground state without knowing an upper bound on the ground energy. We also prove that our algorithms reach the complexity lower bounds by applying them to the unstructured search problem and the quantum approximate counting problem.


Introduction
Estimating the ground energy and obtaining information on the ground state of a given quantum Hamiltonian are of immense importance in condensed matter physics, quantum chemistry, and quantum information. Classical methods suffer from the exponential growth of the size of the Hilbert space, and therefore quantum computers are expected to be used to overcome this difficulty. However, even for a quantum computer, estimating the ground energy is a hard problem: deciding whether the smallest eigenvalue of a generic local Hamiltonian is greater than $b$ or smaller than $a$ for some $a < b$ is QMA-complete [26,24,35,1].
Therefore, to make the problem efficiently solvable we need more assumptions. We denote the Hamiltonian we are dealing with by $H$, and consider its spectral decomposition $H = \sum_k \lambda_k |\psi_k\rangle\langle\psi_k|$ where $\lambda_k \le \lambda_{k+1}$. The key assumption is that we have an initial state $|\phi_0\rangle$ which can be efficiently prepared by an oracle $U_I$, and which has some overlap with the ground state $|\psi_0\rangle$ lower bounded by $\gamma$. This is a reasonable assumption in many practical scenarios. For instance, even for strongly-correlated molecules in quantum chemistry, there is often a considerable overlap between the true ground state and the Hartree-Fock state. The latter can be trivially prepared in the molecular orbital basis, and efficiently prepared in other bases [27]. For the moment we also assume the spectral gap is bounded from below: $\lambda_1 - \lambda_0 \ge \Delta$.
With these assumptions we can already use phase estimation coupled with amplitude amplification [11] to prepare the ground state, if we further know the ground energy to high precision. To our knowledge, the most comprehensive work on ground state preparation and ground state energy estimation was done by Ge et al. [19], which provided detailed complexity estimates for well-known methods such as phase estimation, and proposed new methods to be discussed below. As analyzed in [19, Appendix A], in order to prepare the ground state to fidelity (see footnote 1) $1 - \epsilon$, the runtime of the controlled-time-evolution of the Hamiltonian is $\widetilde{O}(1/(\gamma^2 \Delta \epsilon))$ (see footnote 2), and the number of queries to $U_I$ is $\widetilde{O}(1/\gamma)$, assuming the spectral norm of $H$ is bounded by a constant. This is however far from optimal. Poulin and Wocjan [38] proposed a method that, by executing the inverse of phase estimation to filter out the unwanted components in the initial state, can prepare a state whose energy is in a certain given range. A different choice of parameters yields a way to prepare the ground state to fidelity $1 - \epsilon$ by running the controlled-time-evolution of the Hamiltonian with $\widetilde{O}(1/(\gamma\Delta) \log(1/\epsilon))$ runtime, and using $\widetilde{O}(1/\gamma)$ queries to $U_I$ [19, Appendix C].
A key difference between ground state preparation and Hamiltonian simulation, where significant progress has been made in recent years [29,8,7,30,32,31,15], is its non-unitary nature. The recent development of linear combination of unitaries (LCU) method [8,13] provided a versatile tool to apply non-unitary operators. Using LCU, Ge et al. proposed a new method to filter the initial state by applying a linear combination of time-evolutions of different time length [19], which achieves the same complexity, up to logarithmic factors, as the modified version of Poulin and Wocjan's method discussed above.
All of the above methods prepare the ground state assuming the ground energy is known to high precision. When the ground energy is unknown, Ge et al. proposed a method to estimate the ground energy using a search method called minimum label finding [19]. This method can estimate the ground energy to precision $h$ by running the controlled-time-evolution of the Hamiltonian for $\widetilde{O}(1/(\gamma h^{3/2}))$ (see footnote 3), and querying $U_I$ $\widetilde{O}(1/(\gamma\sqrt{h}))$ times. It is worth noting that their method requires $h = \widetilde{O}(\Delta)$, and therefore is very expensive when the gap is extremely small. The ground state can then be prepared without a priori knowledge of the ground energy by first estimating the ground energy in this way and then applying the LCU approach.
In recent years several hybrid quantum-classical algorithms have been developed to estimate the ground energy, or to prepare the ground state, or both. The variational quantum eigenvalue solver (VQE) [37] has gained much attention recently because of its low requirement for circuit depth and its variational structure. However the exact complexity of this algorithm is not clear because it relies on a proper choice of ansatz and needs to solve a non-convex optimization problem. Other such algorithms include quantum imaginary-time evolution, quantum Lanczos [33], and quantum filter diagonalization [36,39]. Their complexities are either quasi-polynomial or unknown.
The recent development of block-encoding [8] and quantum signal processing (QSP) [20,30] enables us to apply non-unitary operators, specifically polynomials of a block-encoded matrix, efficiently. QSP uses a minimal number of ancilla qubits, and avoids Hamiltonian simulation. These will be the basic tools of this work, of which we give a brief introduction below.

Footnote 1: In this work, the fidelity between states $|x\rangle$, $|y\rangle$ is defined to be $|\langle x|y\rangle|$.
Footnote 2: In this work the notation $\widetilde{O}(f)$ means $O(f\,\mathrm{poly}\log(f))$ unless otherwise stated.
Footnote 3: In [19], the meaning of the notation $\widetilde{O}(\cdot)$ is different from that in our work. In particular, $\widetilde{O}(\cdot)$ in [19] hides all factors that are poly-logarithmic in $1/h$, $1/\epsilon$, $1/\gamma$, and $1/\Delta$, regardless of what is inside the parentheses. We preserve their notation when citing their results since these factors do not play an important role when comparing the complexities of our methods.

Block-encoding is a powerful tool to represent a non-unitary matrix in a quantum circuit. A matrix $A \in \mathbb{C}^{N\times N}$, where $N = 2^n$, can be encoded in the upper-left corner of an $(m+n)$-qubit unitary matrix $U$ if
$$\left\| A - \alpha\,(\langle 0^m| \otimes I)\,U\,(|0^m\rangle \otimes I) \right\| \le \epsilon. \qquad (1)$$
In this case we say $U$ is an $(\alpha, m, \epsilon)$-block-encoding of $A$. Many matrices of practical interest can be efficiently block-encoded. In particular we will discuss the block-encoding of Hamiltonians of physical systems in Section 7.
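As a small numerical illustration of definition (1), the following sketch builds a one-ancilla block-encoding of a diagonal Hermitian matrix and checks both unitarity and the upper-left block. The function names, the restriction to diagonal matrices, and the specific dilation used are assumptions made for illustration only; they are not taken from the constructions cited in the text.

```python
import math

def block_encode_diagonal(eigenvalues, alpha):
    """Return a 2N x 2N real orthogonal matrix whose upper-left N x N
    block equals diag(eigenvalues) / alpha (one ancilla qubit, m = 1)."""
    n = len(eigenvalues)
    U = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, lam in enumerate(eigenvalues):
        c = lam / alpha                  # requires |lam| <= alpha
        s = math.sqrt(1.0 - c * c)
        U[i][i], U[i][n + i] = c, s      # ancilla |0> rows
        U[n + i][i], U[n + i][n + i] = s, -c  # ancilla |1> rows
    return U

def unitarity_defect(U):
    """Largest entry of |U^T U - I| (U is real in this sketch)."""
    size = len(U)
    worst = 0.0
    for i in range(size):
        for j in range(size):
            dot = sum(U[k][i] * U[k][j] for k in range(size))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst

def encoding_error(U, eigenvalues, alpha):
    """Largest entry of |A - alpha * (upper-left block of U)|, A = diag(eigenvalues)."""
    n = len(eigenvalues)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            target = eigenvalues[i] if i == j else 0.0
            worst = max(worst, abs(target - alpha * U[i][j]))
    return worst
```

Here the ancilla is taken to be the most significant qubit, so the upper-left block corresponds to the ancilla being in $|0\rangle$ before and after $U$; with exact arithmetic this would be an $(\alpha, 1, 0)$-block-encoding.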
Using the block-encoding of a Hermitian $A$, QSP enables us to construct block-encodings for a large class of polynomial eigenvalue transformations of $A$. We pay special attention to even or odd polynomials with real coefficients, because we only apply this type of polynomial eigenvalue transformation in this work. Also for simplicity we assume the block-encoding is done without error. [20, Theorem 31] enables us to perform eigenvalue transformation with polynomials of arbitrary parity, but because of the above discussion, we modify the theorem to the following form, which is easily proven using [20, Corollary 11].
Constructing the quantum circuit for QSP requires computing a sequence of phase factors beforehand, and there are classical algorithms capable of doing this [22]. Some recent progress has been made to efficiently compute phase factors for high-degree polynomials to high precision [17]. In this work we assume the phase factors are computed without error.
Using the tools introduced above, we assume the Hamiltonian $H$ is given in its $(\alpha, m, 0)$-block-encoding $U_H$. This, together with $U_I$, are the two oracles we assume we are given in this work. QSP enables us to filter eigenstates using fewer qubits than LCU. In [28] a filtering method named optimal eigenstate filtering is introduced. It is based on an explicitly constructed optimal minimax polynomial, and achieves the same asymptotic complexity, ignoring poly-logarithmic factors, as the method by Ge et al. when applied to the ground state preparation problem if the ground energy is known exactly.
In this work we first develop a filtering method that filters out all eigenstates corresponding to eigenvalues above a certain threshold. This filtering method enables us to prepare the ground state of a Hamiltonian with spectral gap bounded away from zero when only an upper bound on the ground energy is known, unlike the filtering methods discussed above, which all require either the exact value or a high-precision estimate of the ground energy. Our filtering method has an exponentially improved dependence on precision compared to Kitaev's phase estimation [25] and uses fewer qubits compared to other variants of the phase estimation algorithm [38,19]. This filtering method, applied to the initial state given in our assumption, also enables us to tell whether the ground energy is smaller than $a$ or greater than $b$ for some $b > a$, with high probability. Therefore a binary search yields a ground energy estimate with success probability arbitrarily close to one. We then combine the filtering method and ground energy estimation to prepare the ground state when no non-trivial bound for the ground energy is known. A comparison of the query complexities between the methods in our work and the corresponding ones in [19], which to our best knowledge achieve state-of-the-art query complexities, is shown in Table 1.

Table 1: […] [19]. $\alpha$, $\gamma$, $\Delta$, $\epsilon$ are the same as above and $h$ is the precision of the ground energy estimate. By extra qubits we mean the ancilla qubits that are not part of the block-encoding. In this work the ground energy estimation algorithm and the algorithm to prepare the ground state without a priori bound have success probabilities lower bounded by $1 - \vartheta$, while in [19] the corresponding algorithms have constant success probabilities. The complexities for algorithms by Ge et al. are estimated assuming Hamiltonian simulation is done as in [31]. The usage of the notation $\widetilde{O}$ in [19] is different from that in our work, as explained in footnote 3.

From the query complexities in Table 1 we can see that our method for ground energy estimation achieves an exponential speedup in the dependence of the number of queries to $U_I$ on the precision $h$ of the ground energy estimate, and a speedup by a factor of $1/\sqrt{h}$ in the dependence of the number of queries to $U_H$ on the precision. Moreover, Ge et al. assume in their work that the precision $h = \widetilde{O}(\Delta)$, while we make no such assumption. This gives our algorithm an even greater advantage when the gap is much smaller than the desired precision. This becomes useful in the case of preparing a low-energy state (not necessarily a ground state). Because Ge et al. used a slightly different query assumption, i.e. access to time-evolution rather than block-encoding, when computing the complexities for methods in [19] in Table 1 we assume the Hamiltonian simulation is done with $O(\alpha t)$ queries to $U_H$, and the error is negligible. This can be achieved using the Hamiltonian simulation in [31], and cannot be asymptotically improved because of the complexity lower bound proved in [8]. Therefore the comparison here is fair even though our work makes use of a different oracle. Also, [19] assumed a scaled Hamiltonian $H$ with its spectrum contained in $[0, 1]$. We do not make such an assumption, and therefore the $\alpha$ factor should be properly taken into account, as is done in Table 1.
Organization: The rest of the paper is organized as follows. In Section 2 we use QSP to construct block-encodings of reflectors and projectors associated with eigen-subspaces. In Section 3 we use the projectors to prepare the ground state when an upper bound on the ground energy is given. In Section 4 we introduce the ground energy estimation algorithm, a hybrid quantum-classical algorithm based on binary search, and use it to prepare the ground state when no ground energy upper bound is known a priori. In Section 5 we show that the dependence of our query complexities on the overlap and gap is essentially optimal by considering the unstructured search problem. We also show that the dependence of our ground energy estimation algorithm on the precision is nearly optimal by considering the quantum approximate counting problem. In Section 6 we use our methods to prepare low-energy states when a lower bound for the spectral gap is unknown, or even when the ground state is degenerate. In Section 7 we discuss practical issues and future research directions.

Block-encoding of reflector and projector
A key component in our method is a polynomial approximation of the sign function on the domain $[-1, -\delta] \cup [\delta, 1]$. The error scaling of the best polynomial approximation has been studied in [18], and an explicit construction of a polynomial with the same error scaling is provided in [30] based on an approximation of the erf function. We quote [20, Lemma 14] here with some small modification: […]

[…] [20, Lemma 29] for any $\mu \in \mathbb{R}$ that is not an eigenvalue. Then using QSP, by Theorem 1, we can obtain a $(1, m+2, 0)$-block-encoding of $-S\left(\frac{H - \mu I}{\alpha + |\mu|}; \delta, \epsilon\right)$ for any $\delta$ and $\epsilon$. If we assume further that $\Delta/2 \le \min_k |\mu - \lambda_k|$, then we let $\delta = \frac{\Delta}{4\alpha}$, and by Lemma 2 all the eigenvalues of $-S\left(\frac{H - \mu I}{\alpha + |\mu|}; \delta, \epsilon\right)$ are $\epsilon$-close to either $-1$ or $1$. Therefore $-S\left(\frac{H - \mu I}{\alpha + |\mu|}; \delta, \epsilon\right)$ is $\epsilon$-close, in operator norm, to the reflection operator about the direct sum of eigen-subspaces corresponding to eigenvalues smaller than $\mu$:
$$R_{<\mu} = \sum_{k:\lambda_k < \mu} |\psi_k\rangle\langle\psi_k| - \sum_{k:\lambda_k > \mu} |\psi_k\rangle\langle\psi_k|,$$
and thus the block-encoding is also a $(1, m+2, \epsilon)$-block-encoding of $R_{<\mu}$. We denote this block-encoding by $\mathrm{REF}(\mu, \delta, \epsilon)$. We omit the dependence on $H$ because $H$, as well as its block-encoding, is usually fixed in the rest of the paper.
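To make the role of the gap parameter $\delta$ concrete, the sketch below checks numerically that a rescaled erf function is already $\epsilon$-close to the sign function outside $(-\delta, \delta)$. It illustrates only the erf step mentioned above, not the polynomial itself; the choice $k = \sqrt{\ln(1/\epsilon)}/\delta$ and the function names are assumptions for illustration, justified by the elementary bound $\mathrm{erfc}(x) \le e^{-x^2}$ for $x \ge 0$.

```python
import math

def smoothed_sign(x, delta, eps):
    """erf(k x) with k chosen so that erfc(k * delta) <= eps (needs 0 < eps < 1)."""
    k = math.sqrt(math.log(1.0 / eps)) / delta
    return math.erf(k * x)

def worst_error_outside_gap(delta, eps, samples=2001):
    """Largest deviation from sign(x) on a grid covering [-1, -delta] and [delta, 1]."""
    worst = 0.0
    for i in range(samples):
        x = delta + (1.0 - delta) * i / (samples - 1)
        worst = max(worst,
                    abs(1.0 - smoothed_sign(x, delta, eps)),
                    abs(-1.0 - smoothed_sign(-x, delta, eps)))
    return worst
```

The steepness $k$ grows like $(1/\delta)\sqrt{\log(1/\epsilon)}$, which is consistent with the cost of the filter growing as the gap shrinks and only mildly as the target precision improves.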
Because our goal is to prepare the ground state, we will use the projector more often than the reflector. Now we construct a block-encoding of the projector using $\mathrm{REF}(\mu, \delta, \epsilon)$ by the following circuit […] where H is the Hadamard gate, and we denote this circuit by $\mathrm{PROJ}(\mu, \delta, \epsilon)$. It can be checked that
$$(\langle 0^{m+3}| \otimes I)\,\mathrm{PROJ}(\mu, \delta, \epsilon)\,(|0^{m+3}\rangle \otimes I) = \frac{1}{2}\left(I - S\left(\frac{H - \mu I}{\alpha + |\mu|}; \delta, \epsilon\right)\right) \approx P_{<\mu}, \qquad (2)$$
where $P_{<\mu}$ is the projector onto the direct sum of eigen-subspaces corresponding to eigenvalues smaller than $\mu$. Therefore $\mathrm{PROJ}(\mu, \delta, \epsilon)$ is a $(1, m+3, \epsilon/2)$-block-encoding of $P_{<\mu}$. In fact this can still be seen as an application of linear combination of block-encodings [20, Lemma 29], using the relation $P_{<\mu} = \frac{1}{2}(R_{<\mu} + I)$. We use the following lemma to summarize the results:

Lemma 3 (Reflector and projector). Given a Hermitian matrix $H$ with its $(\alpha, m, 0)$-block-encoding $U_H$, with the guarantee that $\mu \in \mathbb{R}$ is separated from the spectrum of $H$ by a gap of at least $\Delta/2$, we can construct a $(1, m+2, \epsilon)$-block-encoding of $R_{<\mu}$, and a $(1, m+3, \epsilon/2)$-block-encoding of $P_{<\mu}$, […] other one- and two-qubit gates.
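The relation $P_{<\mu} = \frac{1}{2}(R_{<\mu} + I)$ can be checked numerically with a one-ancilla circuit of the form Hadamard, controlled reflection, Hadamard. The sketch below is an illustration under simplifying assumptions: it works in the eigenbasis of $H$, uses the exact reflector rather than its block-encoding $\mathrm{REF}$, and all function names are hypothetical.

```python
import math

def matmul(A, B):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def proj_circuit(eigenvalues, mu):
    """(Had x I) . (|0><0| x I + |1><1| x R) . (Had x I), written in the eigenbasis."""
    n = len(eigenvalues)
    s = 1.0 / math.sqrt(2.0)
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Exact reflector R_{<mu}: +1 on eigenvalues below mu, -1 above.
    refl = [[(1.0 if eigenvalues[i] < mu else -1.0) if i == j else 0.0
             for j in range(n)] for i in range(n)]
    # Hadamard on the ancilla (most significant qubit), identity on the system.
    had = ([[s * x for x in row] + [s * x for x in row] for row in eye] +
           [[s * x for x in row] + [-s * x for x in row] for row in eye])
    # Controlled reflection: block-diag(I, R).
    ctrl = ([row + [0.0] * n for row in eye] +
            [[0.0] * n + row for row in refl])
    return matmul(had, matmul(ctrl, had))

def block(M, row_block, col_block, n):
    """Extract the n x n block at (row_block, col_block) of a 2n x 2n matrix."""
    return [r[col_block * n:(col_block + 1) * n]
            for r in M[row_block * n:(row_block + 1) * n]]
```

In this toy version the block with ancilla outcome 0 is $(I + R_{<\mu})/2 = P_{<\mu}$, and the block with ancilla outcome 1 is $(I - R_{<\mu})/2 = P_{>\mu}$.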
We remark that for the block-encoding $\mathrm{PROJ}(\mu, \delta, \epsilon)$, even a failed application of it can give us potentially useful information. We have […] where $P_{>\mu} = I - P_{<\mu}$ and $|E\rangle$ satisfies $\||E\rangle\| \le \epsilon$. Thus when we apply the block-encoding and measure the first two registers, i.e. the first $m + 3$ qubits, we have probability at least $1 - \frac{\epsilon^2}{2}$ to obtain an outcome with either 0 or 1 followed by $(m+2)$ 0's. In the former case the projection has been successful, and in the latter case we have obtained an approximation of $P_{>\mu}|\phi\rangle$.
If we do not treat the output of 1 followed by m + 2 0's as failure then there is another interpretation of the circuit PROJ(µ, δ, ): this is an approximate projective measurement {P <µ , P >µ }. In fact the whole circuit can be seen as phase estimation on a reflection operator, which needs only one ancilla qubit.

Algorithm with a priori ground energy bound
With the approximate projector developed in the previous section we can readily design an algorithm to prepare the ground state. We assume we have the Hamiltonian $H$ given through its block-encoding as in the last section. We are further given an initial state $|\phi_0\rangle$ prepared by a unitary $U_I$, i.e. $U_I|0^n\rangle = |\phi_0\rangle$, and the promises that for some known $\gamma > 0$, $\mu$, and $\Delta$, we have
(P1) Lower bound for the overlap: $|\langle\phi_0|\psi_0\rangle| \ge \gamma$,
(P2) Bounds for the ground energy and spectral gap: $\lambda_0 \le \mu - \Delta/2 < \mu + \Delta/2 \le \lambda_1$.
Here $\mu$ is an upper bound for the ground energy, $\Delta$ is a lower bound for the spectral gap, and $\gamma$ is a lower bound for the initial overlap.

Now suppose we want to prepare the ground state to precision $\epsilon$. We can use Lemma 3 to build a block-encoding of the projector $P_{<\mu} = |\psi_0\rangle\langle\psi_0|$, and then apply it to $|\phi_0\rangle$, which we can prepare. This gives a state close to $|\psi_0\rangle$. We use fidelity to measure how close we can get. To achieve fidelity $1 - \epsilon$ we use the circuit $\mathrm{PROJ}(\mu, \Delta/4\alpha, \gamma\epsilon)$, and we denote […]; then the resulting fidelity will be […] Here we have used […] This is when we have a successful application of the block-encoding. The success probability is […]

With amplitude amplification [11] we can boost the success probability to $\Omega(1)$ with $O(\frac{1}{\gamma})$ applications of $\mathrm{PROJ}(\mu, \Delta/4\alpha, \gamma\epsilon)$ and its inverse, as well as $O(\frac{m}{\gamma})$ other one- and two-qubit gates. Here we are describing the expected complexity since the procedure succeeds with some constant probability. In amplitude amplification we need to use a reflector similar to the oracle used in Grover's search algorithm [21]. Instead of constructing a reflector from $\mathrm{PROJ}(\mu, \Delta/4\alpha, \gamma\epsilon)$ we can directly use $\mathrm{REF}(\mu, \Delta/4\alpha, \gamma\epsilon)$ constructed in the previous section.
We summarize the results in the following theorem:

Theorem 4 (Ground state preparation with a priori ground energy bound). Suppose […]
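The $O(1/\gamma)$ scaling of amplitude amplification can be illustrated with a small classical simulation. The sketch below is a toy model under stated assumptions: it works in the eigenbasis of $H$ with the ground state at index 0, uses the exact reflection $I - 2P_{<\mu}$ in place of $\mathrm{REF}$, takes the overlap to be exactly $\gamma$, and uses hypothetical function names.

```python
import math

def optimal_rounds(gamma):
    """Number of amplification rounds, floor(pi / (4 * asin(gamma)))."""
    theta = math.asin(gamma)
    return int(math.floor(math.pi / (4.0 * theta)))

def amplify(phi0, rounds):
    """Apply Q = (2|phi0><phi0| - I)(I - 2P) `rounds` times to |phi0>.

    phi0 holds real amplitudes in the eigenbasis; P projects onto index 0.
    """
    state = list(phi0)
    for _ in range(rounds):
        state[0] = -state[0]  # I - 2P flips the sign of the ground-state component
        inner = sum(a * b for a, b in zip(phi0, state))
        state = [2.0 * inner * a - b for a, b in zip(phi0, state)]
    return state

def success_probability(gamma, dim=8):
    """Probability of the ground state after the optimal number of rounds."""
    rest = math.sqrt((1.0 - gamma ** 2) / (dim - 1))
    phi0 = [gamma] + [rest] * (dim - 1)
    final = amplify(phi0, optimal_rounds(gamma))
    return final[0] ** 2
```

After $j$ rounds the ground-state amplitude is $\sin((2j+1)\theta)$ with $\sin\theta = \gamma$, so roughly $\pi/(4\gamma)$ rounds suffice for small $\gamma$.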

Algorithm without a priori ground energy bound
Next we consider the case when we are not given a known $\mu$ to bound the ground energy from above. All other assumptions about $H$ and its eigenvalues and eigenstates are identical to the previous sections. The basic idea is to test different values for $\mu$ and perform a binary search. This leads to a quantum-classical hybrid method that can estimate the ground energy as well as prepare the ground state to high precision. All eigenvalues must be in the interval $[-\alpha, \alpha]$, thus we first partition $[-\alpha, \alpha]$ by grid points $-\alpha = x_0 < x_1 < \ldots < x_G = \alpha$, where $x_{k+1} - x_k = h$ for all $k$. Then we attempt to locate $\lambda_0$ in a small interval between two grid points (not necessarily adjacent, but close) through a binary search. To do a binary search we need to be able to tell whether a given $x_k$ is located to the left or right of $\lambda_0$. Because of the random nature of measurement we can only do so correctly with some probability, and we want to make this probability as close to 1 as possible. This is achieved using a technique we call binary amplitude estimation.
Proof. The proof is essentially identical to the proof for gapped phase estimation in [2,13]. We can perform amplitude estimation up to error $\Delta/4$ with $O(1/\Delta)$ applications of $U$ and $U^\dagger$. This has a success probability of $8/\pi^2$ according to Theorem 12 of [11]. We turn the estimation result into a boolean indicating whether it is larger or smaller than $(\gamma_0 + \gamma_1)/2$. The boolean is correct with probability at least $8/\pi^2$. Then we do a majority voting to boost this probability. The Chernoff bound guarantees that to obtain a $1 - \delta$ probability of getting the correct output we need to repeat $O(\log(1/\delta))$ times. Therefore in total we need to run $U$ and $U^\dagger$ $O((1/\Delta)\log(1/\delta))$ times.
We then apply binary amplitude estimation to the block-encoding of the projector defined in (2), $\mathrm{PROJ}(x_k, h/2\alpha, \epsilon')$, for some precision $\epsilon'$ to be chosen. We denote the amplitude of the "good" component after applying the block-encoding by […] which satisfies the following: […] We can then let $\epsilon' = \gamma/2$, so that the two amplitudes are separated by a gap lower bounded by $\gamma/2$. Therefore we can run the binary amplitude estimation, letting $U$ in Lemma 5 be […], to correctly distinguish the two cases where $\lambda_0 \le x_{k-1}$ and $\lambda_0 \ge x_{k+1}$ with probability $1 - \delta$, by running $\mathrm{PROJ}(x_k, h/2\alpha, \epsilon')$, $U_I$, and their inverses $O((1/\gamma)\log(1/\delta))$ times. The output of the binary amplitude estimation is denoted by $B_k$. We then define $E$ as the event that an error occurs in the final result of binary amplitude estimation when we are computing $B_k$ for some $k$ such that $x_{k+1} < \lambda_0$ or $x_{k-1} > \lambda_0$ in our search process. All future discussion is conditional on $E^c$, meaning that there is no error in binary amplitude estimation for $B_k$ when $x_{k+1} < \lambda_0$ or $x_{k-1} > \lambda_0$. This has a probability that is at least $(1 - \delta)^R$, where $R$ is the number of times binary amplitude estimation is run.
Conditional on $E^c$, almost surely (with probability 1) $B_k = 1$ when $\lambda_0 \le x_{k-1}$ and $B_k = 0$ when $\lambda_0 \ge x_{k+1}$. Therefore $B_k = 0$ tells us $\lambda_0 > x_{k-1}$ and $B_k = 1$ tells us $\lambda_0 < x_{k+1}$. $B_k$ and $B_{k+1}$ combined give us the information shown in Table 2. Using Table 2 we can do the binary search as outlined in Algorithm 1. It is easy to show that the algorithm must terminate in $O(\log_2 G) = O(\log(\alpha/h))$ steps. The output we denote as $L$ and $U$. They satisfy $x_L < \lambda_0 < x_U$ and $U - L \le 3$.
If we want the whole procedure to be successful with probability at least $1 - \vartheta$, then we need $\mathrm{Prob}(E^c) \ge 1 - \vartheta$. Since $\mathrm{Prob}(E^c) \ge (1 - \delta)^R \ge 1 - R\delta$ and $R = O(\log(\alpha/h))$, it suffices to choose $\delta = O(\vartheta/\log(\alpha/h))$. Algorithm 1 enables us to locate $\lambda_0$ within an interval of length at most $3h$. In total we need to run binary amplitude estimation at most $O(\log(\alpha/h))$ times. Each amplitude estimation queries $\mathrm{PROJ}(x_k, h/2\alpha, \epsilon')$ and $U_I$ $O((1/\gamma)\log(1/\delta))$ times, where $\epsilon' = \gamma/2$. Therefore the numbers of queries to $U_H$ and $U_I$ are respectively […] In particular, in the procedure above we did not use (P2) but only used (P1). Therefore we do not need to assume the presence of a gap. The result can be summarized in the following theorem:

Theorem 6 (Ground energy). Suppose we have a Hamiltonian $H = \sum_k \lambda_k |\psi_k\rangle\langle\psi_k| \in \mathbb{C}^{N\times N}$, where $\lambda_k \le \lambda_{k+1}$, given through its $(\alpha, m, 0)$-block-encoding $U_H$. Also suppose we have an initial state $|\phi_0\rangle$ prepared by circuit $U_I$, as well as the promise (P1). Then the ground energy can be estimated to precision $h$ with probability $1 - \vartheta$ with the following costs:
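The majority-voting step in the proof can be made quantitative with an exact binomial tail. The sketch below is illustrative only: it assumes independent runs that are each correct with probability exactly $8/\pi^2$ and an odd number of repetitions, and the function names are hypothetical.

```python
import math

def majority_failure_probability(p_correct, repetitions):
    """Probability that at most half of an odd number of independent runs are correct."""
    q = 1.0 - p_correct
    return sum(math.comb(repetitions, k) * p_correct ** k * q ** (repetitions - k)
               for k in range(0, repetitions // 2 + 1))

def repetitions_needed(delta, p_correct=8.0 / math.pi ** 2):
    """Smallest odd number of runs whose majority vote fails with probability <= delta."""
    r = 1
    while majority_failure_probability(p_correct, r) > delta:
        r += 2
    return r
```

By Hoeffding's inequality the failure probability is at most $\exp(-2r(p - 1/2)^2)$ for $r$ runs, which gives the $O(\log(1/\delta))$ repetition count used above.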
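The search logic can be illustrated with a simplified classical simulation. This is a sketch under stated assumptions, not a reproduction of Algorithm 1: it uses a single comparison $B_k$ per step instead of the pair $(B_k, B_{k+1})$, it replaces binary amplitude estimation by a hypothetical oracle `noisy_comparison` that is correct whenever the promise applies and arbitrary otherwise (i.e. it conditions on $E^c$), and it assumes $2\alpha/h$ is an integer and $-\alpha < \lambda_0 < \alpha$.

```python
import math
import random

def noisy_comparison(k, grid, lam0, rng):
    """B_k: 1 if lam0 <= x_{k-1}, 0 if lam0 >= x_{k+1}, arbitrary in between."""
    if lam0 <= grid[k - 1]:
        return 1
    if lam0 >= grid[k + 1]:
        return 0
    return rng.randint(0, 1)

def locate_ground_energy(alpha, h, lam0, seed=0):
    """Shrink [x_L, x_U] around lam0 until U - L <= 3; returns (x_L, x_U, U - L, steps)."""
    G = round(2.0 * alpha / h)
    grid = [-alpha + k * h for k in range(G + 1)]
    rng = random.Random(seed)
    lo, hi, steps = 0, G, 0
    while hi - lo > 3:
        k = (lo + hi) // 2
        if noisy_comparison(k, grid, lam0, rng) == 1:
            hi = k + 1      # B_k = 1 implies lam0 < x_{k+1}
        else:
            lo = k - 1      # B_k = 0 implies lam0 > x_{k-1}
        steps += 1
    return grid[lo], grid[hi], hi - lo, steps
```

Each step maps the index width $d = U - L$ to at most $\lceil d/2 \rceil + 1$, so the loop ends after about $\log_2 G$ comparisons regardless of how the ambiguous cases are resolved.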

Other one- and two-qubit gates: […]
The extra $O(\log(1/\gamma))$ qubits needed come from amplitude estimation, which uses phase estimation. If we use Kitaev's original version of phase estimation using only a single qubit [25], we can reduce the number of extra qubits to $O(1)$. With Theorem 6 we can then use Algorithm 1 to prepare the ground state without knowing an upper bound for the ground energy beforehand, when in addition to (P1) we have a lower bound for the spectral gap:
(P2') Bound for the spectral gap: $\lambda_1 - \lambda_0 \ge \Delta$.
We first run Algorithm 1 to locate the ground energy in an interval $[x_L, x_U]$ of length at most $\Delta$. Then we simply apply $\mathrm{PROJ}((x_L + x_U)/2, \Delta/4\alpha, \gamma\epsilon)$ to $|\phi_0\rangle$. This will give us an approximate ground state with fidelity at least $1 - \epsilon$. Therefore we have the following corollary:

Corollary 7 (Ground state preparation without a priori bound). Suppose we have a Hamiltonian $H = \sum_k \lambda_k |\psi_k\rangle\langle\psi_k| \in \mathbb{C}^{N\times N}$, where $\lambda_k \le \lambda_{k+1}$, given through its $(\alpha, m, 0)$-block-encoding $U_H$. Also suppose we have an initial state $|\phi_0\rangle$ prepared by circuit $U_I$, as well as the promises (P1) and (P2'). Then the ground state can be prepared to fidelity $1 - \epsilon$ with probability $1 - \vartheta$ with the following costs:

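Other one- and two-qubit gates: […]
It may sometimes be desirable to ignore whether the procedure is successful or not. In this case we see the output as a mixed state whose density matrix is
$$\rho = \mathrm{Prob}(E^c)\,|\widetilde{\psi}_0\rangle\langle\widetilde{\psi}_0| + \rho',$$
where $|\widetilde{\psi}_0\rangle$ is the approximate ground state with fidelity at least $1 - \epsilon$, which is produced conditional on the event $E^c$, and $\mathrm{Tr}\,\rho' = \mathrm{Prob}(E)$. Then this mixed state will have a fidelity lower bounded by
$$\sqrt{\langle\psi_0|\rho|\psi_0\rangle} \ge \sqrt{\mathrm{Prob}(E^c)}\,|\langle\widetilde{\psi}_0|\psi_0\rangle| \ge \sqrt{1 - \vartheta}\,(1 - \epsilon).$$
If we want to achieve $\sqrt{1 - \xi}$ fidelity for the mixed state, we can simply let $\vartheta = \epsilon = \xi/3$, since $(1 - \xi/3)^3 \ge 1 - \xi$. Thus the numbers of queries to $U_H$ and $U_I$ are $\widetilde{O}(\frac{\alpha}{\gamma\Delta}\log(\frac{1}{\xi}))$ and $\widetilde{O}(\frac{1}{\gamma}\log(\frac{\alpha}{\Delta})\log(\frac{1}{\xi}))$ respectively.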
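The choice $\vartheta = \epsilon = \xi/3$ can be sanity-checked numerically. The sketch below assumes that the mixed-state fidelity is bounded from below by $\sqrt{1 - \vartheta}\,(1 - \epsilon)$, i.e. that the failure branch contributes nothing; this bound and the function names are assumptions made for the illustration.

```python
import math

def mixed_state_fidelity_bound(fail_prob, eps):
    """Assumed lower bound sqrt(1 - fail_prob) * (1 - eps) on the mixed-state fidelity."""
    return math.sqrt(1.0 - fail_prob) * (1.0 - eps)

def meets_target(xi):
    """Check that fail_prob = eps = xi / 3 achieves fidelity at least sqrt(1 - xi)."""
    bound = mixed_state_fidelity_bound(xi / 3.0, xi / 3.0)
    return bound >= math.sqrt(1.0 - xi) - 1e-15
```

The check reduces to Bernoulli's inequality $(1 - \xi/3)^3 \ge 1 - \xi$ for $0 \le \xi \le 1$.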

Optimality of the query complexities
In this section we prove that for the ground state preparation algorithms outlined in Section 3 and Section 4 the numbers of queries to $U_H$ and $U_I$ are essentially optimal. We will also show that our ground energy estimation algorithm has a nearly optimal dependence on the precision. We first prove the following complexity lower bounds:

Theorem 8. Suppose we have a generic Hamiltonian […]

Proof. We prove all three lower bounds by applying the ground state preparation algorithm to the unstructured search problem. In the unstructured search problem we try to find an $n$-bit string $t$ marked out by the oracle
$$U_t = I - 2|t\rangle\langle t|.$$
It is proved that for this problem the number of queries to $U_t$ needed to find $t$ with probability $1/2$ is lower bounded by $\Omega(\sqrt{N})$, where $N = 2^n$ [6,10,21]. This problem can be seen as a ground state preparation problem. It is easy to see that $|t\rangle$ is the ground state of $U_t$, which is at the same time a unitary and therefore a $(1, 0, 0)$-block-encoding of itself. Therefore $U_t$ serves as the $U_H$ in the theorem. The spectral gap is 2. Also, let
$$|u\rangle = \frac{1}{\sqrt{N}} \sum_{s \in \{0,1\}^n} |s\rangle$$
be the uniform superposition of all $n$-bit strings; then we have $\langle u|t\rangle = \frac{1}{\sqrt{N}}$, and $|u\rangle$ can be efficiently prepared by the Hadamard transform since $\mathrm{H}^{\otimes n}|0^n\rangle = |u\rangle$. Therefore $\mathrm{H}^{\otimes n}$ serves as the $U_I$ described in the theorem.
If the ground state preparation problem can be solved with $o(1/\gamma)$ queries to $U_H$ for fixed $\Delta$ to produce an approximate ground state with fidelity at least $\sqrt{3}/2$, then from the above setup we have $\gamma = 1/\sqrt{N}$, and we can first find the approximate ground state and then measure in the computational basis, obtaining $t$ with probability at least $3/4$. Therefore the unstructured search problem can be solved with $o(\sqrt{N})$ queries to the oracle $U_t$, which is impossible. Thus we have proved the first lower bound in our theorem.
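The parameters of this reduction are easy to tabulate for a small instance. The sketch below assumes the phase oracle $U_t = I - 2|t\rangle\langle t|$, which is consistent with $|t\rangle$ being the ground state and the gap being 2; the function names are hypothetical.

```python
import math

def search_instance(n_bits, t):
    """Return (index of the ground state, spectral gap, overlap <u|t>) for U_t."""
    N = 2 ** n_bits
    # U_t = I - 2|t><t| is diagonal in the computational basis.
    diagonal = [-1.0 if s == t else 1.0 for s in range(N)]
    ordered = sorted(diagonal)
    ground_index = diagonal.index(ordered[0])
    gap = ordered[1] - ordered[0]
    overlap = 1.0 / math.sqrt(N)  # <u|t> for the uniform superposition |u>
    return ground_index, gap, overlap

def measurement_success_lower_bound(fidelity):
    """Probability of reading out t from a state with the given fidelity to |t>."""
    return fidelity ** 2
```

With $\gamma = 1/\sqrt{N}$, any algorithm using $o(1/\gamma)$ queries would translate into $o(\sqrt{N})$ oracle calls, which is the contradiction used in the proof.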
To prove the second lower bound we want to create a situation in which the overlap is bounded from below by a constant but the gap vanishes. We need to introduce the Grover diffusion operator
$$D = I_n - 2|u\rangle\langle u|, \qquad (3)$$
which can be efficiently implemented. Then we define
$$H(\tau) = (1 - \tau)D + \tau U_t$$
and consider $H(1/2)$. It is easy to see that the ground state of $H(1/2)$ is
$$|\Psi\rangle = \frac{|u\rangle + |t\rangle}{\sqrt{2 + 2/\sqrt{N}}},$$
and therefore $\langle\Psi|u\rangle = \langle\Psi|t\rangle = 1/\sqrt{2} + O(1/\sqrt{N})$ for large $N$. Furthermore, the gap is $\Delta(1/2) = 2/\sqrt{N}$. Therefore $|t\rangle$ can be prepared in the following way: we first prepare the ground state of $H(1/2)$, whose block-encoding is easy to construct using one application of $U_t$. The resulting approximate ground state we denote as $|\widetilde{\Psi}\rangle$. Then we measure $|\widetilde{\Psi}\rangle$ in the computational basis. If there is some non-vanishing probability of obtaining $t$ then we can boost the success probability to above $1/2$ by repeating the procedure and verifying using $U_t$.
If the second lower bound in the theorem does not hold, then $|\widetilde{\Psi}\rangle$ can be prepared with $o(1/\Delta(1/2)) = o(\sqrt{N})$ queries to the block-encoding of $H(1/2)$ and therefore the same number of queries to $U_t$. Because the angle corresponding to fidelity is the great-circle distance on the unit sphere, we have the triangle inequality (using that $|\langle\widetilde{\Psi}|\Psi\rangle| \ge \sqrt{3}/2$)
$$\arccos|\langle\widetilde{\Psi}|t\rangle| \le \arccos|\langle\Psi|t\rangle| + \arccos|\langle\widetilde{\Psi}|\Psi\rangle| \le \frac{5\pi}{12}.$$
Therefore for large $N$ we have $|\langle\widetilde{\Psi}|t\rangle| \ge \cos(5\pi/12) + O(1/\sqrt{N}) > 1/4$. The probability of getting $t$ when performing the measurement is at least $1/16$. Therefore we can boost the success probability to above $1/2$ by $O(1)$ repetitions and verifications. The total number of queries to $U_t$ is therefore $o(\sqrt{N})$. Again, this is impossible. Therefore we have proved the second lower bound in our theorem.
For the last lower bound we need to create some trade-off between the gap and the overlap. We consider preparing the ground state of the Hamiltonian $H(1/2 - N^{-1/2+\delta})$, $0 < \delta < 1/6$, whose block-encoding can be efficiently constructed with a single application of $U_t$, as an intermediate step. It is shown in Appendix A that the ground state is […] Therefore […] Also we show in Appendix A that the gap is […] We first apply the algorithm described in Section 3 to prepare the ground state of $H(1/2 - N^{-1/2+\delta})$ to fidelity $1 - N^{-2\delta}/128$. Using the overlap $\gamma_u$ and the gap in (6), the approximate ground state, denoted by $|\widetilde{\Phi}\rangle$, can be prepared with $O(N^{1/2-\delta}\log(N))$ queries to the block-encoding of $H(1/2 - N^{-1/2+\delta})$, and therefore the same number of queries to $U_t$.
The overlap between $|\widetilde{\Phi}\rangle$ and $|t\rangle$ can again be bounded using the triangle inequality […] Therefore we have […] If the last lower bound in our theorem does not hold, we can then prepare the ground state of $U_t$ by using the initial state $|\widetilde{\Phi}\rangle$ only $O(1/\widetilde{\gamma}_t^{1-\theta})$ times for some $\theta > 0$, and the number of queries to $U_t$ at this step, i.e. not including the queries used for preparing $|\widetilde{\Phi}\rangle$, is $O(1/\widetilde{\gamma}_t^{p})$ for some $p > 0$. Therefore the total number of queries to $U_t$ is […] This complexity must be $\Omega(N^{1/2})$ according to the lower bound for the unstructured search problem. Therefore we need $\delta p \ge 1/2$. However we can choose $\delta$ to be arbitrarily small, and no finite $p$ can satisfy this condition. Hence we have a contradiction. This proves the last lower bound in our theorem.
When we look at the query complexities of the ground state preparation algorithms in Secs. 3 and 4, we can use the $\widetilde{O}$ notation to hide the logarithmic factors, and both algorithms use $\widetilde{O}(\frac{\alpha}{\gamma\Delta})$ queries to $U_H$ and $\widetilde{O}(\frac{1}{\gamma})$ queries to $U_I$ when we want to achieve some fixed fidelity. Given the lower bound in Theorem 8 we can see that the algorithm with a priori bound for the ground energy essentially achieves the optimal dependence on $\gamma$ and $\Delta$. The algorithm without a priori bound for the ground energy achieves the same complexity modulo logarithmic factors, while using less information. This fact guarantees that its dependence is also nearly optimal.
We will then prove the nearly optimal dependence of our ground energy estimation algorithm on the precision $h$. We have the following theorem: Suppose we have a Hamiltonian $H = \sum_k \lambda_k |\psi_k\rangle\langle\psi_k| \in \mathbb{C}^{N\times N}$, where $\lambda_k \le \lambda_{k+1}$, given through its $(\alpha, m, 0)$-block-encoding $U_H$, and $\alpha = \Theta(1)$. Also suppose we have an initial state $|\phi_0\rangle$ prepared by circuit $U_I$, as well as the promise that $|\langle\phi_0|\psi_0\rangle| = \Omega(1)$. Then estimating the ground energy to precision $h$ requires $\Omega(1/h)$ queries to $U_H$.

This time we convert the quantum approximate counting problem, which is closely related to the unstructured search problem, into an eigenvalue problem. The quantum approximate counting problem is defined in the following way. We are given a set of $n$-bit strings $S \subset \{0, 1\}^n$ specified by the oracle $U_f$ satisfying […] for any $x \in \{0, 1\}^n$. We want to estimate the size $|S|/N$ up to relative error $\epsilon$. It has been proven that this requires $\Omega\left(\frac{1}{\epsilon}\sqrt{\frac{N}{|S|}}\right)$ queries to $U_f$ for $|S| = o(N)$ [34, Theorem 1.13], where $N = 2^n$, for the success probability to be greater than $3/4$, and this lower bound can be achieved using amplitude estimation [11].
We convert this problem into an eigenvalue problem of a block-encoded Hamiltonian. Let $|u\rangle$ be the uniform superposition of the computational basis and $D$ be the Grover diffusion operator defined in (3). Then define the following $(n+1)$-qubit unitary (H is the Hadamard gate) […] which can be implemented using two applications of controlled-$U_f$. We define […] Note that here $H$ is given in its $(1, 1, 0)$-block-encoding $U_H$. Let
$$|u\rangle = a|u_0\rangle + \sqrt{1 - a^2}\,|u_1\rangle,$$
where the unit vectors $|u_0\rangle$ and $|u_1\rangle$ satisfy […]; then it is easy to see that $a = \sqrt{|S|/N}$. Since $|S|/N = a^2$, we only need to estimate the value of $a$ to precision $O(\epsilon'\sqrt{N/|S|})$ in order to estimate $|S|/N$ to absolute precision $\epsilon'$.
We analyze the eigenvalues and eigenvectors of $H$. It can be easily checked that $\{|u_0\rangle, |u_1\rangle\}$ span an invariant subspace of $H$, and relative to this orthonormal basis $H$ is represented by the matrix […] In the orthogonal complement of this subspace, $H$ is simply the zero matrix. Therefore $H$ has only two non-zero eigenvalues $\pm 2a\sqrt{1 - a^2}$ corresponding to eigenvectors […] The ground state of $H$ is therefore $|\psi_+\rangle$ with ground energy $-2a\sqrt{1 - a^2}$. We can use $|u\rangle$ as the initial state, with an overlap $\langle\psi_+|u\rangle = $ […]
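The stated gap and overlaps of $H(1/2)$ can be verified directly for a small instance. The sketch below assumes the interpolation $H(\tau) = (1 - \tau)D + \tau U_t$ with $U_t = I - 2|t\rangle\langle t|$, so that $H(1/2) = I - |u\rangle\langle u| - |t\rangle\langle t|$; this form is an assumption chosen because it reproduces the gap $2/\sqrt{N}$ quoted above, and the function names are hypothetical.

```python
import math

def apply_h_half(v, t):
    """Apply H(1/2) = I - |u><u| - |t><t| to a real vector v."""
    mean = sum(v) / len(v)        # every entry of |u><u|v> equals sum(v) / N
    out = [x - mean for x in v]
    out[t] -= v[t]                # subtract |t><t|v>
    return out

def check_low_spectrum(n_bits, t):
    """Return eigen-equation residuals, the gap 2/sqrt(N), and the overlap <Psi|t>."""
    N = 2 ** n_bits
    c = 1.0 / math.sqrt(N)
    plus = [c + (1.0 if s == t else 0.0) for s in range(N)]   # |u> + |t>
    minus = [c - (1.0 if s == t else 0.0) for s in range(N)]  # |u> - |t>
    h_plus, h_minus = apply_h_half(plus, t), apply_h_half(minus, t)
    # H(|u> + |t>) = -c (|u> + |t>) and H(|u> - |t>) = +c (|u> - |t>).
    res_plus = max(abs(a + c * b) for a, b in zip(h_plus, plus))
    res_minus = max(abs(a - c * b) for a, b in zip(h_minus, minus))
    norm_plus = math.sqrt(sum(x * x for x in plus))
    overlap_t = plus[t] / norm_plus
    gap = 2.0 * c                 # all remaining eigenvalues equal 1
    return res_plus, res_minus, gap, overlap_t
```

On the orthogonal complement of $\mathrm{span}\{|u\rangle, |t\rangle\}$ the operator acts as the identity, so the two eigenvalues $\mp 1/\sqrt{N}$ checked here are the lowest two.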

Low-energy state preparation
It is known that estimating the spectral gap $\Delta$ is a difficult task [3,16,5]. Our algorithm for finding the ground energy, as discussed in Theorem 6, does not depend on knowing the spectral gap. However, both of our algorithms for preparing the ground state, in Theorem 4 and Corollary 7, require a lower bound of the spectral gap. We would like to point out that if we only want to produce a low-energy state $|\psi\rangle$, making $\langle\psi|H|\psi\rangle \le \mu$ for some $\mu > \lambda_0$, as in [38], then this can be done without any knowledge of the spectral gap. In fact this is possible even when the ground state is degenerate.
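To do this, we first need to assume we have a normalized initial state $|\phi_0\rangle$ with non-trivial overlap with the low-energy eigen-subspaces. Quantitatively this means that for some $\gamma, \delta > 0$, if we expand the initial state in the eigenbasis of $H$, obtaining $|\phi_0\rangle = \sum_k \alpha_k |\psi_k\rangle$, then $\sum_{k:\lambda_k \le \mu - 3\delta} |\alpha_k|^2 \ge \gamma^2$. Then we can use the block-encoded projection operator in (2) to get an unnormalized state $|\tilde\psi\rangle$ for some precision $\epsilon$. Now we expand $|\tilde\psi\rangle$ in the eigenbasis to get $|\tilde\psi\rangle = \sum_k \beta_k |\psi_k\rangle$, and denote $|\tilde\varphi\rangle = \sum_{k:\lambda_k < \mu - \delta} \beta_k |\psi_k\rangle$. Because of the approximation to the sign function, the norm of $|\tilde\varphi\rangle$ is bounded from below in terms of $\gamma$, and $|\tilde\psi\rangle$ is close to $|\tilde\varphi\rangle$ up to the precision $\epsilon$. From these bounds we further get a bound on the energy of $|\tilde\psi\rangle$. Now denoting $|\psi\rangle = |\tilde\psi\rangle / \||\tilde\psi\rangle\|$, we can make $\langle\psi|H|\psi\rangle \le \mu$ by choosing $\epsilon = O(\gamma^2\delta/\alpha)$. Therefore the total number of queries to $U_H$ required is $O\big(\frac{1}{\delta\gamma}\log\big(\frac{\alpha}{\delta\gamma}\big)\big)$ and the number of queries to $U_I$ is $O\big(\frac{1}{\gamma}\big)$.
From this we can see that if the initial state $|\phi_0\rangle$ has an overlap with the ground state that is at least $\gamma$, and we want to prepare a state with energy upper bounded by $\lambda_0 + \delta$, the required numbers of queries to $U_H$ and $U_I$ are $O\big(\frac{1}{\delta\gamma}\log\big(\frac{\alpha}{\delta\gamma}\big)\big)$ and $O\big(\frac{1}{\gamma}\big)$ respectively. If we do not know the ground energy beforehand, we can use the algorithm in Theorem 6 to estimate it first. Note that none of these procedures assumes a spectral gap.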
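The filtering argument can be emulated classically on a small matrix. The sketch below is an illustration under stated assumptions, not the quantum algorithm: it diagonalizes $H$ explicitly and applies a smooth step function built from `math.erfc` in place of the polynomial approximation of the sign function used by the block-encoded projector in (2). The filter is centered at $\mu - 2\delta$ and is within $\epsilon$ of 1 below $\mu - 3\delta$ and within $\epsilon$ of 0 above $\mu - \delta$. The test matrix, the parameter values, and the name `smooth_step_filter` are all hypothetical.

```python
import math
import numpy as np

def smooth_step_filter(evals, center, delta, eps):
    """Smooth step: >= 1 - eps for x <= center - delta, <= eps for x >= center + delta."""
    k = math.sqrt(math.log(1.0 / eps)) / delta   # uses erfc(z) <= exp(-z^2) for z >= 0
    return np.array([0.5 * math.erfc(k * (x - center)) for x in evals])

rng = np.random.default_rng(0)
dim = 16
evals = np.linspace(-1.0, 1.0, dim)                    # spectrum with alpha = ||H|| = 1
V, _ = np.linalg.qr(rng.standard_normal((dim, dim)))   # random orthonormal eigenbasis
H = V @ np.diag(evals) @ V.T

mu, delta, eps = -0.4, 0.1, 1e-2
phi0 = V @ np.full(dim, 1.0 / np.sqrt(dim))            # equal weight on every eigenvector
gamma = np.sqrt(np.sum(evals <= mu - 3 * delta) / dim) # weight below mu - 3*delta

f = smooth_step_filter(evals, mu - 2 * delta, delta, eps)
psi_tilde = V @ (f * (V.T @ phi0))                     # unnormalized filtered state
psi = psi_tilde / np.linalg.norm(psi_tilde)
energy = psi @ H @ psi
print("energy of filtered state:", energy, "target bound mu:", mu)
```

No spectral gap enters anywhere in this sketch: only $\mu$, $\delta$, and the low-energy weight $\gamma^2$ of the initial state matter, which mirrors the statement in the text.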

Discussions
In this work we proposed an algorithm to prepare the ground state of a given Hamiltonian when an upper bound of the ground energy is known (Theorem 4), an algorithm to estimate the ground energy based on binary search (Theorem 6), and, by combining these two, an algorithm to prepare the ground state without knowing an upper bound a priori (Corollary 7). By solving the unstructured search problem and the approximate counting problem through preparing the ground state, we proved that the query complexities for the tasks above cannot be substantially improved, as otherwise the complexity lower bounds for the two problems would be violated.
All our algorithms are based on the availability of the block-encoding of the target Hamiltonian. Constructing it is a non-trivial task, but we know it can be done in many important settings. For example, Childs et al. proposed a linear combination of unitaries (LCU) approach to block-encode the Hamiltonian of a quantum spin system [14], in which the Hamiltonian is decomposed into a sum of Pauli matrices. In [32], Low and Wiebe outlined methods to construct block-encodings of the Hubbard Hamiltonian with long-range interactions, and of the quantum chemistry Hamiltonian in the plane-wave basis, both using the fast fermionic Fourier transform (FFFT) [4]. The FFFT can be replaced by a series of Givens rotations, which gives lower circuit depth and better utilizes limited connectivity [27,23].
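To make the LCU idea concrete, here is a minimal numerical sketch of the textbook PREPARE/SELECT construction for a Hamiltonian written as a non-negative combination of Pauli strings. It is an illustration only and not the specific circuits of [14] or [32]. The two-qubit Hamiltonian, the helper name `lcu_block_encoding`, and the use of a QR factorization to complete the PREPARE unitary are assumptions made for this example; the code also assumes real-valued unitaries so that a transpose serves as the adjoint.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def lcu_block_encoding(coeffs, unitaries):
    """Return (U, alpha) with the ancilla-|0> block of U equal to
    sum_j coeffs[j] * unitaries[j] / alpha, for coeffs >= 0 and real unitaries."""
    L = len(coeffs)
    dim = unitaries[0].shape[0]
    alpha = float(np.sum(coeffs))
    # PREPARE: any orthogonal matrix whose first column is +/- sqrt(coeffs / alpha).
    M = np.eye(L)
    M[:, 0] = np.sqrt(np.asarray(coeffs) / alpha)
    prep, _ = np.linalg.qr(M)
    # SELECT = sum_j |j><j| (x) U_j, a block-diagonal matrix.
    select = np.zeros((L * dim, L * dim))
    for j, U_j in enumerate(unitaries):
        select[j * dim:(j + 1) * dim, j * dim:(j + 1) * dim] = U_j
    big_prep = np.kron(prep, np.eye(dim))
    return big_prep.T @ select @ big_prep, alpha

# Illustrative Hamiltonian: H = 0.5 Z(x)Z + 0.3 X(x)I + 0.2 I(x)X, padded to 4 terms.
coeffs = [0.5, 0.3, 0.2, 0.0]
terms = [np.kron(Z, Z), np.kron(X, I2), np.kron(I2, X), np.kron(I2, I2)]
H = sum(c * P for c, P in zip(coeffs, terms))
U, alpha = lcu_block_encoding(coeffs, terms)
block = U[:4, :4]                      # block selected by the ancilla state |0>
print("max deviation from H/alpha:", np.max(np.abs(block - H / alpha)))
```

A term with a negative coefficient can be handled by absorbing the sign into the corresponding unitary. The normalization $\alpha$ here is the sum of the coefficients, which is the quantity that enters the query complexities in the text.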
We remark that the quantum circuit used in our method for ground energy estimation can be further simplified. The main obstacle to applying this method to near-term devices is the need for amplitude estimation, which requires phase estimation. It is possible to replace amplitude estimation by estimating the success probability classically. In the context of binary amplitude estimation in Lemma 5, we need to determine whether the success amplitude is greater than $3\gamma/4$ or smaller than $\gamma/4$. This can be turned into a classical hypothesis testing problem: determine whether the success probability is greater than $9\gamma^2/16$ or smaller than $\gamma^2/16$. A simple Chernoff bound argument tells us that we only need $O(\log(1/\vartheta)/\gamma^2)$ samples to distinguish the two cases with success probability at least $1 - \vartheta$, as opposed to the $O(\log(1/\vartheta)/\gamma)$ complexity in amplitude estimation.
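The classical test described above can be sketched as follows. This is a hedged illustration: the decision threshold $5\gamma^2/16$ (the midpoint of the two cases) and the constant 18 in the sample count are assumptions, obtained here from standard multiplicative Chernoff bounds, and are not values taken from the paper. The function names are hypothetical, and the measurement outcomes are simulated with a binomial sampler.

```python
import math
import numpy as np

def samples_needed(gamma, theta):
    """Sample count n = ceil(18 ln(1/theta) / gamma^2).

    With threshold 5*gamma^2/16, multiplicative Chernoff bounds give error
    probability at most exp(-n*gamma^2/6) when p <= gamma^2/16 and at most
    exp(-n*gamma^2/18) when p >= 9*gamma^2/16."""
    return math.ceil(18.0 * math.log(1.0 / theta) / gamma ** 2)

def decide_large_amplitude(num_success, num_samples, gamma):
    """True if the data favour amplitude >= 3*gamma/4, False if <= gamma/4."""
    return num_success / num_samples >= 5.0 * gamma ** 2 / 16.0

gamma, theta = 0.3, 1e-6
n = samples_needed(gamma, theta)
rng = np.random.default_rng(1)
hits_large = rng.binomial(n, 9.0 * gamma ** 2 / 16.0)   # simulated case: amplitude 3*gamma/4
hits_small = rng.binomial(n, gamma ** 2 / 16.0)         # simulated case: amplitude gamma/4
print(n, decide_large_amplitude(hits_large, n, gamma),
      decide_large_amplitude(hits_small, n, gamma))
```

The sample count scales as $\log(1/\vartheta)/\gamma^2$, matching the scaling quoted in the text; the quadratically worse dependence on $\gamma$ relative to amplitude estimation is the price paid for removing phase estimation from the circuit.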
In this approach, the only quantum circuit we need to use is the one in (2). The circuit depth is therefore only O((α/h) log(1/γ)). It also does not require the O(log(1/γ)) qubits that are introduced as a result of using amplitude estimation. These features make it suitable for near-to-intermediate term devices.
In [28] we proposed an eigenstate filtering method (similar in spirit to the method proposed in Section 3), and we combined it with the quantum Zeno effect [12,9] to solve the quantum linear system problem. The resulting algorithm utilizes the fact that the desired eigenstate along the eigenpath always corresponds to the eigenvalue 0. In the setting of quantum Zeno effect based state preparation, in which we have a series of Hamiltonians and wish to incrementally prepare the ground state of each of them, our algorithm in Theorem 4 can be used to go from the ground state of one Hamiltonian to the next, provided that we have a known upper bound for the ground energy. In the absence of such an upper bound, there is the possibility of using the algorithm in Corollary 7 to solve this problem. However, in this setting we only want to use the initial state once for every Hamiltonian, since preparing the initial state involves going through the ground states of all previous Hamiltonians. This presents a challenge and is a topic for our future work.
It is worth pointing out that none of the Hamiltonians used in the proofs of lower bounds in Section 5 is a local Hamiltonian, and therefore our lower bounds do not rule out the possibility that if special properties such as locality are properly taken into consideration, better complexities can be achieved.
Thus we obtain the spectral gap in (6). To simplify notation we let $\lambda = N^{1/2-\delta}\lambda_+$. We then compute the ground state. We first find an eigenvector corresponding to $\lambda_-$: $|\chi\rangle = N^{\delta}\big((N^{-\delta} + 2N^{-1/2})|u\rangle + (-2 + \lambda)|t\rangle\big)$. We still need to normalize $|\chi\rangle$. The normalization factor is $\||\chi\rangle\| = N^{\delta}\sqrt{(N^{-\delta} + 2N^{-1/2})^2 + (-2+\lambda)^2 + 2N^{-1/2}(N^{-\delta} + 2N^{-1/2})(-2+\lambda)}$. Note that the third term under the square root comes from the overlap between $|u\rangle$ and $|t\rangle$, and it does not play an important role asymptotically. Therefore, normalizing, we have the expression for the normalized eigenstate (5).