Quantifying Grover speed-ups beyond asymptotic analysis

Run-times of quantum algorithms are often studied via an asymptotic, worst-case analysis. Whilst useful, such a comparison can often fall short: it is not uncommon for algorithms with a large worst-case run-time to end up performing well on instances of practical interest. To remedy this it is necessary to resort to run-time analyses of a more empirical nature, which for sufficiently small input sizes can be performed on a quantum device or a simulation thereof. For larger input sizes, alternative approaches are required. In this paper we consider an approach that combines classical emulation with detailed complexity bounds that include all constants. We simulate quantum algorithms by running classical versions of the sub-routines, whilst simultaneously collecting information about what the run-time of the quantum routine would have been if it were run instead. To do this accurately and efficiently for very large input sizes, we describe an estimation procedure and prove that it obtains upper bounds on the true expected complexity of the quantum algorithms. We apply our method to some simple quantum speedups of classical heuristic algorithms for solving the well-studied MAX-$k$-SAT optimization problem. This requires rigorous bounds (including all constants) on the expected- and worst-case complexities of two important quantum sub-routines: Grover search with an unknown number of marked items, and quantum maximum-finding. These improve upon existing results and might be of broader interest. Amongst other results, we found that the classical heuristic algorithms we studied did not offer significant quantum speedups despite the existence of a theoretical per-step speedup. This suggests that an empirical analysis such as the one we implement in this paper already yields insights beyond those that can be seen by an asymptotic analysis alone.


Introduction
There is growing motivation to design and evaluate quantum algorithms for commercial applications in order to assess the potential impact of quantum computers. Determining whether, or when, a quantum algorithm should be used for a task involves comparing the candidate quantum algorithm, or set of algorithms, to an existing state-of-the-art classical one. A common approach to benchmark and compare algorithms is to consider their performance on worst-case instances, by providing upper bounds on their run-times: bounds that hold for every possible instance of the problem the algorithm is designed to solve. In this context, a quantum speedup of a classical algorithm refers to the use of quantum algorithmic techniques that give an improvement over the worst-case run-time of the classical algorithm in question. In some cases, expected run-times are considered, where the expectation is taken over the internal (classical or quantum) randomness of the algorithm, and then upper bounds on this expectation are compared. In even rarer cases, it is possible to rigorously analyse the average-case complexity of the algorithms, where now the average is taken over the set of inputs [4].
However, such worst-case (and to a lesser extent, average-case) upper bounds can often be misleading: it is not uncommon for algorithms with a large worst-case run-time to perform very well in practice [22,33]. For instance, this is especially true of heuristic algorithms, which are commonly used to solve real-world problems and are often fine-tuned to perform well on instances of interest, rather than on an artificial instance designed to be as difficult as possible for the algorithm but that will likely not appear in a natural setting. As such, much of quantum algorithms research focuses either on exponential quantum speedups (e.g. Shor's algorithm), in which case the speedup obtained by the algorithm is unambiguous; or, when only a modest quantum speedup is available, on situations where the run-time of both the quantum and classical algorithms can be determined in a reasonably tight way: the square-root speedup obtained by Grover's algorithm for unstructured search is a simple example of this. For more complicated (quantum and classical) algorithms, however, it might be that the run-time suggested by an asymptotic analysis fails to capture the true complexity of the algorithm on inputs that will be encountered in practice, which can make it difficult to determine the usefulness of a candidate quantum algorithm over a classical one. A similar observation was made in [21], where the authors point out that this disconnect is one of the main reasons that it is so difficult to design quantum algorithms for machine learning and assess their performance relative to their classical counterparts.
In this paper we continue along a line of work that moves beyond performing purely asymptotic analyses of quantum algorithms towards ones of a more empirical nature. In the time before large fault-tolerant quantum computers become readily available, we suggest that for the majority of quantum algorithms, an intermediate form of classical simulation + run-time estimation is possible, and that it can allow for meaningful and informative comparisons to be made. Our particular approach combines tight asymptotic analysis with classical simulation, in an attempt to carefully estimate the run-time of quantum algorithms in lieu of actually being able to run them on a quantum device. Importantly, our methodology is sensitive to the input given to the quantum algorithm.
To verify the utility of our approach, we perform such an empirical analysis for a set of reasonably simple quantum versions of a classical heuristic algorithm for a particular use-case: that of MAX-k-SAT. The quantum speedups we obtain are quite typical of quantum speedups of classical optimisation algorithms: the classical routine repeats a number of steps, the kth taking some time t_k (which will depend on what happened in previous steps) until convergence; the quantum algorithm does the same, except with each step now taking time ≈ √t_k. To assess the usefulness of such a quantum algorithm, we must ask: to what extent does this square-root-like speedup manifest in the algorithm when it is actually run to convergence? Moreover, it is likely that the behaviour of the algorithm will differ substantially on different inputs, and hence a further question we should ask is: how much of a speedup does the quantum algorithm obtain on a representative or real-world input? We seek to answer such questions with our empirical approach.
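To build intuition for the gap between a per-step speedup and an end-to-end one, consider a toy cost profile. The numbers below are purely illustrative (not taken from our experiments): the classical algorithm pays the sum of the t_k, while an idealised quantum version, ignoring all constant factors, pays the sum of the √t_k.

```python
import math

# Purely illustrative per-step costs t_k for a hill-climber run:
# early steps search large neighbourhoods, later steps ever smaller ones.
t = [1000, 400, 150, 60, 20, 8, 3, 2, 1, 1]

classical_total = sum(t)                        # classical cost: sum of t_k
quantum_total = sum(math.sqrt(tk) for tk in t)  # idealised quantum cost: sum of sqrt(t_k)
speedup = classical_total / quantum_total       # end-to-end, constant-free speedup factor
```

Once the constant factors of a real QSearch implementation are charged at every step, much of this idealised factor can evaporate, and detecting exactly how much survives is what the empirical methodology below is designed for.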

Summary of results
Our main results and contributions are:
• Improved analyses of upper bounds on the expected and worst-case complexities of Grover search when the number of marked items is unknown, including log and constant factors, improving upon analyses performed in earlier works (e.g. [34,7]). We also consider how to optimize the number of classical samples drawn before Grover iterations are used. Sections 2.1, 2.2.
• Upper bounds on the expected complexity of a quantum maximum finding algorithm, improving upon those in previous works (e.g. [14,1]). Section 2.3.
• An estimation procedure that allows us to use the above bounds to obtain estimates of the expected run-times of repeated calls to a Grover search sub-routine when the number of marked items cannot be computed exactly (in our classical simulations), something that is useful (and indeed necessary) for benchmarking quantum algorithms on very large inputs. The outputs of the procedure come with theoretical guarantees. Section 3.
• A general approach combining the above that allows for rigorous (and efficient) classical estimation of the run-times of quantum algorithms that make repeated calls to Grover sub-routines. This is achieved via classical emulation of the underlying quantum algorithms.
• Two simple quantum heuristic algorithms for MAX-k-SAT, which are basic quantizations of classical 'hill climber' algorithms. Section 4.2.
• We find that the quantum hill climbers obtain favourable scaling compared to their classical counterparts, but that only one of them (the 'simple' hill climber) obtained an absolute speedup for the problem sizes we considered. We observe that some, but not all, of the per-step speedup indicated by an asymptotic analysis manifests in the final behaviours of the algorithms. Section 4.3.
• We verify that our estimation procedure does indeed yield accurate estimates of the expected run-times of our algorithms when compared to an exact method. Section 4.3.

Concurrent work
In a concurrent work [8], we apply our methodology to help design quantum algorithms for a common task in complex network analysis. The quantitative analysis employed in this other study is notably more comprehensive than the one employed for the elementary hill-climbing algorithm examined in this paper, and thus the two studies are complementary: the former serves as an introductory exposition to the methodology, while the latter showcases its effectiveness when utilized for developing quantum algorithms for practical problems. More specifically, the other work studies the Louvain algorithm, which forms one of the main tools for tackling a problem ubiquitous to the study of complex networks: that of community detection. Together with its descendants, the Louvain algorithm has successfully been used to study large sparse networks with millions of vertices [6,13,20,27]. In [8], we introduce several quantum versions of the Louvain algorithm, analyse their asymptotic (worst-case) complexities, and investigate numerically how they perform on randomly generated networks, as well as on real-world data sets.

A broader perspective
We remark that the kind of analysis we perform should in principle be possible for all quantum algorithms that achieve small polynomial speedups over classical ones: we can always 'simulate' the quantum algorithm by running its classical equivalent (which will be only polynomially slower!), and simultaneously estimate how long the quantum routine would have taken if it were run instead. All that is required are appropriately tight bounds (including constants, etc.) on the run-times of the quantum sub-routines used by the algorithm. With all of the above in mind, the semi-empirical approach to practical quantum algorithm design and analysis that we use fits into a larger framework with the following structure:
• Design a quantum algorithm or collection of algorithms, perhaps via speedup of an existing classical algorithm.
• Choose a measure of complexity for the algorithms, ideally one that is agnostic about the capabilities of future hardware. This could be, for instance, the number of time-steps, or the number of queries to the input or to some function.
• 'Simulate' the quantum algorithms on inputs of interest by replacing the quantum routines with their classical counterparts, and instead collect information to estimate what the quantum complexities would have been if those sub-routines were used instead. This will require one to obtain or prove (ideally tight) bounds on the worst- or expected-case complexities of the quantum sub-routines used by the algorithms.
• Use these empirical results to inform the choice or design of the quantum algorithms. For instance, one might observe that a particular quantum algorithm can be made faster in practice by simplifying it and sacrificing some asymptotic speedup.
As we will see in the sections that follow, there are further considerations that must be taken into account that can prove to be tricky even for simple algorithms such as the ones we consider, suggesting that such an analysis is unlikely to be entirely straightforward in general. Nevertheless, a very fruitful next step could be to build such 'pseudo-simulation' of quantum algorithms into one of the many quantum programming languages now available [5,15,23,26,32], which might allow for these sorts of empirical analyses to be performed more quickly and painlessly, and hence facilitate faster quantum algorithm development and prototyping.

Methodology
Here we describe our methodology for comparing the run-times of quantum and classical algorithms. The most obvious way to do this is to run both algorithms on their respective devices and measure the time they take to run to completion. Unfortunately, quantum hardware is currently not sufficiently developed to be able to run any of the algorithms we describe, and therefore such a comparison is not possible at this point, and will likely not be for the foreseeable future. An alternative approach is to simulate the quantum algorithm at the qubit level on classical hardware, count the number of quantum gates applied and then compare this to the required number of classical gates. This is often the approach currently taken with heuristic quantum algorithms such as VQE [29,12,9] and QAOA [35,30]. However, such simulations are almost always going to be restricted to a few qubits, which means that a comparison between the classical and quantum algorithms can only be made for very small input sizes. Since we are interested in investigating how well the classical and quantum versions of our algorithms compare on actual datasets, which will generally be very large, this method of comparison is insufficient.
Moreover, since quantum technologies are still in their infancy, it is not unlikely that they will improve significantly over the coming years. With this prospect in mind, a comparison that depends heavily on the properties of current-day (or even current-day predictions of) quantum hardware might become obsolete in the near future. For this reason, we aim to make our comparisons architecture independent: this will in particular mean not explicitly counting the number of gates needed to implement the algorithms, or taking into account the overheads from error correction. Hence, our comparisons will be of a more qualitative nature than a quantitative one: we are interested, in principle, in how much of the speedup suggested by an asymptotic analysis manifests in the final behaviour of the algorithm: if no speedup appears at this stage, then it certainly won't appear after taking into account the aforementioned overheads.
In lieu of estimating actual running times for our algorithms, we fix a suitable notion of complexity and use this to directly compare the classical and quantum algorithms. In particular we opt to count the number of calls made to a particular function (in fact, the very function we are trying to maximise). This essentially equates to measuring query complexity, where we count queries to a function rather than to, say, the input. Counting the number of function calls of course does not capture every costly component of the algorithms that we consider: there are parts that add to the run-time but do not require function calls. However, as is common in the study of query complexity, we choose a suitable measure of complexity so that these parts are those for which we do not obtain any quantum speedup, and hence cost the same quantumly as they do classically -things such as updating what is stored in memory after completing a step of the algorithm. From the perspective of quantum speedups, comparing the number of function calls made by the quantum and classical algorithms can therefore serve as a proxy for how much of a speedup we can expect to gain on the part of the classical algorithm that admits a speedup. Finally, we note that choosing this complexity measure preserves the architecture independence that we strive for in our analysis, by, for example, ignoring precisely how long a memory update takes, or how many items can be retrieved from memory in a single computational 'step'.
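As a concrete sketch of this complexity measure, one can wrap the objective function in a counter. The class below is our own illustration (the names are not from the paper): classical routines query g directly, while an emulated quantum sub-routine charges its estimated number of oracle calls, weighted by c_q queries to g per call to O_g.

```python
class QueryCounter:
    """Count queries to the objective function g.

    Classical routines call the wrapper directly; when a quantum
    sub-routine is emulated we instead charge its estimated oracle
    calls, weighted by c_q queries to g per call to O_g.
    """
    def __init__(self, g, c_q=2):
        self.g = g
        self.c_q = c_q
        self.classical_queries = 0
        self.estimated_quantum_queries = 0.0

    def __call__(self, x):
        self.classical_queries += 1
        return self.g(x)

    def charge_oracle_calls(self, n_calls):
        # One query to O_g costs c_q queries to g (compute + uncompute garbage).
        self.estimated_quantum_queries += self.c_q * n_calls

g = QueryCounter(lambda x: x % 7 == 3)              # toy marking function
first_marked = next(x for x in range(20) if g(x))   # classical linear scan: 4 queries
g.charge_oracle_calls(5)  # e.g. an emulated Grover run that made 5 oracle calls
```

Memory updates and other bookkeeping are deliberately left uncounted, mirroring the choice of complexity measure described above.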
For simplicity, we focus our attention on quantum algorithms that are composed of a number of steps, each of which consists of some classical computation as well as one or more calls to a Grover search on a list containing an unknown number of marked items, and/or to a quantum maximum-finding sub-routine. The quantum algorithms for MAX-SAT discussed in Section 4 are examples of such an algorithm. We consider situations in which the list itself and the number of marked items in it will differ for each step, and in particular will depend on the outcome of the calls that came before it, making the behaviour of the algorithm sensitive to the input itself, as well as the (possibly random) outcomes of the processing during each step. Precisely, we consider quantum algorithms with the structure of Algorithm 1.

Algorithm 1
1: Given input X, initialize a memory M
2: for k = 1, 2, . . . do
3: Do some classical processing on X and M, resulting in some list L_k containing t_k marked items.
4: Perform either one or more (perhaps nested) Grover searches with an unknown number of marked items on L_k, or run quantum maximum-finding on the list L_k, to obtain some item x_k.
5: Do some more classical processing given x_k, update M.
6: end for
In order to estimate the run-time of such an algorithm given some particular input we would, following our approach, execute all steps except step 4 of Algorithm 1 as they would normally be executed classically, but then replace step 4 with its classical alternative, and instead estimate how long it would have taken if the quantum routine were called. In this way, we can estimate the run-times for different inputs of any quantum algorithm that follows this basic structure.
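A minimal emulation loop in this spirit might look as follows. It is a sketch under a simplifying assumption: the quantum cost of step 4 is approximated by the leading-order term c_q·√(|L_k|/t_k), whereas in practice the rigorous bound E_QSearch of Section 2 would be charged instead; the function name and structure are ours.

```python
import math

def emulate_run(step_lists, is_marked, c_q=2):
    """Run the classical version of each step, while accumulating an
    estimate of what the quantum sub-routine would have cost."""
    classical_queries = 0.0
    est_quantum_queries = 0.0
    for L in step_lists:
        t = sum(1 for x in L if is_marked(x))  # here t_k is cheap to compute exactly
        if t == 0:
            continue
        # Expected classical queries to find a marked item when sampling
        # uniformly without replacement: (|L| + 1) / (t + 1).
        classical_queries += (len(L) + 1) / (t + 1)
        # Leading-order stand-in for the quantum cost of step 4.
        est_quantum_queries += c_q * math.sqrt(len(L) / t)
    return classical_queries, est_quantum_queries

c, q = emulate_run([list(range(8)), list(range(100))], lambda x: x % 4 == 0)
```

The per-step lists would in a real emulation be produced by the classical processing of steps 3 and 5, exactly as in Algorithm 1.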
As we will see in Section 2, even to estimate the complexities of algorithms that make use only of Grover search and quantum maximum finding already requires a somewhat substantial effort. There we prove rigorous upper bounds (including constants) on the expected and worst-case query complexities of Grover search with an unknown number of marked items, something that, to our knowledge, has not been done elsewhere. Using these bounds, we obtain bounds on the expected complexity of quantum maximum finding, improving upon previous results from the literature.
Our approach can of course be extended to more complicated algorithms that make use of different quantum sub-routines by proving analogous bounds for those sub-routines. Here, however, we keep our focus narrowly on quantum algorithms of the simple structure described above, so that we can apply and demonstrate the usefulness of our approach.
Finally, we note that the run-time estimates for the steps in line 4 will depend on the number t k of marked items in the list L k ; however, it might well be that we don't know how many marked items are there during any one step, and moreover this could be prohibitively time-consuming to compute. In such a situation we may be forced to estimate how many marked items there are, and this will introduce some error into our run-time estimates that we have to handle carefully. For instance, an unbiased estimate of the number of marked items in a list can give us a biased estimate for the run-time of Grover search -we discuss this and other considerations in more detail in Section 3.
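The bias mentioned above can be seen in a small Monte Carlo experiment (our own illustration, with arbitrary numbers): even when t̂ is an unbiased estimate of t, plugging it into a cost of the form √(|L|/t) over-estimates the true cost on average, because x ↦ 1/√x is convex.

```python
import math
import random

random.seed(7)
N, t, n = 10_000, 2_500, 20   # list size, true number of marked items, sample size
true_cost = math.sqrt(N / t)  # leading-order Grover cost: 2.0

total, trials = 0.0, 50_000
for _ in range(trials):
    hits = sum(random.random() < t / N for _ in range(n))
    t_hat = max(hits, 1) * N / n   # unbiased estimator of t (up to the clamp at 1)
    total += math.sqrt(N / t_hat)  # plug-in run-time estimate
avg_estimate = total / trials
# By Jensen's inequality, avg_estimate systematically exceeds true_cost,
# even though t_hat itself is (essentially) unbiased.
```

This is precisely why the estimation procedure of Section 3 needs guarantees that its outputs upper-bound the true expected complexity, rather than naively plugging point estimates of t into the bounds.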

Previous work
There have been an increasing number of papers that perform precise resource estimates for a number of quantum algorithms, mostly with a focus on algorithms for simulation of physical systems [28,18,31]. Others, such as [3], have investigated what impact overheads such as error-correction might have on potential quantum speedups, in this case concluding that, at least in the near-term, quadratic or small polynomial speedups are unlikely to manifest in practice. Finally, Campbell et al. [11] performed a rigorous analysis of the potential speedups achievable by quantum algorithms for solving constraint satisfaction problems. They considered upper bounds on the run-times of both a naive application of Grover search as well as an optimized implementation of a more sophisticated quantum algorithm for backtracking due to Montanaro [19], taking into account realistic properties of near-term as well as future hardware. They then compared these run-times to the performance of state-of-the-art classical algorithms in an effort to understand when quantum algorithms might provide a performance advantage, and what resources would be required for this.
Our current work is similar in that we also use rigorous upper bounds on the complexities of our quantum sub-routines, although we are often more interested in expected complexities. Moreover, we attempt to perform an analysis that is architecture independent, whereas Campbell et al. were interested in hardware properties. We also consider quantum algorithms whose run-times cannot be analysed ahead of time, and which must be implemented, or simulated, in order to discover the speedups (or lack thereof) that they might achieve in practice.
There have also been works that aim to prove rigorous upper bounds on the complexities of various Grover search routines. For example, Zalka [34] performed a careful analysis to upper-bound the number of Grover iterations performed in the worst-case on a list with an unknown number of marked items. We make use of this result, and improve upon it, by extending the analysis to consider the expected number of queries made by the algorithm, which requires substantially more effort to bound (tightly). Finally, we note that our analysis holds for all input sizes, whereas (as we understand it) Zalka's result applies only in the limit of large input size.
More recently, an arXiv preprint [24] appeared claiming that Grover's algorithm offers no quantum advantage. That paper explores whether one would expect a speedup from applying Grover search in a different way than we do. We consider the speedup obtained by Grover search versus its classical counterpart, i.e. brute-force search using the same query oracle that Grover has access to. They consider the question of whether Grover's algorithm itself can be classically simulated in practice whilst retaining the square-root speedup, which essentially boils down to studying the difficulty of classically simulating coherent calls to the oracle. It would be interesting (but far beyond the scope of this work) to add this approach to the toolbox when studying whether one can expect to obtain a speedup in practice via application of Grover-type quantum algorithms. The remainder of their paper considers the effects of noise on the performance of Grover's algorithm, which we explicitly avoided in our analysis.

Organization
We begin in Section 2 by explicitly describing an implementation of Grover search with an unknown number of marked items followed by an implementation of a quantum maximum finding routine. We then derive tight upper bounds, including all constants, for the expected and worst-case complexities of these quantum sub-routines. In Section 3, we consider how to apply these bounds for a particular input without knowing ahead of time the parameters needed to compute them, and propose an estimation procedure that deals with this uncertainty. Finally, in Section 4, we apply our methodology to the use-case of MAX-SAT, and present our numerical results.

Query complexity bounds
In this section and the next we introduce the tools that form the backbone of our methodology for estimating the run-times of quantum algorithms, in the sense described above. The two main tools that we will require are: a set of rigorous upper bounds on the expected-and worst-case query complexities of Grover search with an unknown number of marked items and quantum maximum-finding (Section 2), and some technical results that allow us to estimate these complexities even when the exact number of marked items is unknown to us (Section 3).
As mentioned, our main quantum sub-routine will be a Grover search with an unknown number of marked items, which we shall refer to as QSearch, that can find and return a marked item from a list L of length |L| containing t > 0 marked items using an expected O(√(|L|/t)) queries. We will also find the following variant of Grover search useful in proving our bounds.
Lemma 2 (Exact Grover search [16]). When the number t > 0 of marked items in a list L of length |L| is known exactly, there is a quantum algorithm that finds and returns a marked item with certainty using O(√(|L|/t)) queries.

Finally, we will make use of the quantum maximum-finding algorithm of [14].

Lemma 3 (Quantum maximum-finding [14]). Let L be a list of items of length |L|, with each item in the list taking a value in an ordered set, to which we have coherent access in the form of a unitary O_L that acts on basis states as O_L |x⟩ |0⟩ = |x⟩ |L[x]⟩. Then there exists a quantum algorithm QMax(L, ϵ) that will return arg max_x L[x] with probability at least 2/3 using at most O(√|L|) queries to O_L (i.e. to the list L) and O(√|L| log |L|) elementary operations. By repeating the algorithm log(1/ϵ) times, the probability of success can be amplified to 1 − ϵ.
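The structure of quantum maximum-finding, which repeatedly searches for an item beating the current threshold, also lends itself to classical emulation in our sense. The sketch below is our own illustration: it assumes the cost of each threshold round is approximated by the leading-order term c_q·√(|L|/t), with t the number of items above the threshold, in place of the rigorous bounds derived in this section.

```python
import math
import random

def emulate_qmax(values, c_q=2, rng=random.Random(0)):
    """Classically emulate threshold-based maximum finding, accumulating
    the estimated quantum query count of each threshold round."""
    n = len(values)
    threshold = rng.choice(values)
    est_queries = 0.0
    while True:
        better = [v for v in values if v > threshold]
        if not better:
            break  # the current threshold is the maximum
        est_queries += c_q * math.sqrt(n / len(better))  # stand-in for a QSearch bound
        threshold = rng.choice(better)  # measured outcome: a random larger item
    return threshold, est_queries

best, cost = emulate_qmax(list(range(100)))
```

Each round halves (in expectation) the number of items above the threshold, so only logarithmically many rounds are needed on average.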
In the sections that follow we carefully bound the expected and worst-case query complexities, including all constants, of QSearch on a list with an unknown number of marked items, and then of QMax. We consider two different implementations of QSearch, one that performs better in the expected case, the other better in the worst case.
The implementation of QSearch uses both queries to the function g (classical queries) and queries to the oracle O_g (quantum queries). Typically, the oracle O_g can be constructed from a reversible classical circuit implementing g, in which case a single query to O_g will generally require two queries to g (to compute and then uncompute garbage). When we refer to queries, we will always mean queries to g itself. A query to O_g will then correspond to potentially multiple queries to g, and will be weighted by a constant c_q denoting the number of queries made to g per query to O_g. Generally speaking, and for the case of MAX-SAT discussed in Section 4, c_q = 2 as mentioned above.
As for notation, we will use E to denote an expected number of queries to g, and W for the worst case, with the name of the algorithm in question in the subscript. For example, if we run QSearch on a list L of length |L| with t marked items and a success probability of at least 1 − ϵ, then the number of queries will be denoted by E_QSearch(|L|, t, ϵ) in the expected case, and by W_QSearch(|L|, ϵ) in the worst case.

Expected query complexity of QSearch
Our implementation of QSearch is based on the implementation of Boyer et al. [7], which takes as input a list L of length |L| and a unitary/oracle that gives coherent access to the function g : L → {0, 1}. Let t = |{x ∈ L : g(x) = 1}| be the unknown number of marked items in L, which is assumed to satisfy t ≤ 3|L|/4. We first introduce the algorithm QSearch∞, which consists of the following five steps:
1. Set λ = 6/5, and initialize m = λ.
2. Choose j uniformly at random from the set of non-negative integers smaller than m.
3. Apply j Grover iterations to the uniform superposition of all items in the list.
4. Observe the list register.
5. If the observed item is marked, return it and exit; otherwise, set m = min(λm, √|L|), and go back to step 2.
Note that QSearch∞ will always find a marked item if there is one, but will run forever if there are none. To obtain an algorithm with a finite stopping time, one can add an appropriate time-out, in which case the algorithm has some probability of failing (reporting no marked items when there are in fact some). Boyer et al. note that the case t > 3|L|/4 can be disposed of in constant time using classical sampling. In order to simulate the behaviour of the above algorithm numerically, we must include the classical sampling and time-out features explicitly.
In fact, in Appendix A.1 we show that it is not necessary to assume 0 < t ≤ 3|L|/4, and therefore the classical sampling part is optional. However, in case of many marked items, drawing classical samples is more efficient than applying Grover iterations, and for this reason we keep the classical sampling phase in our implementation of QSearch below, with the number of classical samples N samples as a hyperparameter. We discuss how to pick an optimal N samples in Section 2.1.2.
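For intuition, the query behaviour of QSearch∞ on a list with t > 0 marked items can be emulated classically by sampling each cycle's measurement outcome from the known Grover success probability sin²((2j+1)θ), θ = arcsin(√(t/|L|)). The sketch below is ours, caps m at √|L| as in Boyer et al., and is an illustration only, not the estimation procedure of Section 3.

```python
import math
import random

def emulate_qsearch_inf(N, t, lam=6/5, rng=None):
    """Return the number of Grover iterations QSearch_inf would use on a
    list of size N with t > 0 marked items (one classical sample path)."""
    rng = rng or random.Random(0)
    theta = math.asin(math.sqrt(t / N))
    m, iterations = lam, 0
    while True:
        j = rng.randrange(math.ceil(m))  # non-negative integer smaller than m
        iterations += j
        # Measuring after j Grover iterations succeeds w.p. sin^2((2j+1)*theta).
        if rng.random() < math.sin((2 * j + 1) * theta) ** 2:
            return iterations
        m = min(lam * m, math.sqrt(N))   # grow m by a factor lambda, capped at sqrt(N)

runs = [emulate_qsearch_inf(1024, 1, rng=random.Random(s)) for s in range(500)]
avg_iterations = sum(runs) / len(runs)  # on the order of sqrt(1024) = 32, times constants
```

Averaging many such sample paths recovers the expected iteration count empirically, which is exactly the quantity the bounds of this section control analytically.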

Implementation
We now give our implementation of QSearch(L, N_samples, ϵ). Here, L is the list that contains marked and unmarked items, N_samples is the number of classical samples we take, and 1 − ϵ is the required lower bound on the success probability. We also define λ = 6/5 and α = 9.2.
Our implementation works as follows. After sampling at most N_samples items from L classically, we execute N_runs Grover runs, where a single Grover run is an application of QSearch∞ with a time-out of Q_max = α√|L| queries (the Grover-run loop of Algorithm 2). The number of runs depends on the desired success probability: N_runs = ⌈log_3(1/ϵ)⌉. A single application of QSearch∞ with time-out consists of several Grover cycles. That is, for a single run of QSearch∞ with time-out, we first initialize m = λ, and then repeatedly (i) pick a non-negative integer j less than m, (ii) do j Grover iterations and (iii) measure; if we don't find a marked item, we increase m by a factor of λ. Steps (ii)-(iii) will be referred to as a Grover cycle, see Algorithm 3. Finally, a single application of the Grover iterate will be referred to as a Grover iteration.
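The two time-out parameters can be computed directly from the quantities above; the small helper below (ours, for illustration) evaluates N_runs = ⌈log_3(1/ϵ)⌉ and Q_max = α√|L| with α = 9.2.

```python
import math

def n_runs(eps):
    """Number of time-limited QSearch_inf runs needed for failure
    probability at most eps (each run fails w.p. at most 1/3)."""
    return math.ceil(math.log(1 / eps, 3))

def q_max(list_len, alpha=9.2):
    """Per-run query time-out Q_max = alpha * sqrt(|L|)."""
    return alpha * math.sqrt(list_len)
```

For example, ϵ = 0.01 gives N_runs = 5, and a list of 10⁶ items gives Q_max ≈ 9200.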
In Appendix A (precisely Appendices A.1-A.3) we show that QSearch, as given by Algorithm 2, has the properties stated in the lemma below.

Algorithm 2 QSearch(List L, non-negative integer N_samples, ϵ > 0)
1: x ← ClassicalSampling(L, N_samples)
2: if x is marked then
3: return x
4: end if
5: r ← 0
6: while r < N_runs do
7: m ← λ, Q_sum ← 0
8: Sample a non-negative integer j less than m uniformly at random
9: while Q_sum + j ≤ Q_max do
10: y ← GroverCycle(L, j)
11: if y is marked then
12: return y
13: else
14: Q_sum ← Q_sum + j
15: m ← min(λm, √|L|)
16: Sample a non-negative integer j less than m uniformly at random
17: end if
18: end while
19: r ← r + 1
20: end while
21: return No marked item found

1: function ClassicalSampling(List L, non-negative integer N_samples)
2: k ← 0
3: while k < N_samples do
4: Sample an element x from L uniformly at random
5: if x is marked then
6: return x
7: end if
8: k ← k + 1
9: end while
10: return No marked item found
11: end function

Algorithm 3 GroverCycle
1: function GroverCycle(List L, non-negative integer j)
2: Prepare uniform superposition over all elements of L
3: Do j Grover iterations
4: Measure list-index register
5: return measurement outcome
6: end function

Lemma 4 (Worst-case expected complexity of QSearch). Let L be a list, g : L → {0, 1} a Boolean function, N_samples a non-negative integer and ϵ > 0, and write t = |g^{−1}(1)| for the (unknown) number of marked items of L. Then QSearch(L, N_samples, ϵ), as described by Algorithm 2, finds and returns an item x ∈ L such that g(x) = 1 with probability at least 1 − ϵ if one exists, using an expected number of queries to g given by E_QSearch(|L|, t, N_samples, ϵ) (Eq. (3)). If no marked item exists, then the expected number of queries to g equals the number of queries needed in the worst case, denoted by W_QSearch(|L|, N_samples, ϵ) and given by Eq. (4). In these formulas, c_q is the number of queries to g required to implement the oracle O_g |x⟩ |0⟩ = |x⟩ |g(x)⟩, and α = 9.2.
Note that our obtained expression for E_QSearch(|L|, t, N_samples, ϵ) is actually independent of ϵ for t > 0, because our upper bound for E_Grover(|L|, t) is independent of ϵ; this is a consequence of the fact that we do not have a lower bound on the failure probability (see Appendix A.3 for details). Also, for the case 1 ≤ t ≤ |L|, even though it might appear to be, the expected number of queries for the classical sampling part is not actually linear in |L|, since it contains a factor of t/|L|.

Optimizing N_samples
As mentioned in the beginning of Section 2.1, the number of classical samples we use for QSearch is a hyperparameter, and in this subsection we discuss how it can be optimized to improve the performance of the algorithm on different inputs. Classical sampling requires fewer queries than Grover search when a large fraction of the items is marked, whereas it is more efficient not to use classical sampling at all when only a small number of them is marked. When a fraction f of the items is marked, an expected 1/f classical queries are required to find one. To determine when classical sampling is more efficient than Grover search, we compare this quantity with the number of queries made by Grover search. That is, given a list L of size |L|, we want to find out for which values of f the inequalities in Eq. (5) hold. The fraction at which equality is attained we call f_0. When f < f_0, the Grover part requires fewer queries than classical sampling does, and therefore setting N_samples > 0 (i.e. turning the classical sampling part on) increases the query count compared to having N_samples = 0.
We can numerically compute f_0(|L|) for different values of |L|. For |L| ≥ 260, the value of 1/f_0 that makes the rightmost inequality in Eq. (5) an equality is plotted as a function of |L| in Fig. 1a. In practice, we do not know the fraction of marked items f = t/|L|. For certain algorithms, we might have some information about what f can be, and in such cases this information can be leveraged to our advantage (see e.g. [8]). If we have no prior knowledge of the number of marked items at all, we can assume every value of t is equally likely; in this case, the expected number of queries for QSearch is obtained by averaging E_QSearch over all values of t (Eq. (6)). Numerically minimizing this expression as a function of N_samples for small list sizes yields the graph in Fig. 1b.
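The crossover fraction f_0 can be located by simple bisection once a computable expression for the Grover cost is fixed. The sketch below uses an illustrative stand-in for E_Grover (the 9.2·√(|L|/t) term from the worst-case bound), not the paper's exact expression, so the resulting value is only indicative:

```python
import math

def find_f0(n_items, e_grover, lo=1e-9, hi=1.0, iters=100):
    """Bisection for the fraction f0 at which classical sampling (expected
    1/f queries) and Grover search cost equally many queries.
    e_grover(n, t) is a stand-in for the bound E_Grover(|L|, t)."""
    gap = lambda f: 1.0 / f - e_grover(n_items, f * n_items)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if gap(mid) > 0:   # classical still costs more here: f0 lies above mid
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative stand-in (NOT the paper's expression): E_Grover ~ 9.2*sqrt(n/t)
f0 = find_f0(10_000, lambda n, t: 9.2 * math.sqrt(n / t))
```

With this particular stand-in the |L|-dependence cancels; the paper's full bound does depend on |L|, which is why Fig. 1a shows f_0 varying with the list size.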

Worst-case query complexity of QSearch
If we make use of a slightly different implementation of QSearch described by Zalka [34], we can obtain a tighter bound on its worst-case performance, which will be useful for those applications of QSearch where we only care about the worst case (the expected complexity of this variant is actually worse than that of the QSearch implementation described in the previous section). Unfortunately, the bounds given in [34] are only asymptotic and ignore, for example, extra constants arising from rounding integers, and so we briefly re-derive them below. The main upshot is that the dependence on the error becomes quadratically better than in the usual implementation, which could end up being a significant improvement for our algorithms. To distinguish this implementation of QSearch from the one outlined in Section 2.1, we will refer to this quantum sub-routine as QSearch_Zalka.
As usual, let L be the list of items over which we are searching, and suppose that t of them are marked, and that we want to succeed in finding one, if it exists, with probability ≥ 1 − ϵ. The algorithm, based on the one described in [34], consists of the following steps: 1. A preliminary step that checks for a small number of marked items, by systematically ruling out t = 1, t = 2, . . . , t = t_0 for t_0 = ⌈ln ϵ / (2 ln(3/4))⌉ (for reasons made clear in Appendix A.4), by running exact Grover search for each value of t. If this step finds a marked item, we return it and stop.
2. A second step where (now with the knowledge from the first step that t_0 < t) we repeatedly choose an integer j uniformly at random from the range [0, ⌈(π/4)√(|L|/t_0)⌉] and then run Grover search using j iterations. This is done 2t_0 times (which minimises the part of the complexity that depends on |L|). If a marked item is found during any run, we return it and stop. Otherwise, the algorithm returns 'no marked item'.
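The parameters of this two-step procedure are easy to compute explicitly. The sketch below assumes the second-step range is ⌈(π/4)√(|L|/t_0)⌉, as in the description above; the function name is ours:

```python
import math

def zalka_parameters(n_items: int, eps: float):
    """Parameters of QSearch_Zalka: the cutoff t0 of the exhaustive first
    step, the upper end of the random range for j in the second step, and
    the number of second-step repetitions (2 * t0)."""
    t0 = math.ceil(math.log(eps) / (2 * math.log(3 / 4)))
    j_max = math.ceil(math.pi / 4 * math.sqrt(n_items / t0))
    return t0, j_max, 2 * t0
```

Note that t_0 grows only logarithmically in 1/ϵ, which is the source of the quadratically better error dependence compared to repeating the whole search ⌈log_3(1/ϵ)⌉ times.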

Run-time analysis
The worst-case run-time is clearly when there are no marked items, and hence both steps above are run to the end. In Appendix A.4 we prove the following lemma.
Lemma 5 (Worst-case complexity of QSearch_Zalka). Let L be a list of items, g : L → {0, 1} a Boolean function and ϵ > 0, and write c_q for the number of queries to g required to implement the oracle O_g|x⟩|0⟩ = |x⟩|g(x)⟩. Then, with probability of failure at most ϵ, QSearch_Zalka requires at most the number of queries to g given explicitly in Appendix A.4 to find a marked item of L, or otherwise to report that there is none.

Quantum maximum finding QMax
We use the quantum maximum-finding algorithm from Ahuja and Kapoor [1], described below, and provide an improved analysis of the expected number of queries their algorithm makes. The input to the algorithm is once again a list L, together with a function R : L → ℝ that assigns a value to each item. The output is the index of an element of L that maximises R.
We assume coherent oracle access to the marking functions f_i defined in Eq. (7) by f_i(j) = 1 if R(j) > R(i) and f_i(j) = 0 otherwise, i.e. access to unitaries O_{f_i} that act as O_{f_i}|j⟩|0⟩ = |j⟩|f_i(j)⟩.

Infinite-time algorithm
To start with, we define a zero-error, infinite-time algorithm for finding the maximum. Afterwards, we incorporate a time-out into this infinite algorithm, and use Markov's inequality to turn it into a bounded-error algorithm that terminates in finitely many steps.

Algorithm 4 QMax_∞
1: function QMax_∞(List L)
2:     Choose i ∈ L uniformly at random and set y ← R(i)
3:     while True do
4:         Apply QSearch_∞ to the list L with the marked items being f_y^{-1}(1)
5:         Update y ← R(j), where j ∈ L is the item found by QSearch_∞
6:     end while
7:     return y
8: end function

We say that QMax_∞ has found the maximum when y in Algorithm 4 equals the value of an item that maximises R. In Appendix B.1, we prove the lemma stated below.

Lemma 6 (Expected complexity of QMax_∞). Let L be a list of |L| items. Then the expected number of queries to any of the f_i (as defined in Eq. (7)) required for QMax_∞ to find the maximum of L is upper bounded by the expression in Eq. (9), where F(|L|, t) is defined by Eq. (3). Here, c_q is the number of queries to f_i required to implement the oracle O_{f_i} (which we assume to be the same for all i).
In case we are interested in queries to R rather than the f i , then we note that the total number of queries to any of the f i combined is equal to the total number of queries to R (since for each f i we need to compute R(i) only once, and this we do anyway at the end of every Grover run to check if the found item is marked), except at the very beginning, where in line 2 of Algorithm 4 we need to compute R(y). Therefore, the number of queries to R is upper bounded by Eq. (9) plus one. Consequently, when running QMax ∞ a total of T times, if we switch from upper bounding queries to any of the f i to queries to R, we need to add T to our obtained bound for the former to obtain a bound for the latter.
Next, we can use the bounds on the expected number of queries made by QSearch_∞ to bound the expected number of queries made by QMax_∞: substituting the upper bound on E_Grover from Section 2.1 into Eq. (9) gives an explicit upper bound in the form of a sum over t. If the list L is not too large, we can compute this upper bound by evaluating the sum explicitly. If this computation becomes too time-consuming, we can instead resort to bounds that are easier to evaluate. Two such bounds are derived in Appendix B.2 and given below.
We have a loose upper bound, as well as a tighter upper bound involving Spence's function Li_2 (also known as the dilogarithm); both are derived in Appendix B.2. The leading-order term in |L| of the second bound has a smaller coefficient than that of the first.

Finite-time bounded-error algorithm
Next, we introduce a timeout Q_timeout = 3·E_QMax_∞ to turn QMax_∞ into a finite-time, bounded-error algorithm. If X is the random variable corresponding to the number of queries made by QMax_∞ in order to find the maximum of the list L, then by Markov's inequality Pr[X ≥ 3·E_QMax_∞] ≤ 1/3, resulting in a maximum-finding algorithm that finds the maximum with probability at least 2/3 and uses an expected number of queries that is upper-bounded by Q_timeout.
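The timeout-and-boost arithmetic can be written down directly; the helper below (name ours) takes the separately computed value of E_QMax_∞ as input:

```python
import math

def qmax_query_bound(e_qmax_inf, eps):
    """A timeout of 3 * E[QMax_inf] gives success probability >= 2/3 by
    Markov's inequality; ceil(log_3(1/eps)) repetitions, keeping the best
    item found, boost this to >= 1 - eps."""
    q_timeout = 3 * e_qmax_inf
    reps = math.ceil(math.log(1 / eps, 3))
    return q_timeout, reps, reps * q_timeout
```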
We can further boost the success probability to 1 − ϵ by repeating the above ⌈log_3(1/ϵ)⌉ times and picking the largest element (with respect to the function R) found across all repetitions, obtaining an algorithm QMax that succeeds with probability at least 1 − ϵ and makes at most ⌈log_3(1/ϵ)⌉·Q_timeout queries in expectation.
Corollary 1 (Expected complexity of QMax). Let L be a list of items of length |L|, and let the f_i be the marking functions defined in Eq. (7). Then the expected number of queries to the f_i (for any i) required for QMax to find the maximum of L with success probability at least 1 − ϵ is at most ⌈log_3(1/ϵ)⌉·Q_timeout = 3⌈log_3(1/ϵ)⌉·E_QMax_∞.

Estimating complexities under uncertainty
To estimate the query complexities of QSearch and QMax, we can use the bounds derived in the previous sections for their expected- and worst-case complexities. These bounds take as input a list L, the desired success probability of the sub-routine, and, in the case of QSearch, also the number t of marked items in L. However, the number of marked items will not be known ahead of time, and moreover could be computationally expensive to determine classically, especially for very large inputs, which are precisely the ones for which we want to estimate the run-times of quantum algorithms. Additionally, for algorithms of the form of Algorithm 1 that make repeated calls to QSearch or QMax, we would like every such call to succeed with high probability, which requires boosting their success probabilities to something inversely proportional to the number of times they are run. However, the number of times each sub-routine is called will often not be known until the algorithm has finished executing, and therefore we will require a reliable upper bound T on the number of steps, i.e. the number of times such calls are made.
In this section, we discuss how to deal with both quantities. First, in Section 3.1, we discuss how to estimate the number of marked items using a sampling procedure, and consider the extra complications that arise when we use such estimated values to compute (bounds on) the complexities of algorithms. Next, in Section 3.2, we discuss how the total number of steps affects the accuracy of both QSearch and QMax, as well as the accuracy of the estimates for the expected number of queries made by these quantum routines as obtained through the sampling procedure discussed in Section 3.1.

Estimating the number of marked items
To use the bounds on the number of queries made by QSearch derived in Section 2.1, we need to know how many marked items there are in the list given as input to a QSearch call. For sufficiently small lists, we can count the number of marked items exactly at reasonably little computational cost. For longer lists this becomes very time-consuming, leading to exceedingly slow simulations. When determining the number of marked items exactly becomes infeasible, we can instead estimate it, by counting the number of samples l we need to draw on average before we find a marked item.
For a list L with t marked items, the probability that an element of L chosen uniformly at random is marked is f = t/|L|. Consequently, the probability that we find the first marked item on the k-th draw when choosing elements of L uniformly at random with replacement (the first k − 1 elements not being marked) is given by Pr[l = k] = (1 − f)^{k−1} f, i.e. a geometric distribution with parameter f. We write l ∼ Geo(f) to denote a random variable sampled according to this distribution, and throughout this section, when we take the expectation value over l it is implied that we do this over the geometric distribution, i.e. E[X(l)] = Σ_{k=1}^∞ (1 − f)^{k−1} f X(k) for any function X : ℕ → ℝ. Since E[l] = 1/f, sampling l ∼ Geo(f) gives an unbiased estimate of 1/f = |L|/t, which we can use to approximate the expected number of queries made by QSearch if it were run on L.
To start, we focus on our upper bound for the expected number of (quantum) queries (E_Grover) to the oracle O_g made by QSearch. In order to estimate (an upper bound for) E_Grover(|L|, t), we would like an estimator E^estimator_Grover such that the procedure (i) sample l ∼ Geo(f), and then (ii) plug the result into our expression for E^estimator_Grover(l), gives, in expectation over l, an upper bound on E_Grover(|L|, t); i.e. we want E[E^estimator_Grover(l)] ≥ E_Grover(|L|, t). A naive attempt at constructing E^estimator_Grover would be to take our upper bound on E_Grover(|L|, t) from Eq. (2) and replace 1/t by l/|L| in this expression. However, this upper bound is a concave function of 1/t (it involves a square-root and a logarithm), so by Jensen's inequality the expectation of the plug-in estimator lies below the bound evaluated at E[l]. As a consequence, the procedure outlined above does not (in expectation over l) give an upper bound on the expected number of queries; instead we obtain a biased estimator that underestimates our upper bound for E_Grover. Note that this issue of concavity will arise in any approach that tries to simulate Grover search with an unknown number of marked items by using classical sampling to estimate the fraction of marked items.
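This bias is easy to observe numerically: the sample mean of l ~ Geo(f) is an unbiased estimate of 1/f, but any concave transformation of l underestimates the transformed mean. A small, purely illustrative Monte Carlo check:

```python
import math
import random

def sample_geometric(f, rng):
    """Number of uniform draws (with replacement) until the first marked
    item: l ~ Geo(f)."""
    l = 1
    while rng.random() >= f:
        l += 1
    return l

rng = random.Random(0)
f = 0.1  # true fraction of marked items
ls = [sample_geometric(f, rng) for _ in range(200_000)]
mean_l = sum(ls) / len(ls)                             # close to 1/f = 10
mean_sqrt_l = sum(math.sqrt(l) for l in ls) / len(ls)
# Jensen: E[sqrt(l)] < sqrt(E[l]), so the naive plug-in underestimates
```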
We now discuss how to deal with the concavity of the square-root and logarithm, and then describe an estimator that always upper bounds E Grover in expectation.

Upper bound for square-root and log estimates
We prove the following two lemmas in Appendix C.1. Together they show that, after correcting for the concavity bias (for the logarithm this amounts to evaluating it at e^γ l, with γ the Euler-Mascheroni constant), the resulting estimator E^estimator_Grover(l) of Eq. (13) upper bounds E_Grover in expectation for all 1 ≤ t ≤ |L| (equivalently 1 ≤ 1/f ≤ |L|), and that the corresponding estimator for the full algorithm upper bounds E_QSearch in expectation for 1 ≤ t ≤ |L| and all ϵ > 0, where the expectation value is taken over the geometric distribution l ∼ Geo(f), with f = t/|L|.
The above estimator applies to the situation where there is at least one marked item. However, in the case t = 0, the procedure for determining l would keep drawing samples (with replacement) indefinitely. To make sure our algorithm terminates in finite time, we need to set a maximum l_max such that, if l = l_max, we conclude that t = 0 and stop the sampling procedure. The particular choice of l_max will depend on our tolerance for falsely concluding that there are no marked items.
In practice, we use the procedure Estimate_QSearch(L, N_samples, δ, ϵ) described in Algorithm 5 for estimating (an upper bound on) the expected number of queries to g made by QSearch.
Recall that c_q is the number of queries to g required to implement the oracle O_g.

Algorithm 5 Estimate_QSearch
 1: function Estimate_QSearch(List L, non-negative integer N_samples, δ > 0, ϵ > 0)
 2:     l_max ← ⌈|L|/δ⌉
 3:     Draw samples uniformly at random (with replacement) from L until either finding a marked item, or making l_max samples. Let l be the number of samples taken.
 4:     if l ≤ N_samples and a marked item was found then
 5:         A marked item would have been found classically: E ← l
 6:     else if N_samples < l ≤ l_max and a marked item was found then
 7:         The marked item would not have been found classically, and some Grover iterations would have been performed: E ← the estimator H of Eq. (13) (which involves the corrected argument e^γ l)
 8:     else (no marked item found within l_max samples)
 9:         Conclude t = 0, and therefore, by Eq. (4), E ← N_samples + 9.2 c_q ⌈log_3(1/ϵ)⌉ √|L|
10:     end if
11:     return E
12: end function

Proof. This follows from Lemma 9, except that now we have to take into account the possibility that we sample l = l_max even when t ≥ 1. Recall that in Algorithm 5 we set l_max = ⌈|L|/δ⌉. Assuming the worst case f = 1/|L|, Markov's inequality shows that this can happen with probability at most E[l]/l_max = (1/f)/l_max ≤ |L|/⌈|L|/δ⌉ ≤ δ. This gives a very rough upper bound on the failure probability, as it does not take the fraction of marked elements into account. In practice we found that setting δ = 1/100 worked well. Also note that, to obtain more accurate estimates, rather than sampling l once and plugging it into H, we can sample l multiple times and take the sample average of the corresponding H-values. This will, however, require a somewhat more elaborate failure-probability analysis in case some of the sampled l's are equal to l_max.
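A compact sketch of this estimation procedure is given below. The corrected estimator H of Eq. (13) is passed in as a function argument (`h_estimator`), since we do not reproduce its exact form here; the three branches mirror the case split of Algorithm 5, and all names are ours:

```python
import math
import random

def estimate_qsearch(items, is_marked, n_samples, delta, eps,
                     h_estimator, c_q=2, rng=random):
    """Sketch of Estimate_QSearch (Algorithm 5). h_estimator(l) stands in
    for the corrected Grover-cost estimator H of Eq. (13)."""
    l_max = math.ceil(len(items) / delta)
    l, found = 0, False
    while l < l_max and not found:
        l += 1
        found = is_marked(rng.choice(items))
    if found and l <= n_samples:
        return l                                  # found classically
    if found:                                     # n_samples < l <= l_max
        return n_samples + c_q * h_estimator(l)   # Grover cost estimate
    # no marked item within l_max samples: conclude t = 0, charge worst case
    return (n_samples
            + 9.2 * c_q * math.ceil(math.log(1 / eps, 3)) * math.sqrt(len(items)))
```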

Unknown number of steps
When running an algorithm of the form of Algorithm 1, we require all calls to the quantum subroutines QSearch or QMax to succeed in order to guarantee that the overall algorithm works correctly. This requires boosting their success probabilities and, in order to do so, we need to know how many times each one is called, which in our case means knowing how many steps the overall algorithm will take. For heuristic algorithms, the number of steps is usually not known. However, it is not uncommon for heuristic algorithms that their typical behaviour on certain practical problem instances is known, as is the case for, for example, MAX-SAT and community detection [8].
If nothing is known about the number of steps, we can instead (i) guess an upper bound on the number of steps, (ii) run the classical algorithm that emulates the quantum algorithm, and (iii) check retroactively that we indeed required fewer steps than the guessed upper bound. If our guess was too low, we can optionally increase it and repeat.
Suppose that T is such a (guessed) upper bound on the total number of steps of an algorithm of the form of Algorithm 1, meaning that we call QSearch or QMax at most T times. By the union bound, given a desired probability of failure of at most ϵ_total, the accuracies of the individual subroutines should be set such that ϵ_subroutine ≤ ϵ_total/T. This formula in fact applies both to the accuracy of the quantum subroutines (denoted by ϵ in the sections above) and to the parameter δ (which determines l_max) in Lemma 10 for the procedure Estimate_QSearch(L, N_samples, δ, ϵ), since this estimation procedure is called once per step of the (classical simulation of the) algorithm.
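The per-call accuracy setting is then a one-liner; the helper below (name ours) also returns the corresponding number of boosting repetitions ⌈log_3(1/ϵ_subroutine)⌉ per call:

```python
import math

def boosting_schedule(eps_total, t_upper):
    """Union bound: with at most t_upper subroutine calls, a per-call
    failure probability of eps_total / t_upper caps the total failure
    probability at eps_total."""
    eps_sub = eps_total / t_upper
    reps = math.ceil(math.log(1 / eps_sub, 3))  # boosting repetitions per call
    return eps_sub, reps
```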

Use-case: max-k-sat
In this section, we take the tools developed in Sections 2 and 3 and apply them to a particular heuristic, called a hill climber, for finding (approximate) solutions to Boolean satisfiability problems. The algorithm discussed is of a form that admits a quantum speedup of the type of Algorithm 1. This section achieves the modest goal of numerically confirming that the proposed framework works, but is limited in its depth; for a more comprehensive and detailed numerical study (of a different computational problem) we refer the reader to [8] (see also Section 1.2).

Propositional Boolean Satisfiability (k-SAT)
k-sat is a fundamental problem in computer science and artificial intelligence, in which we ask whether a satisfying assignment exists for a given Boolean formula in conjunctive normal form with the property that each clause contains at most k literals. Whilst k-sat is a decision problem, max-k-sat is an optimization problem that generalizes it: the problem of determining the maximum number of clauses that can be made true by an assignment of truth values to the variables of the formula. Let x ∈ {0, 1}^n be a bit string of length n, let C = {C_i}_{i=1}^m be a set of m clauses, each acting on at most k literals, and let φ(x) denote the (weighted) number of clauses of C satisfied by x; the goal is to maximise φ. This problem is NP-hard for any k ≥ 2. A straightforward heuristic for solving max-k-sat instances is based on hill-climbing: the general idea is to start with some initial bit string, and then look for incremental improvements in the direct neighbourhood of this bit string. This process is repeated iteratively until it has converged to some local maximum or the maximum number of iterations is reached. Hill-climbing belongs to the family of local search methods in mathematical optimization. Local search heuristics have been widely studied for SAT and MAX-SAT (see Ref. [25] for an extensive review of local search methods) and are still being actively studied: see Refs. [2, 10] for some more recent works.
For max-k-sat, we define the d-level neighbourhood N_d(x) of a bit string x as the set of all other bit strings that differ from x in at most d bit flips. The total size of this space is given by |N_d(x)| = Σ_{i=1}^{d} C(n, i). For our hill-climber heuristic, we consider either a simple hill climber, which greedily moves to an arbitrary neighbouring bit string with a strictly larger objective function value, or a steep ascent hill climber, which computes φ on all bit strings in the neighbourhood of the current bit string and picks the one that maximises the increase in φ (assuming its objective function value is strictly larger than that of the current bit string). If we write T for the number of moves made by either the simple or the steep ascent hill climber (which in general will differ per problem instance, and in the case of the simple hill climber also depends on the internal randomness of the algorithm), the worst-case time complexities of both algorithms have similar mathematical expressions, because the per-step complexities have the same worst-case upper bounds. However, in practice the expected run-time of the simple hill climber depends on the instance and the current state of the algorithm: the more bit strings in its neighbourhood increase the objective function value, the faster it completes its local search step in expectation. If, at step t of the algorithm, we write f_{d,t} for the fraction of bit strings in N_d(x_t) for which φ assumes a value larger than φ(x_t), i.e. f_{d,t} = |{y ∈ N_d(x_t) : φ(y) > φ(x_t)}| / |N_d(x_t)|, then we can bound the expected per-step cost of the simple hill climber in terms of 1/f_{d,t}.
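Since N_d(x) consists of all strings differing from x in at least one and at most d positions, its size is a simple binomial sum; a minimal sketch:

```python
from math import comb

def neighbourhood_size(n: int, d: int) -> int:
    """|N_d(x)|: number of bit strings of length n that differ from x in at
    least 1 and at most d positions (x itself excluded)."""
    return sum(comb(n, i) for i in range(1, d + 1))
```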

Quantum heuristics for max-k-sat
Both variants of the hill-climber search routine lend themselves easily to Grover-based speedups.
To start with, given a bit string y, we define the function f_y by f_y(x) = 1 if φ(x) > φ(y) and f_y(x) = 0 otherwise. We assume that, for every bit string y ∈ {0, 1}^n, we have oracle access to O_{f_y}.
Lemma 11 (Simple quantum hill-climber). Let φ be a max-k-sat instance on n variables, and assume oracle access to each of the O_{f_y} as described above. Then there exists a quantum algorithm Simple quantum hill-climber that, with probability ≥ 2/3, behaves identically to a classical simple hill climber, and whose expected number of calls to φ is bounded as derived in the proof below.
Proof. We pick an initial bit string as we would for the classical simple hill climber. Next, suppose that in step t of the algorithm our current best bit string is x_t. We replace the local search step over the d-level neighbourhood in the aforementioned classical simple hill climber by a single call to QSearch using the oracle O_{f_{x_t}}. Writing f_{d,t} for the fraction of neighbours of x_t whose objective function value is strictly larger than φ(x_t), by Lemma 1 we require an expected number of at most O(√(1/f_{d,t}) log(1/ϵ)) queries to O_{f_{x_t}} and O(√(1/f_{d,t}) log(n_d/ϵ)) other elementary operations to find such a candidate x_{t+1} with probability at least 1 − ϵ. If we set ϵ = 1 − (2/3)^{1/T}, our overall success probability is at least (1 − ϵ)^T = 2/3, as required. Since each query to any of the O_{f_{x_t}}'s requires O(1) queries to φ, the lemma statement follows.

Lemma 12 (Steep quantum hill-climber). Let φ be a max-k-sat instance on n variables, and assume oracle access to each of the O_{f_y} as described above. Then there exists a quantum algorithm Steep quantum hill-climber that, with probability ≥ 2/3, behaves identically to some classical steep ascent hill climber, and whose expected number of calls to φ is bounded as derived in the proof below.
Proof. The proof is similar to that of Lemma 11, except that instead of the classical maximum-finding routine we use the quantum subroutine QMax of Lemma 3, which also requires access to each of the O_{f_{x_t}}'s. We can find the maximum using O(√(n_d) log(1/ϵ)) queries to the O_{f_{x_t}}'s with probability ≥ 1 − ϵ, where n_d = |N_d(x_t)|. We set ϵ in the same way as for the simple quantum hill climber to obtain the desired success probability.

Numerics
In this section, we describe our numerical implementations of the classical and quantum versions of the steep and simple hill climbers. We then compare the expected number of queries for the quantum and classical versions when applied to typical problem instances of max-k-sat using the method developed in Sections 2 and 3.

Algorithmic implementations
Classical hill climbers. Both classical algorithms are allowed to sample without replacement when searching for a good (or the best) element. Therefore, for the simple hill climber, the number of samples X_{d,t} needed when searching over a list of |N_d(x_t)| elements at step t, of which a fraction f_{d,t} are good, has expected value E[X_{d,t}] = (|N_d(x_t)| + 1)/(f_{d,t}|N_d(x_t)| + 1). For the steep hill climber, the classical number of queries at every step is always equal to |N_d(x)|.
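This expectation follows from the standard fact that the expected 1-indexed position of the first of K good elements among N, drawn without replacement, is (N+1)/(K+1). A quick simulation check of that fact (illustrative only):

```python
import random

def expected_first_good_draw(n: int, k_good: int) -> float:
    """Expected 1-indexed position of the first good element when drawing
    without replacement from n elements, k_good of which are good."""
    return (n + 1) / (k_good + 1)

# Monte Carlo verification of the identity
rng = random.Random(1)
n, k = 20, 4
trials = 100_000
total = 0
for _ in range(trials):
    perm = list(range(n))
    rng.shuffle(perm)
    # values < k play the role of the "good" elements
    total += 1 + min(i for i, v in enumerate(perm) if v < k)
sim_mean = total / trials
```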

Quantum hill climbers
For the quantum algorithms, we set the desired failure probability ϵ for the entire algorithm to be at most 10^{-5}, which can be achieved by setting the accuracy per step to ϵ/T, with T the maximum total number of steps. Empirically, we found that T = n provides a very loose upper bound on the total number of steps taken by the algorithm. Note that the value of T could be optimised more thoroughly (leading to a smaller total number of queries needed in the quantum setting), but we leave this aside as it is beyond the main goal of this case study.
For the simple hill climber we use two implementations, one that calculates the number of marked elements t (in this case marked elements correspond to possible moves that increase the objective function value) exactly at every step, and one that acquires only an estimate of this via sampling. The exact implementation keeps track of the list of all marked elements at every step, which allows us to use our sharper bounds from Lemma 4 in Section 2.1 to upper bound the expected number of queries made by QSearch at each step of the algorithm. From this list of marked elements, we select an element at random and use that as an update step for the classical simulation.
The sampling algorithm, just like its classical counterpart, instead samples in search of elements that increase the cost function. When it finds one, it uses the number of tries l it took to find the marked item as input to estimate (an upper bound on) the expected number of queries QSearch would have made, as described in Section 3, in order to estimate the run-time of the quantum algorithm. This procedure is simply an implementation of Algorithm 5 for the case of max-k-sat.
The steep ascent hill climber also keeps track of the complete list of marked items at every step. From this it selects the item with the maximal function value increase. It uses our bounds from Section 2.3 to attain estimates of the expected number of queries QMax would have made for every step, in order to estimate the run-time of the entire algorithm.

Numerical implementation
We write the problem as a matrix multiplication problem and use numpy to solve it, which allows larger instances to be tested. The assignment of truth values x ∈ {0, 1}^n is written as a vector x̃ ∈ {−1, 1}^n, where −1 is assigned to variables that are false and 1 to those that are true. The clauses C can be written in a similar fashion, C̃_i ∈ {−1, 0, 1}^n, where −1 is assigned to the negated variables, 0 to the variables that are not in the clause, and 1 to those that should be true according to the clause. We construct a matrix A with the C̃_i's as rows. This matrix has an efficient sparse representation since most of its entries are 0. The objective function then becomes φ̃(x̃) = W^T ⌈(Ax̃ + k)/(2k)⌉, where W^T is a row vector containing the weights of the clauses and k is the number of variables per clause. The addition of k and the division by 2k are element-wise, whilst Ax̃ is a matrix-vector multiplication. Note that −k ≤ (Ax̃)_i ≤ k, where the left-hand inequality is attained only when all variables of clause i are incorrectly assigned; in that case the ceiling function returns 0, and in all other cases it returns 1, as required. In all numerical simulations, d (which determines the level of the neighbourhood considered) is set to 1.
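The objective evaluation described above can be sketched in numpy as follows. Helper names are ours, and we assume (as in the paper's generated instances) that every clause contains exactly k literals, since the ceiling trick would miscount shorter clauses:

```python
import numpy as np

def build_clause_matrix(clauses, n):
    """Rows are clauses: -1 for a negated variable, +1 for a positive one,
    0 for an absent one. `clauses` holds signed 1-based literals, e.g.
    [1, -3] encodes (x1 OR NOT x3)."""
    A = np.zeros((len(clauses), n), dtype=np.int8)
    for i, clause in enumerate(clauses):
        for lit in clause:
            A[i, abs(lit) - 1] = np.sign(lit)
    return A

def objective(A, w, x_pm, k):
    """Weighted number of satisfied clauses for x in {-1,+1}^n. A clause is
    unsatisfied iff its entry of A @ x equals -k (every literal wrong), so
    ceil((A @ x + k) / (2k)) is a 0/1 satisfaction indicator. Assumes every
    clause has exactly k literals."""
    return float(w @ np.ceil((A @ x_pm + k) / (2 * k)))
```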
Sampling implementation At every step, the sampling algorithm samples up to d (which determines the size of the neighbourhood of x̃ that the hill climber can search over; in our case d = 1) indices of x̃ and flips their values by multiplying by −1. It then calculates the objective function φ̃(x̃) and accepts the change if the cost increased, rejecting it otherwise. This is repeated until the algorithm rejects 10n times in a row, at which point we assume that the algorithm has converged.
Exact implementation At every step, the exact implementation calculates the cost increase of every possible change to x̃. This is done by constructing the matrix B(x̃), which consists of n copies of x̃ as columns, with all diagonal elements multiplied by −1; its columns represent all possible changes of x̃ at a single step. We can then use φ̃(B(x̃)) to calculate the cost of all possible changes simultaneously, giving a vector ỹ containing the cost values of all n possible new configurations of x̃ (assuming d = 1). These values are compared to the old cost value, and those that give a positive increase are saved in a list of marked elements: we consider the variables for which a change (multiplication by −1) incurs a positive increase in the objective function as marked, and all other variables as unmarked. The size of this list gives the exact value of t, the number of marked variables. The exact implementation of the simple quantum hill-climber selects one marked element from this list at random; the steep quantum hill-climber selects the marked element with the highest cost value.
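The all-neighbours evaluation via B(x̃) can be sketched as follows (function names are ours; d = 1 as in the text):

```python
import numpy as np

def neighbour_matrix(x_pm):
    """n columns, column i being x with its i-th entry flipped (multiplied
    by -1): all d = 1 neighbours of x at once."""
    n = len(x_pm)
    B = np.tile(x_pm.reshape(-1, 1), (1, n))
    B[np.arange(n), np.arange(n)] *= -1
    return B

def marked_variables(A, w, x_pm, k):
    """Indices whose single-bit flip strictly increases the objective, plus
    the exact count t of marked variables."""
    sat = np.ceil((A @ neighbour_matrix(x_pm) + k) / (2 * k))  # m x n
    neighbour_costs = w @ sat               # cost of each of the n neighbours
    current = float(w @ np.ceil((A @ x_pm + k) / (2 * k)))
    marked = np.flatnonzero(neighbour_costs > current)
    return marked, len(marked)
```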

Data structure
The exact implementation is feasible because we use matrix multiplication to calculate the cost values. However, it can still be quite slow for larger instances. To remedy this, to an extent, we add an extra data structure that keeps track of the list of marked variables, rather than reconstructing it at every step. To do so, we use the fact that any update is in some sense local. Let i be the index of x̃ that is updated. Then there is a subset of clauses {C_j | A_{ji} ≠ 0} (rows of A whose i-th entry is non-zero); these are the only clauses whose satisfaction status can change by flipping the i-th entry of x̃. Not all variables of x̃ are contained in these clauses (only k variables get a non-zero value in each clause), and exactly the variables that are can change from being marked to unmarked and vice versa. Hence we only need to consider this subset of variables when updating the list of marked variables, which severely reduces the computational cost of keeping track of marked items. As it turns out, this is efficient enough that running time is no longer the limitation; memory becomes the bottleneck instead.

Results
Here we present our results for estimating the run-times of the two quantum algorithms described previously. Specifically, we estimate the number of queries to any of the marking functions f_y from Section 4.2 by applying the bounds obtained in Section 2. We set c_q = 2, since the quantum algorithms for max-k-sat make queries to an oracle O_{f_y}, which requires 2 queries to f_y to implement.
We could also have chosen to count queries to φ instead. Note that, after finding x_t at step t, we know φ(x_t) from the checking part of QSearch_∞ (used as a subroutine for both QSearch and QMax), so every query to f_{x_t} corresponds to one query to φ. The difference between counting queries to the marking functions and counting queries to φ occurs at initialisation, where one extra query to φ is needed to compute the function value of the initial bit string. Hence, for a total of T calls to either QSearch or QMax, the number of queries to φ equals the number of queries to the marking functions plus T. This relationship holds for both the classical and quantum query counts. In our comparison, we chose to count queries to the marking functions, because this is where the speedup manifests itself.
Figure 2: Numerical results for the query counts, on randomly generated max-k-sat instances with n variables and m = rn clauses with weights drawn uniformly at random from [0, 1], of the proposed classical and quantum algorithms implementing a hill-climber search routine. All hill climbers consider only a d = 1 level neighbourhood, so the local search space at any step has size n. The horizontal axis indicates the total number of variables n and the vertical axis the number of queries made to any of the marking functions. Each data point corresponds to the average over 10 randomly generated instances, and the shaded area represents one standard deviation.
In every sub-figure the inset plots the fraction of weighted satisfied clauses, defined as $\varphi(x^*)/W$, where $W = \sum_{i \in [m]} w_i$ is the total weight of the m clauses and $\varphi(x^*)$ the objective function value of the obtained solution $x^*$; the x-axis of the insets is the number of variables n. The blue and orange lines in the sub-figures overlap, which shows that the quality of the solutions found is comparable across the different algorithms. The classical algorithms are indicated by a '•', the respective quantum algorithms by a '×'.
We tested our algorithms on different instances of max-k-sat to see what kind of speed-ups can be attained on average-case instances. The instances were generated using a random assignment of k variables per clause. Figure 2 shows the average number of queries made by our classical and quantum algorithms. There, n is the number of variables, k the number of variables per clause, and r the clause-to-variable ratio, so that the number of clauses is m = rn. We observe that the behaviour is very similar across the different parameter choices in the random max-k-sat generation. We find, as one might expect, that both quantum versions of the steep and simple hill climbers achieve better asymptotic scaling than their classical counterparts: here better asymptotic scaling means that we expect the polynomial describing the number of queries made to the cost function to have a lower degree for the quantum algorithm than for the classical one. This is indicated by the difference in slope of the plots in Figure 2: since the number of queries is plotted against the problem size on a log-log scale, the slope gives information about the degree of this polynomial, provided n is large enough. However, only the simple quantum hill climber also beats the classical algorithm in terms of the absolute number of queries for the problem sizes considered (since only in this case does the plot of the quantum query count go below the classical one). Since the steep quantum hill climber achieves better scaling, we expect that for slightly larger n (larger than $10^4$) it will also start to beat its classical counterpart on average. The interesting point here is that, even for a fairly simple model that only takes query counts into consideration, the problem sizes already need to be quite large in order to achieve a quantum speedup. Table 1 shows the empirically observed asymptotic scaling behaviour of our algorithms.
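The random instance generation described above can be sketched as follows. This is a hedged illustration (the function name, the 50% negation probability, and treatment of duplicate clauses are our assumptions; the paper only states that each clause assigns k variables at random, with uniform weights in [0, 1]):

```python
import random

def random_max_k_sat(n, k, r, seed=None):
    """Generate a random weighted MAX-k-SAT instance with m = r*n clauses.

    Each clause picks k distinct variables uniformly at random, negates each
    with probability 1/2, and carries a weight drawn uniformly from [0, 1].
    Sketch only; the paper's generator may differ in details.
    """
    rng = random.Random(seed)
    m = round(r * n)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(n), k)
        literals = [(v, rng.random() < 0.5) for v in variables]  # (var, negated?)
        clauses.append((literals, rng.random()))                 # weight in [0, 1]
    return clauses
```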
By taking a linear fit in the log-log plot we can estimate the scaling exponents of the different algorithms. In Table 1 we show the relative speedup of our quantum algorithms compared to their classical counterparts. We see that part of the theoretical speedup is lost. This is likely due to a combination of two facts: the theoretical speedup is a per-step speedup that does not affect the total number of steps taken, only the number of queries required for each individual step; and on relatively small instances the extra overhead required to run the quantum algorithms is significant.

Figure 3: Several numerical results for the sampling and exact methods for the simple hill climber on random 2-SAT instances with m = 3n clauses (r = 3) and $N_{samples} = 130$: a) average query count, b) average running times, and c) peak memory usage. The first plot compares the query counts of the sampling and exact methods. Note how for the smallest value of n our sampling method fails to yield a proper upper bound: this is because $N_{samples} > n$, so with high probability we fail to invoke Grover at all, and as a consequence we underestimate the expectation value (since the contribution from Grover to the expectation value is large). The second plot compares the running times of the exact method (with and without the data structure) and the sampling method. The third plot compares the peak memory usage of the exact method (with the data structure) and the sampling method. The exact method without the data structure is not shown in the third plot, as it has the same memory usage as the sampling method.
As discussed in Section 3, when instances become too large we can no longer use an exact method to keep track of the number of marked items. In Figure 3 we show a comparison between the exact methods and our proposed sampling method for estimating an upper bound on the expected number of queries, for $N_{samples} = 130$. We find that our estimation method provides a decent upper bound on the expected number of queries. For the exact methods we consider two different implementations for acquiring the information needed to calculate the expected number of queries at every step. The first runs over the entire search space at every step to count the number of marked items. The second uses the data structure (described in Section 4.3.2) that exploits the locality of the instances to update the fraction of 'good elements' in the neighbourhood of a given bit string.
Regarding the run-times of our classical simulations, Figure 3 shows that both the sampling and the data-structure methods considerably outperform the exact implementation that runs over the entire search space. However, the extra data structure comes at the cost of additional memory requirements, which become the bottleneck as we consider problems at a larger scale. Therefore, for instances where $n > 10^4$, we are limited to the sampling methods to obtain results. Finally, we note that the data-structure method is very context-specific (here the data structure is tailored to max-k-sat) and might not always be possible, whereas the estimation method is generally applicable.

Table 1: The theoretically obtained per-iteration complexities of our algorithms compared to their empirically observed speedups across the entire algorithm. Here 'absolute speedup' refers to the quantum algorithm making fewer (estimated) queries than the classical algorithm on the datasets that we considered. The numbers in the rightmost column measure the speedup achieved by the quantum algorithm: they are obtained by a weighted linear fit on the plots of Figure 2, which gives the scaling exponent of the expected query count as a function of the problem size; the number in the table is the classical exponent divided by the corresponding quantum exponent. The numbers are larger than one in all cases, indicating a (modest) quantum speedup. The maximum speedup that can be obtained is 2, which would correspond to the full quadratic per-step speedup manifesting across the entire run-time. Note that the steep hill climber would likely also achieve an absolute speed-up if we considered slightly larger problem instances, as it achieves better scaling than its classical counterpart.
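The exponent estimate behind Table 1 can be sketched as a least-squares slope in log-log space. This is a minimal unweighted version (the paper uses a weighted fit, and the data below is purely illustrative, not from the paper):

```python
import math

def scaling_exponent(ns, queries):
    """Least-squares slope of log(queries) vs log(n): the empirical degree
    of the polynomial describing the query count."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(q) for q in queries]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Illustrative data only: classical ~ n^2, quantum ~ n^1.5.
ns = [100, 200, 400, 800, 1600]
classical = [3.0 * n ** 2.0 for n in ns]
quantum = [5.0 * n ** 1.5 for n in ns]
speedup = scaling_exponent(ns, classical) / scaling_exponent(ns, quantum)
```

On exact power laws the fit recovers the exponents, so the reported speedup is the ratio of the fitted degrees; a ratio of 2 would correspond to the full quadratic per-step speedup.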

Summary of results
Our main findings can be summarised as follows.
• The quantum hill climbers obtain favourable scaling compared to their classical counterparts, but only one of them (the simple hill climber) obtained an absolute (query) speedup compared to its classical counterpart.
• Our estimation procedure gave reliable upper bounds on the complexities of the quantum algorithms as compared to an exact procedure, confirming our theoretical analysis from Section 3.
• Our estimation procedure significantly decreased the computational cost of obtaining run-time estimates in the way considered in this paper. An exact approach that made use of a particular data structure yielded similar results; however, it added large memory costs, and such an approach will always be very context-specific and sometimes not possible at all.
• Classical heuristic algorithms tend to work by making many fast-to-compute but small updates to minimize the cost function, a structure that does not lend itself to significant quantum speedups.
Funding CC was supported by QuantERA project QuantAlgo 680-91-034, with further funding provided by QuSoft and CWI. MF and JW were supported by the Dutch Ministry of Economic Affairs and Climate Policy (EZK), as part of the Quantum Delta NL programme. IN was supported by the DisQover project: a collaboration between QuSoft and ABN AMRO, and received funding from ABN AMRO and CWI.

A Detailed analysis of QSearch
In this section of the appendix we give details to support the bounds on the success probability and expected number of queries made by our implementation of QSearch given in Section 2.1.
As mentioned in the beginning of Section 2, queries to the quantum oracle $O_g$ come with a weight of $c_q$ relative to the classical queries to g. In Sections A.1 and A.2, a query refers to a query to $O_g$. Only in Section A.3 will we include the classical queries, and then the queries to the quantum oracle will be multiplied by an extra factor of $c_q$ (where needed) in the expressions obtained for the expected number of queries to g for QSearch.

A.1 Improved bounds
To start with, let us briefly go over the original analysis of Boyer et al. [7], and improve some of the bounds where we can. The analysis in this subsection applies to QSearch ∞ .
Suppose we have a list L with t marked items, and let θ be such that
$$\sin^2(\theta) = t/|L| .$$

Moreover, let
$$m_t = \frac{1}{\sin(2\theta)} .$$
Now, the following lemma provides a lower bound on the success probability of finding a marked item with a single Grover run.
Lemma 13 (Lemma 2 from [7]). Suppose we have a list L with t marked items, let θ be such that $\sin^2(\theta) = t/|L|$, and let $m \in \mathbb{N}_{>0}$ be an arbitrary positive integer. Then the probability $P_m$ of finding a marked element after doing j Grover iterations, where j is a non-negative integer smaller than m chosen uniformly at random, is given by
$$P_m = \frac{1}{2} - \frac{\sin(4m\theta)}{4m\sin(2\theta)} .$$
Consequently, if $m \geq m_t$, then $P_m \geq \frac{1}{4}$.

According to the algorithm description of QSearch$_\infty$, we initialise m = λ, with λ = 6/5. After every run, we multiply m by λ. The moment $m > m_t$, we reach the so-called critical stage. As Boyer et al. observe, because of Lemma 13, once in the critical stage every run has probability at least 1/4 of finding a marked item, and this lower bound can be used to upper bound the expected number of Grover iterations required to find a marked item.
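The probability of success after j iterations is $\sin^2((2j+1)\theta)$, so Lemma 13 (and the many-marked-items regime discussed next) can be probed numerically. The closed form below is the standard Boyer et al. expression; this sketch simply averages over j uniform in {0, ..., m-1} and compares:

```python
import math

def p_m_direct(theta, m):
    """Mean of sin^2((2j+1) theta) over j = 0, ..., m-1."""
    return sum(math.sin((2 * j + 1) * theta) ** 2 for j in range(m)) / m

def p_m_closed(theta, m):
    """Closed form from Lemma 13 (valid whenever sin(2 theta) != 0)."""
    return 0.5 - math.sin(4 * m * theta) / (4 * m * math.sin(2 * theta))
```

Once $m \ge m_t = 1/\sin(2\theta)$, the oscillating term is at most 1/4 in magnitude, which gives the $P_m \ge 1/4$ guarantee of the critical stage.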
The issue with the requirement $m \geq m_t$ is that, when $\theta \to \pi/2$, $m_t \to \infty$. For this reason, Boyer et al. exclude the regime of θ close to π/2 (which corresponds to the case of many marked items) by classical sampling. However, as we show below, in the regime $|L|/4 < t \leq |L|$, or $\pi/6 \leq \theta \leq \pi/2$, we actually have $P_m \geq 1/4$ for every integer m > 0. More precisely, we have the following lemma.
Combining the above, we have that $P_m \geq 1/4$ whenever $m \geq m_t$ or $|L|/4 < t \leq |L|$.
To start with, let us bound the expected number of queries of QSearch ∞ to the quantum oracle.

Lemma 15. The expected number of queries $E^{Quantum}_{QSearch_\infty}$ to the quantum oracle $O_g$ used by QSearch$_\infty$ when applied to a list L with t marked items can be upper bounded by
$$E^{Quantum}_{QSearch_\infty} \leq \frac{9}{2} m_t + \lceil \log_\lambda(m_t) \rceil - 3 , \qquad (20)$$
where λ = 6/5 (see Footnote 15).

Footnote 15: In terms of queries to the function g, a check actually requires one query to g, not $c_q$ queries to g. Since $c_q \geq 1$, we are working with upper bounds, and the checking only happens $\lceil \log(m_t) \rceil$ times, we choose to count each check as $c_q$ queries to g to keep the formulas clean. As a side note, when we say that we know the value of R for QMax or φ for max-k-sat 'from the previous step', this value comes from these classical checks (which occur as a subroutine of both QSearch and QMax).

For $1 \leq t < |L|/4$, we can repeat the analysis of Boyer et al. Before we reach the critical stage, i.e. during the cycles for which $m < m_t$, the number of queries is upper bounded by the geometric sum of the cycle lengths. Once $m \geq m_t$, we are in the critical stage, and by Lemma 13, $P_m \geq \frac{1}{4}$; this yields an upper bound on the expected number of queries in the critical stage. Including the upper bound on the expected number of queries before the critical stage, we arrive at Eq. (20) (see Footnote 17).
It should be noted that the bound $\frac{9}{2} m_t + \lceil \log_\lambda(m_t) \rceil - 3$ actually holds for all t, but only becomes useful when we can further bound $m_t$ (as we do in the next section, which requires that the number of marked items is not too large, e.g. $1 \leq t \leq |L|/4$).

A.2 Success probability
Next, we focus on QSearch as described by Algorithm 2, which includes a time-out, making it a finite-time bounded-error algorithm whose success probability and complexity we analyse in the subsections that follow.
As a consequence of Lemma 14, we do not need to first sample classically to exclude the case of many marked items before using Grover, because the success probability of a single run is at least 1/4 also in the regime of many marked items. Hence, the success probability of QSearch only depends on the success probability of the Grover search part (lines 6-19 of Algorithm 2), which we investigate next. Note that, in our implementation of QSearch given in Algorithm 2, we do include the classical sampling part because it can make the algorithm efficient in the regime of many marked items.
If t = 0, then the Grover search part will run the maximum number $N_{runs}$ of Grover runs and return 'no marked item found', which means it always returns the correct answer. Thus, we can restrict ourselves to the case t > 0. In this case, the Grover part can only fail when every Grover run fails. For a single Grover run to fail, it has to not find a marked item before the time-out, meaning that QSearch$_\infty$ would have required more than $Q_{max}$ queries to find a marked item.

Footnote 17: The reason that we pick λ = 6/5 is that it minimises the coefficient of the dominant term $m_t$, which by the above two expressions is given by $c(\lambda) = \frac{1}{2}\frac{\lambda}{\lambda-1} + \frac{\lambda}{4-3\lambda}$, on the interval $\lambda \in (1, 4/3)$. In particular, the choice λ = 6/5 is optimal.
In conclusion, given a failure probability of at most ϵ > 0, recall that we execute at most $N_{runs} = \lceil \log_3(1/\epsilon) \rceil$ Grover runs. Since each individual run fails with probability at most 1/3, the probability that QSearch succeeds satisfies
$$P_{success} \geq 1 - (1/3)^{N_{runs}} \geq 1 - \epsilon .$$
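The choice $N_{runs} = \lceil \log_3(1/\epsilon) \rceil$ suggests that each Grover run is engineered to fail with probability at most 1/3; under that assumption (ours, for illustration) the amplification bound can be checked numerically:

```python
import math

def runs_needed(eps):
    """Number of Grover runs: N_runs = ceil(log_3(1/eps))."""
    return math.ceil(math.log(1 / eps, 3))

def success_lower_bound(eps):
    """If each run independently fails with probability at most 1/3, all
    N_runs runs fail together with probability at most (1/3)^N_runs <= eps."""
    return 1 - (1 / 3) ** runs_needed(eps)
```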

A.3 Expected number of queries
In this section of the appendix, we upper bound the expected number of queries to g made by QSearch.

A.3.1 Classical sampling part
For fixed |L| and t, the probability that a vertex drawn uniformly at random is marked is given by the fraction $f = t/|L|$. Now, if we draw at most $N_{samples}$ classical samples uniformly at random, and then use Grover search if all $N_{samples}$ samples turn out to be unmarked, this takes a total of
$$\frac{1 - (1-f)^{N_{samples}}}{f} + (1-f)^{N_{samples}}\, c_q\, E_{Grover}$$
queries to g in expectation, where $E_{Grover}$ is the expected number of queries to the quantum oracle $O_g$ made by all $N_{runs}$ Grover runs of QSearch combined. If t = 0, then the above expression becomes $N_{samples} + c_q E_{Grover}$. Hence, for $1 \leq t \leq |L|$, we conclude that (see Footnote 20)
$$E_{QSearch} \leq \frac{1 - (1-f)^{N_{samples}}}{f} + (1-f)^{N_{samples}}\, c_q\, E_{Grover} .$$
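The classical part of this expectation has a simple closed form: the number of uniform draws up to and including the first marked one, capped at $N_{samples}$, has expectation $\sum_{k=1}^{N}(1-f)^{k-1} = (1-(1-f)^N)/f$. A quick check of this identity against direct summation of the geometric distribution:

```python
def expected_classical_samples(f, n_samples):
    """Closed form for E[min(l, N)] with l ~ Geometric(f): the number of
    uniform draws up to and including the first marked one, capped at N."""
    return (1 - (1 - f) ** n_samples) / f

def expected_by_summation(f, n_samples, cutoff=20000):
    """Direct expectation of min(l, N) from the geometric pmf."""
    total = sum((1 - f) ** (k - 1) * f * min(k, n_samples)
                for k in range(1, cutoff + 1))
    return total + (1 - f) ** cutoff * n_samples  # remaining tail, capped at N
```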

A.3.2 Grover part
Next, we investigate the expected number of queries E Grover to O g in the Grover part of QSearch.
No marked items in the list. If t = 0, then every run runs to completion without finding a marked element. In total, a single run executes at most $Q_{max}$ queries to $O_g$. Since for t = 0 we perform all $N_{runs}$ runs, the expected total number of queries to $O_g$ in the case of no marked items is upper bounded by
$$E_{Grover} \leq N_{runs}\, \alpha \sqrt{|L|} \leq 9.2\, N_{runs} \sqrt{|L|} ,$$
where in the last inequality we used the expression for α in Eq. (24).
Marked items in the list, i.e. t > 0. To start with, we want to bound the number of Grover iterations executed in a single run. To do so, we first examine a single run of QSearch$_\infty$, i.e. without a timeout. For $k \in \mathbb{N}$, let us write $p_k$ for the probability that the random variable X (introduced in Appendix A.2, corresponding to the number of queries to $O_g$ needed for QSearch$_\infty$ to find a marked item) assumes the value k, i.e. the probability that a single run of QSearch$_\infty$ would have found a marked item using a total of k queries. A single Grover run fails exactly when QSearch$_\infty$ would have timed out. Hence, in terms of the probabilities $\{p_k\}_{k \in \mathbb{N}}$, the probability of a single Grover run succeeding is given by
$$q_{success} = \sum_{k=0}^{Q_{max}} p_k .$$
Conditioned on the outcome that step 2 finds a marked item, which happens with probability $q_{success}$, a single run requires in expectation
$$Q_{success} = \frac{1}{q_{success}} \sum_{k=0}^{Q_{max}} p_k\, k$$
queries. Due to Lemma 16, we have that $Q_{success} \leq E[X]$.

Footnote 20: Interestingly, this expression is the same as the one we would have obtained using the following procedure: with probability $(1-f)^{N_{samples}}$ we do Grover search, and with probability $1 - (1-f)^{N_{samples}}$ we classically sample vertices, of which we need $1/f$ in expectation. The nice thing about the implementation we use is that it mimics this behaviour without knowing f in advance (meaning that we are not flipping a coin that returns heads with probability $(1-f)^{N_{samples}}$).

Lemma 16. Let $\mathcal{P} = \{p_k\}_{k=0}^{\infty}$ be a discrete probability distribution on $\mathbb{N}$ and let $E = \sum_{k=0}^{\infty} p_k k$ be the expectation value of sampling a number from $\mathbb{N}$ according to $\mathcal{P}$. If there is a promise that the sampled number k is at most some value $K \in \mathbb{N}$, the resulting probability distribution $\mathcal{P}'$ is renormalised by $p_K = \sum_{k=0}^{K} p_k$, that is, $p'_k = p_k / p_K$ for $0 \leq k \leq K$. Now, we claim that
$$E' = \sum_{k=0}^{K} p'_k\, k \leq E ,$$
i.e. the expectation value of drawing a number below K according to $\mathcal{P}'$ is bounded from above by the original expectation value E.
Proof. We distinguish two cases. Case 1: $K \leq E$. In this case every value in the support of $\mathcal{P}'$ is at most K, so $E' \leq K \leq E$.

Similarly, conditioned on the outcome that we do not find a marked item (due to the timeout), which happens with probability $q_{fail} = 1 - q_{success}$, the number of queries $Q_{fail}$ in a single run can trivially be bounded. Because we execute at most $N_{runs}$ Grover runs, the expected number of queries in its entirety is given by a sum over the runs, weighted by the probability that all preceding runs failed. This series can be simplified by using the derived expression for E(f) in Eq. (30) with $f = 1 - q_{fail}$ and $N_{samples} = N_{runs} - 1$; we thus obtain a bound on the expected number of queries for t > 0. However, since we do not have a lower bound on the failure probability $q_{fail}$ of a single Grover run, the best we can do with the resulting $1 - q_{fail}^{N_{runs}}$ term is to upper bound it by 1, which is equivalent to taking the $N_{runs} \to \infty$ limit. As a consequence, for t > 0, our upper bound for $E_{Grover}$ is independent of the number of Grover runs $N_{runs}$. In the resulting upper bound for $E_{Grover}$, $Q_{fail}$ is bounded by Eq. (35), and $Q_{success}$, given by Eq. (33), will be bounded in the subsection below.
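The claim of Lemma 16, that conditioning on the sampled number being at most K can only lower the mean, is easy to probe numerically:

```python
# Numeric check of Lemma 16: conditioning a distribution on {k <= K}
# can only decrease (or preserve) its mean.
def truncated_expectation(p, K):
    """Mean of k under p renormalised to the event k <= K."""
    mass = sum(p[:K + 1])
    return sum(k * pk for k, pk in enumerate(p[:K + 1])) / mass

p = [0.1, 0.2, 0.15, 0.05, 0.3, 0.2]          # a distribution on {0,...,5}
full_mean = sum(k * pk for k, pk in enumerate(p))
```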

A.3.3 In conclusion
Given a list L with t marked items, a failure probability of ϵ, and a maximum number of classical samples N samples , QSearch(L, N samples , ϵ) executes at most N runs = ⌈log 3 (1/ϵ)⌉ Grover runs.
If t = 0, then by Eqs. (29) and (32), the expected total number of queries to g is bounded from above accordingly. If t > 0, then we use Eq. (31). By Eq. (37), $E_{Grover}$ satisfies the bounds below; in the second line we have used Eq. (22). Depending on the number of marked items t, we can bound E[X] as follows.

• If $1 \leq t \leq |L|/4$, then Eq. (20) applies, and the rightmost inequality follows from the analysis leading up to Eq. (25).
• If $|L|/4 < t \leq |L|$, the corresponding bound follows from Lemma 14.

In the above formulas, α is given by Eq. (24). The expressions obtained for the expected number of queries to g are presented more concisely in Lemma 4.

A.4 Worst-case behaviour of QSearch Zalka
The two steps of this algorithm are given in Section 2.2. Here we analyse the worst-case complexity of that implementation. For the first step, recall (see Lemma 2) that when there are t marked items (and we know t), exact Grover search can find and return one with certainty. For the second step, let p be the squared overlap of the state with the marked subspace; as we show in Lemma 14, we have $p \geq 1/4$ when $t > \frac{\pi^2/4 - 1}{\pi^2/4}|L| > \frac{|L|}{4}$, and so in both cases we have $p \geq 1/4$. Plugging this lower bound on p into the expression for the total number of Grover iterations, and taking into account that a single query to $O_g$ corresponds to $c_q$ queries to g, we see that the total number of queries to g made by the algorithm is at most

B Detailed analysis of QMax
In this section we compute the expected number of queries to g made by QMax ∞ , and we give further upper bounds to the obtained expression for the expected number of queries.

B.1 Expected number of queries
Based on the proof idea of [1], we provide a more accurate proof and expression for the expected number of queries made by QMax$_\infty$ when searching the list L with t marked items, as stated in Lemma 6 (restated below for convenience).
where F (|L|, t) is defined by Eq. (3). Here, c q is the number of queries to f i required to implement the oracle O f i (which we assume to be the same for all i).
Proof. Let us use the shorthand notation Q(t) for the expected number of queries made by QSearch$_\infty$ to the quantum oracle when searching a list L with t marked items (suppressing the dependence on L for notational convenience). Note that, by Lemma 15, Q(t) can be bounded in terms of F(|L|, t), which is given by Eq. (3). Moreover, let E(t) denote the expected number of queries to the quantum oracles $O_{f_i}$ for finding the maximum when t items are marked, i.e. the expected number of queries to find the maximum given that y is set to the (t+1)-th item of L when ordered according to R in descending order. We first compute the expected number of queries to the quantum oracles $O_{f_i}$, and then include the factor of $c_q$ at the end.
We have the following recursion relation for E(t):
$$E(t) = Q(t) + \frac{1}{t} \sum_{u=0}^{t-1} E(u)$$
(because, after applying QSearch$_\infty$, with equal probability we find any one of the t marked items in L with a larger value of R than the current index y). Note that E(0) = 0. Writing Eq. (39) for t and for t − 1, multiplying by t and t − 1 respectively, subtracting the bottom equation from the top equation, and then dividing by t yields
$$E(t) = E(t-1) + Q(t) - \frac{t-1}{t}\, Q(t-1) . \qquad (40)$$
Since the above equation holds for every t, we can use the equation for t − 1 and plug it into Eq. (40), then do the same for t − 2, etc., down to t = 2. We then rewrite the resulting sums as follows (see Footnote 24):
$$E(t) = Q(t) + \sum_{u=1}^{t-1} \frac{Q(u)}{u+1} ,$$
where we have used that E(1) = Q(1). Now, since, at initialisation, QMax$_\infty$ chooses an index y uniformly at random, the expected number of queries of QMax$_\infty$ to the quantum oracles is given by averaging E(t) over the uniformly random initial index.

Footnote 24: This is where the proof of [1] becomes imprecise. The authors use upper bounds for Q(u) and Q(u − 1) in order to upper bound E(t). However, in order to obtain an upper bound for E(t), a lower bound for the Q(u − 1) terms should be used, because they come with a minus sign. The correct way to continue the proof is to postpone using upper bounds for Q(u) until after rewriting the sum.
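Under a plausible reading of the recursion (the new pivot is uniform among the t better items, so E(t) = Q(t) + (1/t)·Σ_{u=0}^{t−1} E(u) with E(0) = 0), the telescoped closed form E(t) = Q(t) + Σ_{u=1}^{t−1} Q(u)/(u+1) can be verified numerically for an arbitrary stand-in cost function Q (the Q below is illustrative, not the paper's bound):

```python
def e_recursive(Q, t, memo=None):
    """E(t) = Q(t) + (1/t) * sum_{u=0}^{t-1} E(u), with E(0) = 0."""
    if memo is None:
        memo = {0: 0.0}
    if t not in memo:
        memo[t] = Q(t) + sum(e_recursive(Q, u, memo) for u in range(t)) / t
    return memo[t]

def e_closed(Q, t):
    """Telescoped form: E(t) = Q(t) + sum_{u=1}^{t-1} Q(u) / (u + 1)."""
    return Q(t) + sum(Q(u) / (u + 1) for u in range(1, t))
```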

B.2 Upper bounds to the expected number of queries
Next, we upper bound the series that bounds the expected number of queries to g made by QMax$_\infty$ (Lemma 6) by a simpler expression. In order to prove Eq. (44), let us define $q := 1 - f \in [0, 1)$. We then investigate the resulting series without the $q^i$ factors, which turns out to be convergent for every $n \in \mathbb{N}$.

Proof.
where the final equality follows from Eq. (52).
Hence, we can take as an estimator for the number of classical queries simply min(l, N), where l is the number of items sampled from L before finding a marked one. Now we turn our attention to the quantum contribution to the number of queries, which carries the weight $(1-f)^N$. We already have from Eq. (13) an estimator $E^{estimator}_{Grover}(l)$ such that $E[E^{estimator}_{Grover}(l)] \geq E_{Grover}(|L|, t)$. We seek a function $h_2$ such that, when multiplied by $E^{estimator}_{Grover}$, we also have $E[h_2(l) E^{estimator}_{Grover}(l)] \geq (1-f)^N E_{Grover}(|L|, t)$. It remains to show that this inequality holds for the chosen $h_2$.
Lemma 18. Let $f, g : \mathbb{N} \to \mathbb{R}$ be non-negative non-decreasing functions, and let x be a random variable on $\mathbb{N}$. Then
$$E[f(x)g(x)] \geq E[f(x)]\, E[g(x)] .$$

Proof. Fix $x, y \in \mathbb{N}$. Because f and g are non-decreasing, we have $(f(x) - f(y))(g(x) - g(y)) \geq 0$, and therefore $f(x)g(x) + f(y)g(y) \geq f(x)g(y) + f(y)g(x)$. Taking the expectation over two independent copies of the random variable and dividing by two yields the claim.
Because f and g are non-negative functions on $\mathbb{N}$, f, g, and the product fg are Lebesgue integrable with respect to the probability measure. Combining Lemma 18 with the observations above, as well as the fact that both $h_2$ and $E^{estimator}_{Grover}$ are non-decreasing, we conclude that the function $H(l) = h_1(l) + h_2(l)\, c_q\, E^{estimator}_{Grover}(l)$ satisfies the required inequality for $t \geq 1$. Hence, H is an estimator that always upper bounds, in expectation, the number of queries made by QSearch when there is at least one marked item.
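Lemma 18 is a Chebyshev-type correlation inequality; a numerical spot check on a random distribution with two monotone functions (illustrative choices, not from the paper):

```python
# Check E[f(x) g(x)] >= E[f(x)] E[g(x)] for non-negative, non-decreasing f, g.
import random

def expectation(dist, h):
    """Expectation of h(k) under a distribution given as (value, prob) pairs."""
    return sum(p * h(k) for k, p in dist)

random.seed(0)
weights = [random.random() for _ in range(10)]
total = sum(weights)
dist = [(k, w / total) for k, w in enumerate(weights)]  # distribution on {0,...,9}
f = lambda k: k * k            # non-negative, non-decreasing on N
g = lambda k: 1.5 * k + 2.0    # non-negative, non-decreasing on N
lhs = expectation(dist, lambda k: f(k) * g(k))
rhs = expectation(dist, f) * expectation(dist, g)
```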