UvA-DARE

We study to what extent quantum algorithms can speed up solving convex optimization problems. Following the classical literature we assume access to a convex set via various oracles, and we examine the efficiency of reductions between the different oracles. In particular, we show how a separation oracle can be implemented using $\tilde{O}(1)$ quantum queries to a membership oracle, which is an exponential quantum speed-up over the $\Omega(n)$ membership queries that are needed classically. We show that a quantum computer can very efficiently compute an approximate subgradient of a convex Lipschitz function. Combining this with a simplification of recent classical work of Lee, Sidford, and Vempala gives our efficient separation oracle. This in turn implies, via a known algorithm, that $\tilde{O}(n)$ quantum queries to a membership oracle suffice to implement an optimization oracle (the best known classical upper bound on the number of membership queries is quadratic). We also prove several lower bounds: $\Omega(\sqrt{n})$ quantum separation (or membership) queries are needed for optimization if the algorithm knows an interior point of the convex set, and $\Omega(n)$ quantum separation queries are needed if it does not.


Introduction
Optimization is a fundamental problem in mathematics and computer science, with many real-world applications. As people try to solve larger and larger optimization problems, the efficiency of optimization becomes more and more important, motivating us to find the best possible algorithms. Recent experimental progress on building quantum computers draws attention to new approaches to the problem: can we solve optimization problems more efficiently by exploiting quantum effects such as superposition, interference, and entanglement? For many discrete optimization problems [Gro96, DH96, Szeg04, DHHM06, AŠ06] significant speed-ups have been shown, but less is known about continuous optimization problems.
One of the most successful continuous optimization paradigms is convex optimization, which optimizes a convex function over a convex set that is given explicitly (by a set of constraints) or implicitly (by an oracle). See Bubeck [Bub15] for a recent survey. Quantum algorithms for convex optimization have been considered before. In 2008, Jordan [Jor08] described a faster quantum algorithm for minimizing quadratic functions. Recently, for an important class of convex optimization problems (semidefinite optimization) quantum speed-ups were achieved using algorithms whose runtime scales polynomially with the desired precision and some geometric parameters [BS17, vAGGdW17, BKL+19, vAG19].
However, many convex optimization problems can be solved classically using algorithms whose runtime scales logarithmically with the desired precision and the relevant geometric parameters. We are aware of only one quantum speed-up which is partially in this regime, namely the very recent quantum interior point method of Kerenidis and Prakash [KP18]. In this paper we look at general convex optimization problems, considering algorithms that have such favorable logarithmic scaling with the precision.
The generic problem in convex optimization is minimizing a convex function $f : K \to \mathbb{R} \cup \{\infty\}$, where $K \subseteq \mathbb{R}^n$ is a convex set. We consider the setting where an interior point $x_0 \in \mathrm{int}(K)$ is given and radii $r, R > 0$ are known such that $B(x_0, r) \subseteq K \subseteq B(x_0, R)$, where $B(x_0, r)$ is the Euclidean ball of radius $r$ centered at $x_0$.
It is well-known that if the convex function is bounded on $K$, then we can equivalently consider the problem of minimizing a linear function over a different convex set $K' \subseteq \mathbb{R}^{n+1}$, namely the epigraph $K' = \{(x, \mu) : x \in K, f(x) \le \mu\}$ of $f$. Accessing $K'$ is easy given access to $K$ and $f$, and the parameters involved will be similar. Conversely, for any linear optimization problem over an unknown convex set $K$, there is an equivalent optimization problem over a known convex set (say, the ball), with an unknown bounded convex objective function $f$ that can be evaluated easily given access to $K$. From now on we therefore focus on optimizing a known linear function over an unknown convex set.
We consider the setting where access to the convex set is given only in a black-box manner, through an oracle. The five basic problems (oracles) in convex optimization identified by Grötschel, Lovász, and Schrijver [GLS88] are: membership, separation, optimization, violation, and validity (see Section 2 for the definitions). They showed that all five basic problems are polynomial-time equivalent. That is, given an oracle O for one of these problems, one can implement an oracle for any of the other problems using a polynomial number of calls to O and polynomially many other elementary operations. Subsequent work made these polynomial-time reductions more efficient, reducing the degree of the polynomials. Recently Lee et al. [LSV18], in the classical setting, showed that with $\tilde{O}(n^2)$ calls¹ to a membership oracle (and $\tilde{O}(n^3)$ other elementary arithmetic operations) one can solve an optimization problem. They did so by showing that $\tilde{O}(n)$ calls to a membership oracle suffice to do separation, and then composing this with the known fact [LSW15] (see also [LSV18, Theorem 15]) that $\tilde{O}(n)$ calls to a separation oracle suffice for optimization.
Our main result (Section 4) shows that on a quantum computer, $\tilde{O}(1)$ calls to a membership oracle suffice to implement a separation oracle, and hence (by the known classical reduction from optimization to separation) $\tilde{O}(n)$ calls to a membership oracle suffice for optimization.² Lee et al. [LSV18] use a geometric idea to reduce separation to finding an approximate subgradient of a convex Lipschitz function. They then show that $\tilde{O}(n)$ evaluations of a convex Lipschitz function suffice to get an approximate subgradient. Our contributions here are twofold (Sections 3 and 4). We use the same geometric idea, but we provide a simpler way to compute an approximate subgradient of a convex Lipschitz function (Section 3). We point out that this new algorithm is purely classical. Besides being simpler, the main advantage of our algorithm is that it is suitable for a quantum speed-up using known quantum algorithms (Jordan's algorithm) for computing approximate (sub)gradients [Jor05, GAW19], which we show in Section 4. To show our quantum speed-up, we have to extend Jordan's quantum algorithm for gradient computation to the case of convex Lipschitz functions.
¹ Here, and in the rest of the paper, the notation $\tilde{O}(\cdot)$ is used to hide polylogarithmic factors in $n$, $r$, $R$, $\varepsilon$.
² Although not stated explicitly in our results, we also use $\tilde{O}(n^3)$ additional operations for optimization using membership, like [LSV18]. This is because our quantum algorithm for separation uses only $\tilde{O}(n)$ gates in addition to the $\tilde{O}(1)$ membership queries, and we use the same reduction from optimization to separation as [LSV18]. If queries themselves have significant time complexity, then our algorithm does lead to a speedup in time complexity over the best known classical algorithm. For example, if each membership query (with the required precision) takes time $\tilde{O}(n^2)$ to implement, then our quantum algorithm for optimization has time complexity $\tilde{O}(n^3)$, while the classical algorithm will use time $\tilde{O}(n^4)$ because it uses $\tilde{O}(n^2)$ membership queries.
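The time-complexity comparison in footnote 2 can be spelled out as a short calculation. This is only a worked instance of the footnote's hypothetical scenario in which a single membership query costs $\tilde{O}(n^2)$ time; the symbol $T_{\mathrm{MEM}}$ for that per-query cost is introduced here purely for illustration.

```latex
\begin{align*}
\text{quantum:}\quad
  & \tilde{O}(n)\cdot T_{\mathrm{MEM}} + \tilde{O}(n^3)
    = \tilde{O}(n)\cdot\tilde{O}(n^2) + \tilde{O}(n^3)
    = \tilde{O}(n^3), \\
\text{classical:}\quad
  & \tilde{O}(n^2)\cdot T_{\mathrm{MEM}} + \tilde{O}(n^3)
    = \tilde{O}(n^2)\cdot\tilde{O}(n^2) + \tilde{O}(n^3)
    = \tilde{O}(n^4).
\end{align*}
```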
As a second set of results, in Section 5 we provide lower bounds on the number of membership or separation queries needed to implement several other oracles. We show that our quantum reduction from separation to membership indeed improves over the best possible classical reduction: $\Omega(n)$ classical membership queries are needed to do separation.³ We only have partial results regarding the optimality of the reduction from optimization to separation. In the setting where we are not given an interior point of the set $K$, we can prove an essentially optimal $\Omega(n)$ lower bound on the number of quantum queries to a separation oracle needed to do optimization, using the general adversary bound. This lower bound implies that a quantum computer offers no query speed-up over a classical computer for the task of finding an interior point.
However, for the case of quantum algorithms that do know an interior point, we are only able to prove an $\Omega(\sqrt{n})$ lower bound. In the classical setting, regardless of whether or not we know an interior point, the reduction uses $\Theta(n)$ queries. This raises the interesting question of whether knowing an interior point can lead to a better quantum algorithm. We therefore view closing the gap between upper and lower bound as an important direction for future work.
Finally, we briefly mention (Section 6) how to obtain upper and lower bounds for some of the other oracle reductions, using a convex polarity argument. As we show, in the setting where we are given an interior point, the relation between membership and separation is analogous to the relation between validity and optimization. In particular, our better quantum algorithm for separation using membership queries implies that on a quantum computer $\tilde{O}(1)$ queries to a validity oracle suffice to implement an optimization oracle. That is, on a quantum computer, finding the optimal value is equivalent to finding an optimizer. Also, the same polarity argument shows that algorithms for optimization using separation are essentially equivalent to algorithms for separation using optimization. In particular, this turns our lower bound on the number of separation queries needed to implement an optimization oracle into a lower bound on the reverse direction.
Figure 1 gives an informal presentation of our results; the upper bounds arise from oracle reductions, and the (change in) accuracy is ignored here for simplicity. The abovementioned polarity manifests itself in the central symmetry of the figure.

Related independent work.
In independent simultaneous work, Chakrabarti, Childs, Li, and Wu [CCLW18] discovered a similar upper bound as ours: combining the recent classical work of Lee et al. [LSV18] with a quantum algorithm for computing gradients, they show how to implement an optimization oracle via $\tilde{O}(n)$ quantum queries to a membership oracle and to an oracle for the objective function. Their proof stays quite close to [LSV18] while ours first simplifies some of the technical lemmas of [LSV18], giving us a slightly simpler presentation and a better error-dependence of the resulting algorithm. They also prove several lower bounds that are similar to the ones we prove here.
Figure 1: The top and bottom diagram illustrate the relations between the basic (weak) oracles for respectively classical and quantum queries, with boldface entries marking our new results. All upper and lower bounds hold in the setting where we know an interior point of $K$, except the *-marked $\Omega(n)$ lower bound on the number of separation queries needed for optimization. Notice the central symmetry of the diagrams, which is a consequence of polarity.

Preliminaries
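We use $[n] := \{1, 2, \ldots, n\}$. For $p \ge 1$, $\varepsilon \ge 0$, and a set $C \subseteq \mathbb{R}^n$ we let
$$B_p(C, \varepsilon) := \{x \in \mathbb{R}^n : \exists\, y \in C \text{ such that } \|x - y\|_p \le \varepsilon\}$$
be the set of points of distance at most $\varepsilon$ from $C$ in the $\ell_p$-norm. When $C = \{x\}$ is a singleton set we abuse notation and write $B_p(x, \varepsilon)$. We overload notation by setting
$$B_p(C, -\varepsilon) := \{x \in \mathbb{R}^n : B_p(x, \varepsilon) \subseteq C\}.$$
Whenever $p$ is omitted it is assumed that $p = 2$.
Recall that a function $f : C \to \mathbb{R}$ is $L$-Lipschitz if $|f(y) - f(y')| \le L \|y - y'\|$ for all $y, y' \in C$.
Definition 1 (Subgradient). Let $C \subseteq \mathbb{R}^n$ be convex and let $x$ be an element of the interior of $C$. For a convex function $f : C \to \mathbb{R}$ we denote by $\partial f(x)$ the set of subgradients of $f$ at $x$, i.e., those vectors $g$ satisfying
$$f(y) \ge f(x) + \langle g, y - x \rangle \quad \text{for all } y \in C.$$
Note that in the above definition $\partial f(x) \neq \emptyset$ due to convexity. If $f : C \to \mathbb{R}$ is $L$-Lipschitz, then for any $x$ in the interior of $C$ and any $g \in \partial f(x)$ we have $\|g\| \le L$, as follows. Consider a $y \in C$ such that $y - x = \alpha g$ for some $\alpha > 0$. Then since $g$ is a subgradient of $f$ at $x$ we have
$$L \|y - x\| \ge |f(y) - f(x)| \ge \langle g, y - x \rangle = \|g\| \, \|y - x\|, \qquad (1)$$
and therefore $\|g\| \le L$.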
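As a concrete illustration of the subgradient inequality and of the norm bound on subgradients of a Lipschitz function, the following sketch checks both numerically. The example function (the $\ell_1$-norm, which is $\sqrt{n}$-Lipschitz in the Euclidean norm), the helper names, and the random test points are all choices made here for illustration and do not come from the paper.

```python
import numpy as np

def l1_norm(x):
    """f(x) = ||x||_1: convex and sqrt(n)-Lipschitz w.r.t. the Euclidean norm."""
    return float(np.sum(np.abs(x)))

def l1_subgradient(x):
    """sign(x) is one valid subgradient of the l1-norm at x."""
    return np.sign(x)

def check_subgradient_properties(n=5, trials=100, seed=0, tol=1e-9):
    """Check f(y) >= f(x) + <g, y - x> and ||g|| <= L on random points."""
    rng = np.random.default_rng(seed)
    lipschitz = np.sqrt(n)
    for _ in range(trials):
        x, y = rng.normal(size=n), rng.normal(size=n)
        g = l1_subgradient(x)
        if l1_norm(y) < l1_norm(x) + g @ (y - x) - tol:
            return False  # subgradient inequality violated
        if np.linalg.norm(g) > lipschitz + tol:
            return False  # norm bound violated
    return True
```

Calling `check_subgradient_properties()` is expected to return `True`, since both properties hold exactly for this function.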
We will assume familiarity with quantum computing [NC00]. In particular, a standard quantum oracle corresponds to a unitary transformation that acts on two (finite-dimensional) registers, where the first register contains the query and the answer is added to the second register. For example, a function evaluation oracle for $f : X \to Y$ would map $|x, 0\rangle$ to $|x, f(x)\rangle$, where $|x\rangle$ and $|f(x)\rangle$ are basis states corresponding to binary representations of $x$ and $f(x)$ respectively. Unlike classical algorithms, quantum computers can apply such an oracle to a superposition of different $x$'s. They are also allowed to apply the inverse of a unitary oracle.
The standard quantum oracle described above models problems where there is a single correct answer to a query.When there are multiple good answers (for instance, different good approximations to the correct value) and the oracle is only required to give a correct answer with high probability, then we will work with the more liberal notion of relational quantum oracles.
Definition 2 (Relational quantum oracle). Let $F : X \to \mathcal{P}(Y)$ be a function, such that for each $x \in X$ the subset $F(x) \subseteq Y$ is the set of valid answers to an $x$ query. A relational quantum oracle for $F$ which answers queries with success probability $\ge 1 - \rho$ is a unitary $U$ that for all $x \in X$ maps
$$U : |x, 0, 0\rangle \mapsto \sum_{y \in Y} \alpha_{x,y} |x, y, \psi_{x,y}\rangle,$$
where $|\psi_{x,y}\rangle$ denotes some normalized quantum state and $\sum_{y \in F(x)} |\alpha_{x,y}|^2 \ge 1 - \rho$. Thus measuring the second register of $U|x, 0, 0\rangle$ gives a valid answer to the $x$ query with probability at least $1 - \rho$. This definition is very natural for cases where the oracle is implemented by a quantum algorithm that produces a valid answer with probability $\ge 1 - \rho$. In order to achieve our quantum speed-ups we will always assume access to the inverse $U^\dagger$ of the relational oracle as well, which is justified if $U$ comes from an efficiently implementable quantum algorithm.

Oracles for convex sets
The five basic oracles for a convex set $K$ that we consider are as follows (in contrast with the original [GLS88], we allow some error probability $\rho$ in these oracles as in [LSV18]). Throughout we will assume that real vectors are represented with $\mathrm{polylog}(nR/(r\varepsilon))$ bits of precision per coordinate. In particular, we assume that the input/output of the following oracles is represented this way.⁴
Definition 3 (Membership oracle $\mathrm{MEM}_{\varepsilon,\rho}(K)$). Queried with a vector $y \in \mathbb{R}^n$, the oracle, with success probability $\ge 1 - \rho$, correctly asserts one of the following:
• $y \in B(K, \varepsilon)$,
• or $y \notin B(K, -\varepsilon)$.
Definition 4 (Separation oracle $\mathrm{SEP}_{\varepsilon,\rho}(K)$). Queried with a vector $y \in \mathbb{R}^n$, the oracle, with success probability $\ge 1 - \rho$, correctly asserts one of the following:
• $y \in B(K, \varepsilon)$,
• or $y \notin B(K, -\varepsilon)$,
and in the second case it returns a unit vector $g \in \mathbb{R}^n$ such that $\langle g, x \rangle \le \langle g, y \rangle + \varepsilon$ for all $x \in B(K, -\varepsilon)$.
Definition 5 (Optimization oracle $\mathrm{OPT}_{\varepsilon,\rho}(K)$). Queried with a unit vector $c \in \mathbb{R}^n$, the oracle, with probability $\ge 1 - \rho$, does one of the following:
• it returns a vector $y \in \mathbb{R}^n$ such that $y \in B(K, \varepsilon)$ and $\langle c, x \rangle \le \langle c, y \rangle + \varepsilon$ for all $x \in B(K, -\varepsilon)$,
• or it correctly asserts that $B(K, -\varepsilon)$ is empty.
Note that the above optimization oracle corresponds to maximizing a linear function over a convex set; we could equally well state it for minimization.
Definition 6 (Violation oracle $\mathrm{VIOL}_{\varepsilon,\rho}(K)$). Queried with a unit vector $c \in \mathbb{R}^n$ and a real number $\gamma$, the oracle, with probability $\ge 1 - \rho$, does one of the following:
• it asserts that $\langle c, x \rangle \le \gamma + \varepsilon$ for all $x \in B(K, -\varepsilon)$,
• or it finds a vector $y \in B(K, \varepsilon)$ such that $\langle c, y \rangle \ge \gamma - \varepsilon$.
Definition 7 (Validity oracle $\mathrm{VAL}_{\varepsilon,\rho}(K)$). Queried with a unit vector $c \in \mathbb{R}^n$ and a real number $\gamma$, the oracle, with probability $\ge 1 - \rho$, does one of the following:
• it asserts that $\langle c, x \rangle \le \gamma + \varepsilon$ for all $x \in B(K, -\varepsilon)$,
• or it asserts that $\langle c, y \rangle \ge \gamma - \varepsilon$ for some $y \in B(K, \varepsilon)$.
If in the above definitions both $\varepsilon$ and $\rho$ are equal to 0, then we call the oracle strong. If either is non-zero then we sometimes call it weak.
The above describes the classical oracles, and the quantum oracles are defined analogously, i.e., they are relational quantum oracles (see Definition 2) that use a binary representation for the input/output vectors.
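To make the oracle definitions concrete, the following sketch implements strong (error-free) membership, separation, and optimization oracles for one very simple convex set, the Euclidean ball $B(0, R)$. The choice of set and all function names are assumptions made here for illustration; the paper treats general convex sets accessed only through black-box oracles.

```python
import numpy as np

def mem_ball(y, R):
    """Strong membership oracle for K = B(0, R)."""
    return bool(np.linalg.norm(y) <= R)

def sep_ball(y, R):
    """Strong separation oracle for K = B(0, R).

    Returns None if y is in K; otherwise returns a unit vector g with
    <g, x> <= R < <g, y> for every x in K.
    """
    norm = np.linalg.norm(y)
    if norm <= R:
        return None
    return y / norm

def opt_ball(c, R):
    """Strong optimization oracle: a maximizer of <c, x> over K = B(0, R)."""
    return R * c / np.linalg.norm(c)
```

For example, `sep_ball(np.array([3.0, 4.0]), 1.0)` returns the unit vector `(0.6, 0.8)`, which separates the query point from the unit ball.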
When we discuss membership queries, we will always assume that we are given a small ball which lies inside the convex set. It is easy to see that without such a small ball one cannot obtain an optimization oracle using only poly(n) classical queries to a membership oracle (see, e.g., [GLS88, Sec. 4.1] or the example below). As the following example shows, the same holds for quantum queries. We will use a reduction from a version of the well-studied search problem: It is not hard to see that if the access to $z$ is given via classical queries $i \mapsto z_i$, then $\Omega(N)$ queries are needed. It is well known [BBBV97] that if we allow quantum queries, i.e., applications of the unitary $|i\rangle|b\rangle \mapsto |i\rangle|z_i \oplus b\rangle$, then $\Omega(\sqrt{N})$ queries are needed. Now let $N = 2^n$ and consider an input $z \in \{0, 1\}^N$ to the search problem. Let $b \in \{0, 1\}^n$ be the index such that $z_b = 1$. Consider maximizing the linear function $\langle e, z \rangle$ (where $e$ is the all-1 vector) over the set Clearly the optimal solution to this convex optimization problem, even with a small constant additive error in the answer, gives the solution to the search problem. However, a membership query is essentially equivalent to querying a bit of $z$ and therefore $\Omega(\sqrt{N}) = \Omega(2^{n/2})$ quantum queries to the membership oracle are needed for optimization.

Computing approximate subgradients of convex Lipschitz functions
Here we show how to compute an approximate subgradient (at 0) of a convex Lipschitz function. That is, given a convex set $C$ such that $0 \in \mathrm{int}(C)$ and a convex function $f : C \to \mathbb{R}$, we show how to compute a vector $g \in \mathbb{R}^n$ such that $f(y) \ge f(0) + \langle g, y \rangle - a\|y\| - b$ for some real numbers $a, b > 0$ that will be defined later (see Lemma 12 and Lemma 18). The idea of the classical algorithm given in the next section is to pick a point $z \in B_\infty(0, r_1)$ uniformly at random and use the finite difference $\nabla^{(r_2)} f(z)$ (defined below) as an approximate subgradient of $f$ at 0; the radii $r_1$ and $r_2$ need to be chosen small to make the approximation good. This results in a slightly simplified version of the algorithm of Lee et al. [LSV18]. In Section 3.2 we show how to improve on this classical algorithm on a quantum computer.

Classical approach
In the discussion that follows we will use the following approximation of the gradient.

Definition 8 (Finite-difference gradient approximation). For a function $f : C \to \mathbb{R}$, a real $r > 0$, a point $x \in \mathbb{R}^n$ such that $B_1(x, r) \subseteq C$, and $i \in [n]$, we define
$$\nabla_i^{(r)} f(x) := \frac{f(x + r e_i) - f(x - r e_i)}{2r},$$
where $e_i \in \{0, 1\}^n$ is the vector that has a 1 only in its $i$th coordinate. Similarly we define $\nabla^{(r)} f(x) := (\nabla_1^{(r)} f(x), \ldots, \nabla_n^{(r)} f(x))$. We will also consider a similar approximation of the Laplacian (the trace of the Hessian) of a function.
Definition 9 (Finite-difference Laplace approximation). For a function $f : C \to \mathbb{R}$, a real $r > 0$, a point $x \in \mathbb{R}^n$ such that $B_1(x, r) \subseteq C$, and $i \in [n]$, we define
$$\Delta_i^{(r)} f(x) := \frac{f(x + r e_i) - 2 f(x) + f(x - r e_i)}{r^2}.$$
Similarly $\Delta^{(r)} f(x) := \sum_{i \in [n]} \Delta_i^{(r)} f(x)$. Note that for a convex function we have $\Delta_i^{(r)} f(x) \ge 0$.
The next two lemmas will be needed in the proof of the main result of this section, Lemma 12. In Lemma 10 we give an upper bound on the deviation $\|g - \nabla^{(r_2)} f(z)\|_1$ of a finite difference gradient approximation $\nabla^{(r_2)} f(z)$ from an actual subgradient $g$ at the point $z$, in terms of the finite difference Laplace approximation $\Delta^{(r_2)} f(z)$. Then, in Lemma 11 we show that in expectation (over the points of a small ball around $x$), the finite difference Laplace approximation is small. Together with Markov's inequality this gives us good control over the quality of a finite difference gradient approximation.
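The two finite-difference quantities can be sketched in a few lines of code. The sketch assumes the central-difference forms $(f(x + r e_i) - f(x - r e_i))/(2r)$ for the gradient and $(f(x + r e_i) - 2f(x) + f(x - r e_i))/r^2$ for the Laplacian; these forms are an assumption consistent with the surrounding text, and the function names are chosen here for illustration.

```python
import numpy as np

def fd_gradient(f, x, r):
    """Central finite-difference gradient of f at x with step r (2n evaluations)."""
    n = len(x)
    grad = np.empty(n)
    for i in range(n):
        step = np.zeros(n)
        step[i] = r
        grad[i] = (f(x + step) - f(x - step)) / (2 * r)
    return grad

def fd_laplacian(f, x, r):
    """Finite-difference Laplace approximation: sum of second differences."""
    n = len(x)
    total = 0.0
    for i in range(n):
        step = np.zeros(n)
        step[i] = r
        total += (f(x + step) - 2 * f(x) + f(x - step)) / r**2
    return total
```

For the quadratic $f(v) = \|v\|^2$ both approximations are exact up to rounding: the gradient at $(1, 2, 3)$ is $(2, 4, 6)$ and the Laplacian is $6$. For a convex function every term of the Laplace approximation is non-negative, and it is large exactly where the function has a kink within distance $r$ of $x$.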
. Now we can finish the proof by summing this inequality over all i ∈ [n].
Proof.Below we show that summing over i then proves the lemma. Let The last inequality above follows from multiplying the upper bound r 2 L on |h i | with the length r 2 of the integration intervals.
Note that the above lemma is stated and proved for continuous random variables, but the same proof holds if we have a uniform hypergrid over the same hypercube, providing a discrete version of the above result. In the discrete case, in order to get the same cancellations we need to assume that both $r_1$ and $r_2$ are integer multiples of the grid spacing.
We are now ready to prove the main result of this section. Informally, the next lemma proves that an approximate subgradient of a convex Lipschitz function $f$ at 0 can be obtained by an algorithm that outputs $\nabla^{(r_2)} \tilde{f}(z)$ for a random $z$ close enough to 0, where $\tilde{f}$ is an approximate version of $f$. In other words, this lemma gives us a classical algorithm to compute an approximate subgradient of $f$ using $2n$ classical queries to an approximate version of $f$.
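The classical procedure described informally above can be sketched as follows: sample $z$ uniformly from the hypercube $B_\infty(0, r_1)$ and return the central finite-difference gradient of the approximate function at $z$. The function name, the argument names, and the use of a NumPy random generator are assumptions made for illustration; the specific choices of $r_1$ and $r_2$ required by the lemma are not reproduced here.

```python
import numpy as np

def approx_subgradient_at_zero(f_tilde, n, r1, r2, rng):
    """Sketch: finite-difference gradient of f_tilde at a random point near 0.

    f_tilde : callable giving an approximate evaluation of the convex function f
    r1      : radius of the infinity-norm ball from which z is sampled
    r2      : finite-difference step size
    rng     : numpy.random.Generator
    Uses exactly 2n evaluations of f_tilde.
    """
    z = rng.uniform(-r1, r1, size=n)  # uniform point in the hypercube around 0
    g = np.empty(n)
    for i in range(n):
        step = np.zeros(n)
        step[i] = r2
        g[i] = (f_tilde(z + step) - f_tilde(z - step)) / (2 * r2)
    return g
```

If the approximation error of `f_tilde` is at most $\delta$ at every point, each coordinate of the output differs from the exact finite difference by at most $\delta / r_2$, which is why $r_2$ cannot be taken arbitrarily small.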
Note that in the last line we switched from f to f , using that ∇ (r 2 ) f (z) and ∇ (r 2 ) f (z) differ by at most δ/r 2 in each coordinate.Our choice of r 2 gives δ δL ρr 1 and by Lemma 10-11 we have into the above lower bound on f (y) concludes the proof of the lemma.

Quantum improvements
In this section we show how to improve subgradient computation of convex functions via Jordan's quantum algorithm for gradient computation [Jor05].We use the formulation given by Gilyén et al. [GAW19,Lemma 20], for which we first introduce the following definition.
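Definition 13 (Hyper-grid). For $k \in \mathbb{N}$ we define the following discretization of the interval $(-1/2, 1/2)$:
$$G_k := \left\{ \frac{j}{2^k} - \frac{1}{2} + 2^{-k-1} : j \in \{0, \ldots, 2^k - 1\} \right\}.$$
Similarly we define the $n$-dimensional hyper-grid $G_k^n$, which is the $n$-fold Cartesian product of $G_k$ with itself.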
Note that an element of $G_k^n$ can be represented using $n \times k$ (qu)bits. Basically, Jordan's algorithm just sets up a uniform superposition over all grid points, applies a "phase query" to $f$, and then a quantum Fourier transform over each coordinate.
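The three steps just described (uniform superposition, phase query, Fourier transform per coordinate) can be simulated classically for small instances by storing the full state vector. The sketch below is such a simulation, not a quantum implementation. It assumes a grid of $2^m$ equally spaced midpoints in $(-1/2, 1/2)$ per coordinate, a phase $e^{2\pi i 2^m h(x)}$, and a function $h$ that is exactly affine; the function name and the Fourier sign convention are choices made here for illustration.

```python
import numpy as np

def jordan_gradient_sim(h, n, m):
    """Classically simulate the state of Jordan's algorithm on an n-dim grid.

    h must accept an array of shape (..., n) and return an array of shape (...).
    Returns the gradient estimate read off from the most likely outcome.
    Memory use is 2**(m*n) amplitudes, so this only works for tiny n and m.
    """
    N = 2 ** m
    axis = np.arange(N) / N - 0.5 + 1.0 / (2 * N)        # assumed 1-d grid
    mesh = np.meshgrid(*([axis] * n), indexing="ij")
    points = np.stack(mesh, axis=-1)                      # shape (N, ..., N, n)
    # Uniform superposition followed by the phase query.
    state = np.exp(2j * np.pi * N * h(points)) / np.sqrt(N ** n)
    # Discrete Fourier transform over every coordinate.
    amplitudes = np.fft.fftn(state, norm="ortho")
    peak = np.unravel_index(np.argmax(np.abs(amplitudes)), amplitudes.shape)
    k = np.array(peak, dtype=float)
    k[k >= N / 2] -= N                                    # map to signed frequencies
    return k / N
```

For an exactly affine $h(x) = \langle g, x \rangle + c$ with entries of $g$ well inside $(-1/2, 1/2)$, the most likely outcome is the grid frequency nearest to $g$ in each coordinate, so the estimate is within $2^{-(m+1)}$ of $g$ coordinatewise.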
for 99.9% of the points $x \in G_m^n$, then using a single query to a phase oracle $O : |x\rangle \mapsto e^{2\pi i 2^m h(x)} |x\rangle$ Jordan's gradient computation algorithm outputs a vector $v \in \mathbb{R}^n$ such that:
We now show that the above algorithm allows us to compute an approximate subgradient of a function $f$, even if we are only given standard oracle access to a function $\tilde{f}$ which is sufficiently close to $f$. In particular, we will assume we are given access to a standard unitary oracle of a function $\tilde{f}$ : That is, we assume we are given access to a unitary $U$ acting as Note that if we can classically efficiently evaluate $\tilde{f}$, then it is well known that we can construct such a unitary as a small quantum circuit (see [NC00, Sec. 1.4.1]).
The main idea is that, using one application of $U$, a phase gate corresponding to the output register, and another application of $U^\dagger$ to uncompute the function value, we can implement a phase oracle for $\tilde{f}$. Moreover, Equation (4) below will also hold for $\tilde{f}$, with a slightly worse right-hand side, since $\tilde{f}$ is close to $f$. A version of the following is proven in [GAW19, Theorem 21]; for completeness we sketch a proof.

Corollary 15 (Gradient computation using approximate function evaluation
for 99.9% of the points x ∈ G n m , and we have access to a standard unitary oracle , by Lemma 14 we can compute a vector v ∈ R n which is a coordinatewise 4 Mapproximator of r 3B g: r with probability at least 2 3 .Note that the above success probability is per coordinate of g.However, repeating the whole procedure O log( n ρ ) times and taking the median of the resulting vectors coordinatewise gives a gradient approximator g with the desired approximation quality with probability at least 1 − ρ.For the proof of the gate complexity we refer 6 to [GAW19, Theorem 21] where the complexity of Jordan's algorithm is analyzed in detail.
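The amplification step mentioned in this proof, taking a coordinatewise median over independent repetitions, can be sketched as follows. The helper name is an assumption made here; the number of repetitions needed for a given failure probability is not computed by this sketch.

```python
import numpy as np

def median_boost(estimates):
    """Coordinatewise median of several independent gradient estimates.

    estimates : sequence of length-n vectors, one per repetition.
    If each coordinate of each estimate is accurate with probability at
    least 2/3, the median is accurate unless at least half of the
    repetitions fail in that coordinate.
    """
    return np.median(np.asarray(estimates, dtype=float), axis=0)
```

For instance, given the three runs `[1.0, 2.0]`, `[1.1, 50.0]`, and `[-30.0, 2.1]`, each of which has at most one bad coordinate, the median is `[1.0, 2.1]`, so a single outlier per coordinate does not affect the output.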
Remark. With essentially the same approach, the above corollary of Jordan's quantum gradient computation algorithm can also be proven in the setting where our access to an approximation of $f$ is not given by a standard quantum oracle but by a relational quantum oracle; see Appendix A for both the definition of this type of approximation to $f$ and a proof of this corollary.
In terms of applications, we want to point out that if the membership oracle used in Section 4 comes from a deterministic algorithm, then we get a standard quantum oracle.Only when the membership oracle itself is relational (for example, when it is itself computed by a bounded-error quantum algorithm) do we need the more general setting of Appendix A.
We would like to apply the above corollary to compute gradients of a convex Lipschitz function.To that end, the function needs to be sufficiently close to a linear function on a small region.Fortunately convex Lipschitz functions have this property.The following two lemmas ensure that Equation (4) holds.Proof.Since f is convex and f (s) ≤ δ for all s ∈ S we immediately get that f (s ) ≤ δ for all s ∈ conv(S).Because f (0) = 0 and S = −S, due to convexity we get that f ) f (z) be the difference between f (z + y) and its linear approximator.Let S := {±r 2 e i : i ∈ [n]}.It is easy to see that d(0) = 0, 5 We can assume without loss of generality that the upper bound B is such that M is a power of two. 6The correspondence with the parametrization of [GAW19, Theorem 21] is ε S = −S, and conv(S) = B 1 (0, r 2 ).Also, for all s ∈ S we have |d(s)| ≤ r 2 2 ∆ (r 2 ) f (z)/2: Therefore Lemma 16 implies that sup y∈B 1 (0,r 2 ) |d(y We can now state the main result of this section, the quantum analogue of Lemma 12.
Lemma 18.Let r 1 > 0, L > 0, ρ ∈ (0, 1/3], and suppose δ ∈ (0, r ), and we have quantum query access 7 to f , which is a δ-approximate version of f , via a unitary U over a (fineenough) hypergrid of B ∞ (0, 2r 1 ).Then we can compute a g ∈ R n using O(log(n/ρ)) queries to U and U † , such that with probability ≥ 1 − ρ, we have and hence (by Cauchy-Schwarz) Proof.Let r 2 := δr 1 ρ nL and note that r 2 ≤ r 1 .The quantum algorithm works roughly as follows.It first picks a uniformly8 random z ∈ B ∞ (0, r 1 ).Then it uses Jordan's quantum algorithm to compute an approximate gradient at z by approximately evaluating f in superposition over a discrete hypergrid of B ∞ (z, r 2 /n).This then yields an approximate subgradient of f at 0.
We now work out this rough idea.Since Also as shown by Lemma 11 and Markov's inequality we have with probability ≥ 1 − ρ/2 over the choice of z.If z is such that Equation (6) holds, then we get 7 Using Corollary 29 instead of Corollary 15 shows that a relational quantum oracle also suffices as input.
Now apply the quantum algorithm of Corollary 15 with $r = 2r_2/n$, $c = f(z)$, $g = \nabla^{(r_2)} f(z)$, and $B = Lr$. This uses $O(\log(n/\rho))$ queries to $U$ and $U^\dagger$, and with probability $\ge 1 - \rho/2$ computes an approximate gradient g such that Also, if $z$ is such that Equation (6) holds, then by Lemma 10 we get that sup g∈∂f (z) and therefore by the triangle inequality and Equation (7) we get that Thus with probability at least $1 - \rho$, for all $y \in C$ and for all $g \in \partial f(z)$ we have that

Algorithms for separation using membership queries
Let $K \subseteq \mathbb{R}^n$ be a convex set such that $B(0, r) \subseteq K \subseteq B(0, R)$. Given a membership oracle⁹ $\mathrm{MEM}_{\varepsilon,0}(K)$ as in Definition 3, we will construct a separation oracle $\mathrm{SEP}_{\eta,\rho}(K)$ as in Definition 4. Let $x$ be the point we want to separate from $K$. We first make a membership query to $x$ itself, receiving answer $x \in B(K, \varepsilon)$ or $x \notin B(K, -\varepsilon)$. Suppose $x \notin B(K, -\varepsilon)$; then we need to find a hyperplane that approximately separates $x$ from $K$.
⁹ For simplicity we assume throughout this section that the membership oracle succeeds with certainty (i.e., its error probability is 0). This is easy to justify: suppose we have a classical $T$-query algorithm, which uses $\mathrm{MEM}_{\varepsilon,0}(K)$ queries and succeeds with probability at least $1 - \rho$. If we are given access to a $\mathrm{MEM}_{\varepsilon,1/3}(K)$ oracle instead, then we can create a $\mathrm{MEM}_{\varepsilon,\rho/T}(K)$ oracle by $O(\log(T/\rho))$ queries to $\mathrm{MEM}_{\varepsilon,1/3}(K)$ and taking the majority of the answers. Then running the original algorithm with $\mathrm{MEM}_{\varepsilon,\rho/T}(K)$ will fail with probability at most $2\rho$. Therefore the assumption of a membership oracle with error probability 0 can be removed at the expense of only a small logarithmic overhead in the number of queries. A similar argument works for the quantum case.
Due to the rotational symmetry of the separation problem, for ease of notation we assume that $x = -\|x\| e_n$.¹⁰ We define $h : \mathbb{R}^{n-1} \to \mathbb{R} \cup \{\infty\}$ as
$$h(y) := \inf_{(y, y_n) \in K} y_n.$$
Our $h$ is a bit different from the one used in [LSV18], but we can show that it has many of the same properties. Since $K$ is a convex set, $h$ is a convex function over $\mathbb{R}^{n-1}$. As we show below, the function $h$ is also Lipschitz (Lemma 19) and we can approximately compute its value using binary search with $\tilde{O}(1)$ classical queries to a membership oracle (Lemma 20). Furthermore, an approximate subgradient of $h$ at 0 allows us to construct a hyperplane approximately separating $x$ from $K$ (Lemma 21). Combined with the results of Section 3 this leads to the main results of this section, Theorems 22 and 23, which show how to efficiently construct a separation oracle using respectively classical and quantum queries to a membership oracle.
Analogously to [LSV18, Lemma 12] we first show that our h is Lipschitz.
We will restrict our attention to the line through y and y , i.e., the line given by y + λz for z := y −y y −y .Define the point on this line and note that p ∈ B(0, r).Since y lies between y and p on the line it is a convex combination of these two points.In particular, since p − y = r − δ, it is the convex combination Due to convexity we have Now we show how to compute the value of h using membership queries to K.

Lemma 20. For all y
Proof. Let $y \in B(0, r/2)$, then $(y, h(y))$ is a boundary point of $K$ by the definition of $h$. Note that $h(y) \in [-R, -r/2]$. Our goal is to perform binary search over this interval to find a good approximation of $h(y)$. If we had access to a perfect membership oracle, then this would be straightforward. However, since our membership oracle can give back a wrong answer when queried with a point that is $\varepsilon$-close to the boundary of $K$, a more careful analysis is needed.
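The straightforward case mentioned above, binary search with a perfect membership oracle, can be sketched as follows. The sketch assumes that $h(y)$ is the smallest last coordinate $y_n$ with $(y, y_n) \in K$ and that $h(y) \in [-R, -r/2]$; the function name and the oracle interface are hypothetical. It does not include the more careful error analysis needed for a weak oracle.

```python
import numpy as np

def approx_h(member, y, r, R, delta):
    """Binary search for h(y) = min{ t : (y, t) in K } with an exact oracle.

    member : callable taking a point in R^n and returning True iff it lies in K
    y      : point in R^(n-1) with ||y|| <= r/2
    Returns (estimate, number_of_queries) with h(y) <= estimate <= h(y) + delta.
    """
    lo, hi = -R, -r / 2.0          # (y, hi) lies in K; (y, lo) is at or below the boundary
    queries = 0
    while hi - lo > delta:
        mid = 0.5 * (lo + hi)
        queries += 1
        if member(np.append(y, mid)):
            hi = mid               # still inside K: the boundary is at or below mid
        else:
            lo = mid               # outside K: the boundary is above mid
    return hi, queries
```

The number of membership queries is logarithmic in $R/\delta$, because the search interval has length at most $R$ and halves with every query.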
Suppose y n ≤ − r 2 is our current guess for h(y).We first show that (a) if (y, y n ) ∈ B(K, ε), then y n ≥ h(y) − δ, and (b) if (y, y n ) ∈ B(K, −ε), then y n ≤ h(y) + 2 3 δ.For the proof of (a) consider a g ∈ ∂h(y).Since g is a subgradient we have that h(z) ≥ h(y) + g, z − y for all z ∈ R n−1 .Hence, for all z ∈ R n−1 and z n such that (z, z n ) ∈ K we have where the first inequality is a rewriting of the subgradient inequality and the second inequality uses that z n ≥ h(z) since (z, z n ) ∈ K. Since (y, y n ) ∈ B(K, ε) it follows from the above inequality that Lemma 19 together with the argument of Equation (1) implies that g ≤ 2R r .Since we obtain the inequality of (a).For (b), consider the convex set C which is the convex hull of B((y, 0), r/2) and (y, h(y)).Note that B(C, −ε) is the convex hull of B((y, 0), r/2−ε) and y, h(y Now we can analyze the binary search algorithm.By making O log R δ MEM ε,0 (K) queries to points of the form (y, z), we can find a value The following lemma shows how to convert an approximate subgradient of h to a hyperplane that approximately separates x from K. Lemma 21.Suppose − x e n = x / ∈ B(K, −ε), and g ∈ R n−1 is an approximate subgradient of h at 0, meaning that for some a, b ∈ R and for all y Proof.Let us introduce the notation z = (y, z n ) and s := (−g, 1) = (−g, 1) s, then where the last inequality used claim (b) from the proof of Lemma 20 with the point (0, − x ) and δ = 3R r ε.
We now construct a separation oracle using O(n) classical queries to a membership oracle.In particular, to construct an η-precise separation oracle, we require an ε-precise membership oracle with The analogous result in [LSV18, Theorem 14] uses the stronger assumption Compared to this, our result scales better in terms of n, r R and ρ.
By Lemma 19 we know that $h$ is $\frac{2R}{r}$-Lipschitz on $B(0, r/2)$. By Lemma 20 we can evaluate $h$ to within error $\delta$ using $O(\log(R/\delta))$ queries to a $\mathrm{MEM}_{\varepsilon,0}(K)$ oracle.

Let us choose r
. Hence by Lemma 12, using O n log R δ queries to a MEM ε,0 (K) oracle, we can compute an approximate subgradient g such that with probability at least 1 − ρ we have Substituting the value of r 1 and δ we get h(y) ≥ h(0) + g, y − η 2R y − η 3 , which by Lemma 21 gives an s such that s, z ≥ s, x − 5 6 η − 2R r ε ≥ s, x − η for all z ∈ K Finally, we give a proof of our main result: we construct a separation oracle using O(1) quantum queries to a membership oracle.

Lower bounds
For a convex set K satisfying B(0, r) ⊆ K ⊆ B(0, R), we have shown in Theorem 23 that one can implement a SEP(K) oracle with O(1) quantum queries to a MEM(K) oracle if the membership oracle is sufficiently precise.In this section we first show that this is exponentially better than what can be achieved using classical access to a membership oracle.We also investigate how many queries to a membership/separation oracle are needed in order to implement an optimization oracle.Our results are as follows.
• We show that Ω(n) classical queries to a membership oracle are needed to implement a weak separation oracle.
• We show that Ω(n) classical (resp. Ω(√n) quantum) queries to a separation oracle are needed to implement a weak optimization oracle, even when we know an interior point in the set.
• We show an Ω(n) lower bound on the number of classical and/or quantum queries to a separation oracle needed to optimize over the set when we do not know an interior point.
In this section we will always assume that the input oracle is a strong oracle but the output oracle is allowed to be a weak oracle with error ε. Furthermore, we will make sure that R, 1/r, and 1/ε are all upper bounded by a polynomial in n. This guarantees that the lower bound is based on the dimension of the problem, not the required precision.

Classical lower bound on the number of MEM queries needed for SEP
Here we show that a separation query can provide Ω(n) bits of information about the underlying convex set K; since a classical membership query returns a 0 or a 1 and hence can give at most 1 bit of information,¹² this theorem immediately implies a lower bound of Ω(n) on the number of classical membership queries needed to implement one separation query.
Theorem 24. Let $\varepsilon \le \frac{1}{48}$. There exist a set of $m = 2^{\Omega(n)}$ convex sets $K_1, \ldots, K_m$ and a point $y$ such that the result of a classical query to $\mathrm{SEP}_{\varepsilon,0}(K_i)$ with the point $y$ correctly identifies $i$.

Proof. Let $h_1, \ldots, h_m \in \mathbb{R}^n$ be a set of $m = 2^{\Omega(n)}$ entrywise non-negative unit vectors such that $\langle h_i, h_j\rangle \le 0.51$ for all distinct $i, j \in [m]$.¹³ Now pick an $i \in [m]$ and define $\tilde{K}_i := \{x : \langle h_i, x\rangle \le 0\} \cap B(0, \sqrt{n})$ and $K_i := B(\tilde{K}_i, \varepsilon)$. Then $\tilde{K}_i = B(K_i, -\varepsilon)$. Note that for $x_0 = -e/3$ we have $B(x_0, 1/3) \subseteq K_i \subseteq B(x_0, 2\sqrt{n})$. We claim that a query to $\mathrm{SEP}_{\varepsilon,0}(K_i)$ with the point $y = 3\varepsilon e \in \mathbb{R}^n$ will identify $h_i$. First note that $y \notin B(K_i, \varepsilon)$, since $\tilde{K}_i$ does not contain any entrywise positive vectors and $y$ has distance at least $3\varepsilon$ from all vectors that have at least one non-positive entry. Hence a separation query with $y$ must return a unit vector $g$ that describes a valid separating hyperplane for $K_i$.

On the other hand, if $g$ describes a valid separating hyperplane for $K_j$, then […] (8). Now consider the specific point $x$ that is the projection of $g$ onto $h_j^{\perp}$ (the hyperplane orthogonal to $h_j$) scaled by a factor $\sqrt{n}$, i.e., $x = \sqrt{n}\,(g - \langle g, h_j\rangle h_j)$. Since $\langle h_j, x\rangle = 0$ and $\|x\| \le \sqrt{n}$, we have $x \in \tilde{K}_j$. Choosing this $x$ in (8) gives the following inequality: […] $\ge \frac{19}{20}$. Since (8) holds for $j = i$, it follows that at least one of the two vectors $g - h_i$ and $g + h_i$ has length at most $\sqrt{2(1 - |\langle g, h_i\rangle|^2)} \le \sqrt{8\varepsilon}$; assume the former for simplicity. If (8) would also hold for some $j \ne i$, then we would get a contradiction: […] Hence $g$ uniquely identifies $h_i$.

Lower bound on number of SEP queries for OPT (given an interior point)
We now consider lower bounding the number of quantum queries to a separation oracle needed to do optimization. In fact, we prove a lower bound on the number of separation queries needed for validity, which implies the same bound on optimization. We will use a reduction from a version¹⁴ of the well-studied search problem: Given z ∈ {0, 1}ⁿ such that either |z| = 0 or |z| = 1, decide which of the two holds.

It is not hard to see that if the access to z is given via classical queries, then Ω(n) queries are needed. It is well known [BBBV97] that if we allow quantum queries, then Ω(√n) queries are needed (i.e., Grover's quantum search algorithm [Gro96] is optimal). We use this problem to show that there exist convex sets for which it is hard to construct a weak validity oracle, given a strong separation oracle. Since a separation oracle can be used as a membership oracle, this gives the same hardness result for constructing a weak validity oracle from a strong membership oracle.
Theorem 25. Let $0 < \rho \le 1/3$. Let $A$ be an algorithm that implements a $\mathrm{VAL}_{(5n)^{-1},\rho}(K)$ oracle for every convex set $K$ (with $B(x_0, r) \subseteq K \subseteq B(x_0, R)$) using only queries to a $\mathrm{SEP}_{0,0}(K)$ oracle, and unitaries that are independent of $K$. Then the following statements are true, even when we restrict to convex sets $K$ with $r = 1/3$ and $R = 2\sqrt{n}$:
• if the queries to $\mathrm{SEP}_{0,0}(K)$ are classical, then the algorithm uses $\Omega(n)$ queries.
• if the queries to $\mathrm{SEP}_{0,0}(K)$ are quantum, then the algorithm uses $\Omega(\sqrt{n})$ queries.
Proof. Let $z \in \{0, 1\}^n$ have Hamming weight $|z| = 0$ or $|z| = 1$. We construct a set $K_z$ in such a way that solving the weak validity problem solves the search problem for $z$, while separation queries for $K_z$ can be answered using a single query to $z$. The known classical and quantum lower bounds on the search problem then imply the two claims of the theorem, respectively. Define […]

We first show how to implement a strong separation oracle using a single query to $z$. Suppose the input is the point $y$. The strong separation oracle works as follows:
1. If $y \notin [-1, 1]^n$, then return a hyperplane that separates $y$ from $[-1, 1]^n$ (and hence from $K_z$).
2. […]
3. Otherwise, let $i$ be such that $y_i > 0$. Query $z_i$.
(a) If $z_i = 1$ and $i$ is the only index such that $y_i > 0$, then return that $y \in B(K_z, 0) = K_z$.
(b) If $z_i = 1$ and there is a $j \ne i$ such that $y_j > 0$, return the separating hyperplane corresponding to $x_j \le y_j$.
(c) If $z_i = 0$, then return the separating hyperplane $x_i \le y_i$.

We show that a validity query over $K_z$ with the direction c = 1 n and error $\varepsilon = \frac{1}{5n}$ solves the search problem:
• If $|z| = 0$, then for all points $x \in K_0$ we have $\langle c, x\rangle \le 0$. Thus, for all points $x \in B(K_0, \varepsilon)$ we have $\langle c, x\rangle \le \varepsilon < \gamma - \varepsilon$. Hence the validity oracle will have to return that $\langle c, x\rangle \le \gamma + \varepsilon$ holds for all $x \in B(K_0, -\varepsilon)$, since the other possible output is not true.
• […] Hence the validity oracle will have to return that $\langle c, x\rangle \ge \gamma - \varepsilon$ holds for some $x \in B(K_z, \varepsilon)$, since the other possible output is not true.
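The separation procedure in the proof above can be sketched in a few lines. This is only an illustration under stated assumptions: the definition of $K_z$ is not legible in this excerpt, so the sketch assumes $K_z = \prod_{i}[-1, z_i]$, which is consistent with the listed steps; step 2 is likewise assumed to be "if $y \in [-1, 0]^n$, report $y \in K_z$". The names `BitOracle`, `unit`, and `separate` are hypothetical.

```python
class BitOracle:
    """Query access to a hidden bit string z with |z| <= 1; counts queries."""
    def __init__(self, z):
        self.z = list(z)
        self.queries = 0

    def query(self, i):
        self.queries += 1
        return self.z[i]

def unit(n, j, sign):
    """The vector sign * e_j in R^n."""
    v = [0.0] * n
    v[j] = sign
    return v

def separate(y, oracle):
    """Return None if y is in K_z, else s with <s, x> < <s, y> for all x in K_z.

    Assumes K_z = prod_i [-1, z_i] (not stated in the excerpt).
    """
    n = len(y)
    # Step 1: y lies outside the cube [-1, 1]^n.
    for j, yj in enumerate(y):
        if abs(yj) > 1:
            return unit(n, j, 1.0 if yj > 0 else -1.0)
    positive = [j for j, yj in enumerate(y) if yj > 0]
    # Step 2 (assumed): no positive coordinate, so y is in [-1, 0]^n.
    if not positive:
        return None
    # Step 3: a single query to z decides the answer.
    i = positive[0]
    if oracle.query(i) == 1:
        if len(positive) == 1:
            return None                    # case (a)
        return unit(n, positive[1], 1.0)   # case (b): z_j = 0 since |z| <= 1
    return unit(n, i, 1.0)                 # case (c)

oracle = BitOracle([0, 0, 1, 0])
print(separate([-0.5, -0.5, 0.7, -0.2], oracle), oracle.queries)
```

Each call makes at most one query to `z`, which is the property the reduction relies on.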

Lower bound on number of SEP queries for OPT (without interior point)
We now lower bound the number of quantum queries to a separation oracle needed to solve the optimization problem, if our algorithm does not already know an interior point of $K$. In fact we prove a lower bound on finding a point close to $K$ using separation queries, which implies the lower bound on the number of separation queries needed for optimization since OPT returns a point close to the set $K$.

We prove our lower bound by a reduction to the problem of learning $z$ with first-difference queries. Here one needs to find an initially unknown $n$-bit binary string $z$ via a guessing game. For a given guess $g \in \{0, 1\}^n$ a query returns the first index in $[n]$ for which the binary strings $z$ and $g$ differ (or it returns $n + 1$ if $z = g$). The goal is to recover $z$ with as few guesses as possible. First we prove an $\Omega(n)$ quantum query lower bound for this problem.¹⁵

Theorem 26 (Quantum lower bound for learning $z$ with first-difference queries). Let $z \in \{0, 1\}^n$ be an unknown string accessible by an oracle acting as $O_z |g, b\rangle = |g, b \oplus f(g, z)\rangle$, where $f(g, z)$ is the first index for which $z$ and $g$ differ, more precisely $f(g, z) = \min\{i \in [n] : g_i \ne z_i\}$ if $g \ne z$ and $f(g, z) = n + 1$ otherwise. Then every quantum algorithm that outputs $z$ with high probability uses at least $\Omega(n)$ queries to $O_z$.
Proof. We will use the general adversary bound [HLŠ07]. For this problem, we call $\Gamma \in \mathbb{R}^{2^n \times 2^n}$ an adversary matrix if it is a non-zero matrix with zero diagonal whose rows and columns are indexed by all $z \in \{0, 1\}^n$. For $g \in \{0, 1\}^n$ let us define $\Delta_g \in \{0, 1\}^{2^n \times 2^n}$ such that the $[z, z']$ entry of $\Delta_g$ is 0 if and only if $f(g, z) = f(g, z')$. The general adversary bound tells us that for any adversary matrix $\Gamma$, the quantum query complexity of our problem is
$$\Omega\left(\frac{\|\Gamma\|}{\max_{g \in \{0,1\}^n} \|\Gamma \circ \Delta_g\|}\right), \tag{9}$$
where "$\circ$" denotes the Hadamard product and $\|\cdot\|$ the operator norm. We claim that Equation (9) gives a lower bound of $\Omega(n)$ for the adversary matrix $\Gamma$ defined as […]

It is easy to see that $\Gamma$ is indeed an adversary matrix since it is zero on the diagonal and non-zero everywhere else. Furthermore, the all-one vector $e$ is an eigenvector of $\Gamma$ with eigenvalue $n 2^n$: […] So $\Gamma e = n 2^n e$ and hence $\|\Gamma\| \ge n 2^n$. From the definition of $\Delta_g$ it follows that […] (10), where $\chi_{[f(g,z) \ne f(g,z')]}$ stands for the indicator function of the condition $f(g, z) \ne f(g, z')$.

Let $\Gamma_g := \Gamma \circ \Delta_g$. We will show an upper bound on $\|\Gamma_g\|$. We decompose $\Gamma_g$ in an "upper-triangular" and a "lower-triangular" part: […] Hence by the triangle inequality we have […] (11). It thus suffices to upper bound $\|\Gamma^U_g\|$. Notice that as (10) shows, $\Gamma^U_g[z, z']$ only depends on the values $f(g, z)$, $f(g, z')$. Since the range of $f(g, \cdot)$ is $[n + 1]$, we can think of $\Gamma^U_g$ as an $(n + 1) \times (n + 1)$ block-matrix, where the blocks are determined by the values of $f(g, z)$ and $f(g, z')$, and within a block all matrix elements are the same. Also observe that for all $k \in [n]$ there are $2^{n-k}$ bitstrings $y \in \{0, 1\}^n$ such that $f(g, y) = k$, which tells us the sizes of the blocks are $2^{n-k} \times 2^{n-k}$. Motivated by these observations we define an orthonormal set of vectors in $\mathbb{R}^{2^n}$ by $v_{n+1} := e_g$, and for all $k \in [n]$ […]

Since the row and column spaces of $\Gamma^U_g$ are spanned by $\{v_k : k \in [n + 1]\}$, we can reduce $\Gamma^U_g$ to an $(n + 1) \times (n + 1)$-dimensional matrix $G$: […] It follows from the above identity, together with the orthonormality of $\{v_1, \ldots, v_n, v_{n+1}\}$, that $G \in \mathbb{R}^{(n+1) \times (n+1)}$ is a strictly upper-triangular matrix, with the following entries for $k, \ell \in [n]$: […] By Equation (10) this is further equal to […].

[…] This $G_d$ is only non-zero on one non-main diagonal (namely the $(k, \ell)$-entries where $d = \ell - k$), and its non-zero entries are all upper bounded by $\sqrt{2} \cdot 2^n \cdot 2^{-d/2}$. We have $G = \sum_{d=1}^{n} G_d$ and therefore […] (13). Inequalities (11)-(13) give that $\|\Gamma_g\| \le 2^{n+3}$ and hence (9) yields a lower bound of $\Omega\left(\frac{n 2^n}{2^{n+3}}\right) = \Omega(n)$ on the number of quantum queries to $O_z$ needed to learn $z$.
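For concreteness, the first-difference query $f(g, z)$ from Theorem 26 can be written out directly. The sketch below also includes a simple classical learner (an illustration of our own, not taken from the paper) that recovers $z$ with at most $n + 1$ queries by flipping the reported bit after each answer; the names `first_difference` and `learn_string` are hypothetical.

```python
def first_difference(g, z):
    """f(g, z): the first 1-based index where g and z differ, or n + 1 if g == z."""
    n = len(z)
    for i in range(n):
        if g[i] != z[i]:
            return i + 1
    return n + 1

def learn_string(n, query):
    """Recover z using at most n + 1 first-difference queries.

    After flipping the reported bit, the guess agrees with z on a strictly
    longer prefix, so the reported index increases with every query.
    """
    g = [0] * n
    queries = 0
    while True:
        i = query(g)
        queries += 1
        if i == n + 1:
            return g, queries
        g[i - 1] ^= 1

z = [1, 0, 1, 1]
guess, used = learn_string(len(z), lambda g: first_difference(g, z))
print(guess, used)
```

Since this classical strategy never needs more than $n + 1$ queries, the $\Omega(n)$ bound of Theorem 26 cannot be improved by more than a constant factor.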
Theorem 27. Finding a point in $B_\infty(K, 1/7)$ for an unknown convex set $K$ such that $K \subseteq B_\infty(0, 2) \subseteq \mathbb{R}^n$ requires $\Omega(n)$ quantum queries to a separation oracle $\mathrm{SEP}_{0,0}(K)$, even if we are promised there exists some unknown $x \in \mathbb{R}^n$ such that $B_\infty(x, 1/3) \subseteq K$.

Proof. We will prove an $\Omega(n)$ quantum query lower bound for this problem by a reduction from learning with first-difference queries. Let $z \in \{0, 1\}^n$ be an unknown binary string, and let us define $K_z := B_\infty(z, 1/3) \subset \mathbb{R}^n$ as a small box around the corner of the hypercube corresponding to $z$. Then clearly $K_z \subset B_\infty(0, 2)$, and finding a point close enough to $K_z$ is enough to recover $z$.

We can easily reduce a separation oracle query to a first-difference query to $z$, as follows. Suppose $y$ is the vector for which we need to answer a SEP query:
1. If $y$ is outside $[-1/3, 4/3]^n$, then output a hyperplane separating $y$ from $[-1/3, 4/3]^n$.
2. If $y$ is in $[-1/3, 4/3]^n$, then let $g$ be the nearest corner of the hypercube.
3. Let $i$ be the result of a first-difference query to $z$ with $g$.
(a) If $i = n + 1$, indicating that $z = g$, then we know $K_z$ exactly, so we can find a separating hyperplane or conclude that $y \in K_z$.
(b) If $z \ne g$, then return $e_i$ if $g_i = 1$, and $-e_i$ if $g_i = 0$.
Hence our Ω(n) quantum lower bound on learning z with first-difference queries implies an Ω(n) lower bound on the number of quantum queries to a separation oracle needed for finding a point close to a convex set.
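The reduction in the proof of Theorem 27 can be sketched as follows. This is a minimal illustration: ties when rounding to the nearest corner are broken toward 1 (an arbitrary choice not specified in the text), hyperplanes are represented by a vector $s$ with $\langle s, x\rangle < \langle s, y\rangle$ for all $x \in K_z$, and the names `unit`, `first_difference`, and `separate_box` are hypothetical.

```python
def unit(n, j, sign):
    """The vector sign * e_j in R^n."""
    v = [0.0] * n
    v[j] = sign
    return v

def first_difference(g, z):
    """First 1-based index where g and z differ, or n + 1 if they are equal."""
    for i in range(len(z)):
        if g[i] != z[i]:
            return i + 1
    return len(z) + 1

def separate_box(y, fd_query):
    """Separation oracle for K_z = B_inf(z, 1/3) using one first-difference query.

    Returns None if y is in K_z, otherwise a separating direction s.
    """
    n = len(y)
    # Step 1: y outside [-1/3, 4/3]^n, no query needed.
    for j, yj in enumerate(y):
        if yj < -1 / 3:
            return unit(n, j, -1.0)
        if yj > 4 / 3:
            return unit(n, j, 1.0)
    # Step 2: nearest hypercube corner (ties broken toward 1, an assumption).
    g = [1 if yj >= 0.5 else 0 for yj in y]
    # Step 3: one first-difference query.
    i = fd_query(g)
    if i == n + 1:
        # (a) z = g, so K_z is known exactly.
        for j, yj in enumerate(y):
            if yj > g[j] + 1 / 3:
                return unit(n, j, 1.0)
            if yj < g[j] - 1 / 3:
                return unit(n, j, -1.0)
        return None
    # (b) z differs from g in coordinate i.
    return unit(n, i - 1, 1.0 if g[i - 1] == 1 else -1.0)

z = [1, 0, 1]
print(separate_box([0.9, 0.8, 1.0], lambda g: first_difference(g, z)))
```

In case (b) the returned direction is valid because $y_i$ is within $1/2$ of $g_i$ while every $x \in K_z$ has $x_i$ within $1/3$ of $z_i \ne g_i$.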
Since optimization over a set K gives a point close to the set K, this also implies a lower bound on the number of separation queries needed for optimization. This theorem is tight up to logarithmic factors, since it is known that Õ(n) classical separation queries suffice for optimization, even without knowing a point in the convex set [LSW15]. Finally we remark that, due to our improved algorithm for optimization using validity queries (by combining Section 6 with Theorem 23), this also gives an Ω(n) lower bound on the number of separation queries needed to implement validity.¹⁶

Consequences of convex polarity
Here we justify the central symmetry of Figure 1 using the results of Grötschel, Lovász, and Schrijver [GLS88, Section 4.4]. We first need to recall the definition and some basic properties of the polar $K^*$ of a set $K \subseteq \mathbb{R}^n$. This is the closed convex set defined as follows:¹⁷
$$K^* := \{y \in \mathbb{R}^n : \langle y, x\rangle \le 1 \text{ for all } x \in K\}.$$
For the remainder of this section we assume that $K$ is a closed convex set such that $B(0, r) \subseteq K \subseteq B(0, R)$.

We will observe that for the polar $K^*$ of a set $K$ the following holds:
$$\mathrm{MEM}(K^*) \leftrightarrow \mathrm{VAL}(K), \qquad \mathrm{SEP}(K^*) \leftrightarrow \mathrm{VIOL}(K), \tag{14}$$
where $\mathrm{MEM}(K^*) \leftrightarrow \mathrm{VAL}(K)$ means we can implement a weak validity oracle for $K$ using a single query to a weak membership oracle for $K^*$, and vice versa. Since $\mathrm{VIOL}(K)$ and $\mathrm{OPT}(K)$ are equivalent up to reductions that use $\tilde{\Theta}(1)$ queries (via binary search), this justifies the central symmetry of Figure 1, because it shows that algorithms that implement $\mathrm{VIOL}(K)$ given $\mathrm{VAL}(K)$ are equivalent to algorithms that implement $\mathrm{SEP}(K^*)$ given $\mathrm{MEM}(K^*)$, and similarly algorithms that implement $\mathrm{SEP}(K)$ given $\mathrm{VIOL}(K)$ are equivalent to algorithms that implement $\mathrm{VIOL}(K^*)$ given $\mathrm{SEP}(K^*)$.
Grötschel, Lovász, and Schrijver [GLS88, Section 4.4] showed that the weak membership problem for K * can be solved using a single query to a weak validity oracle for K, and that the weak separation problem for K * can be solved using a single query to a weak violation oracle for K. Using similar arguments one can show the reverse directions as well, which justifies (14).Here we only motivate the equivalences between the above-mentioned weak oracles by showing the equivalence of the strong oracles (i.e., where ρ and ε are 0).
Strong membership on $K^*$ is equivalent to strong validity on $K$. First, for a given vector $c \in \mathbb{R}^n$ and a $\gamma > 0$ observe the following:
$$\langle c, x\rangle \le \gamma \ \text{ for all } x \in K \iff \langle c/\gamma, x\rangle \le 1 \ \text{ for all } x \in K \iff c/\gamma \in K^*.$$
Hence, a strong membership query to $K^*$ with a point $c$ can be implemented by querying a strong validity oracle for $K$ with the vector $c$ and the value 1. Likewise, a strong validity query to $K$ with a point $c$ and value¹⁸ $\gamma > 0$ can be implemented using a strong membership query to $K^*$ with $c/\gamma$.
Strong separation on $K^*$ is equivalent to strong violation on $K$. To implement a strong separation query on $K^*$ for a vector $y \in \mathbb{R}^n$ we do the following. Query the strong violation oracle for $K$ with $y$ and the value 1. If the answer is that $\langle y, x\rangle \le 1$ for all $x \in K$, then $y \in K^*$. If instead we are given a vector $x \in K$ with $\langle y, x\rangle \ge 1$, then $x$ separates $y$ from $K^*$ (indeed, for all $z \in K^*$, we have $\langle z, x\rangle \le 1 \le \langle y, x\rangle$).

For the reverse direction, to implement a strong violation oracle for $K$ on the vector $c$ and value¹⁸ $\gamma > 0$ we do the following. Query the strong separation oracle for $K^*$ with the point $c/\gamma$. If the answer is that $c/\gamma \in K^*$ then $\langle c, x\rangle \le \gamma$ for all $x \in K$. If instead we are given a non-zero vector $y \in \mathbb{R}^n$ that satisfies $\langle c/\gamma, y\rangle \ge \langle z, y\rangle$ for all $z \in K^*$, then $\tilde{y} = y / \langle c/\gamma, y\rangle$ will be a valid answer for the strong violation oracle for $K$. Indeed, we have $\tilde{y} \in K$ because $\langle z, \tilde{y}\rangle \le 1$ for all $z \in K^*$ and $K = (K^*)^*$, and by construction $\langle c, \tilde{y}\rangle = \gamma$.
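The equivalence between strong membership on $K^*$ and strong validity on $K$ can be checked on a small example. The sketch below assumes $K$ is a polytope given by its vertices, so that a linear function is maximized at a vertex; this representation and the names `validity`, `polar_membership`, and `validity_via_polar` are illustrative choices, not part of the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def validity(vertices, c, gamma):
    """Strong validity for K = conv(vertices): is <c, x> <= gamma for all x in K?"""
    return max(dot(c, v) for v in vertices) <= gamma

def polar_membership(vertices, y):
    """Membership of y in K*, implemented by one validity query with value 1."""
    return validity(vertices, y, 1.0)

def validity_via_polar(polar_member, c, gamma):
    """Validity for K, implemented by one membership query to K* at c / gamma."""
    assert gamma > 0
    return polar_member([ci / gamma for ci in c])

# K = [-1, 1]^2; its polar is the cross-polytope {y : |y_1| + |y_2| <= 1}.
square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
cross_polytope = lambda y: abs(y[0]) + abs(y[1]) <= 1.0

print(polar_membership(square, [0.5, 0.4]), cross_polytope([0.5, 0.4]))
print(validity_via_polar(cross_polytope, [1.0, 1.0], 2.0), validity(square, [1.0, 1.0], 2.0))
```

Each direction uses exactly one query to the other oracle, mirroring the single-query reductions described above.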

Discussion and future work
We mention several open problems for future work:
• Our current implementation of an optimization query using Õ(n) quantum membership queries is quadratically better than the best known classical randomized algorithm, which uses roughly n² membership queries. However, to the best of our knowledge it is open whether this quadratic classical bound is optimal (a quadratic classical lower bound is known for deterministic algorithms [Yao75]).
• Can we improve our Ω(√n) lower bound on the number of separation (or membership) queries needed to implement an optimization oracle when our algorithm knows a point in K? We conjecture that the correct bound is Θ(n), in which case knowing a point in K does not confer much benefit for query complexity.
• Are there interesting convex optimization problems where separation is much harder than membership for classical computers?¹⁹ Such problems would be good candidates for quantum speed-up in optimization in the real, non-oracle setting of time complexity. It is known that given a deterministic algorithm for function evaluation, an algorithm with roughly the same complexity can be constructed to compute the gradient of that function [GW08]. Hence for strong, deterministic oracles, separation is not much harder than membership queries. This, however, still leaves weak / randomized / quantum membership oracles to be considered.
• The algorithms that give an Õ(n) upper bound on the number of separation queries for optimization (for example [LSW15, Theorem 42]) give the best theoretical results for many convex optimization problems. However, due to the large constants in these algorithms they are rarely used in a practical setting. A natural question is whether the algorithms used in practice lend themselves to quantum speed-ups. Very recent work by Kerenidis and Prakash [KP18] on quantum interior point methods is a first step in this direction.
A Quantum gradient computation using relational oracles

In this appendix we extend the result of Corollary 15 to functions given by a relational input oracle. As a direct consequence this shows that the algorithm from Theorem 23 also works when the input is given as a relational membership oracle instead of a standard oracle.
Definition 28 (Unitary $\delta$-approximator). Let $X$ be a finite set and let $Y$ denote a set of fixed-point $b$-bit numbers. Let $f : X \to Y$ be a function. We say that a relational quantum oracle $U$ on $X$ is a $b$-bit unitary $\delta$-approximator of $f$ if the valid answers for each $x \in X$ differ at most $\delta$ from $f(x)$ (i.e., $F(x) = \{y \in Y : |f(x) - y| \le \delta\}$), and the success probability is at least $\frac{2}{3}$.

Corollary 29 (Gradient computation using a unitary $\delta$-approximator). Let $\delta, B, r, c \in \mathbb{R}$, $\rho \in (0, 1/3]$. Let $x_0, g \in \mathbb{R}^n$ with $\|g\|_\infty \le \frac{B}{r}$. Let $m := \log_2\frac{B}{28\pi\delta}$ and suppose $f : (x_0 + rG^n_m) \to \mathbb{R}$ is such that $|f(x_0 + rx) - \langle g, rx\rangle - c| \le \delta$ for 99.9% of the points $x \in G^n_m$, and we have access to $U$, an $O\left(\log\frac{B}{\delta}\right)$-bit unitary $\delta$-approximator of $f$ over the domain $(x_0 + rG^n_m)$. Then we can compute a vector $\tilde{g} \in \mathbb{R}^n$ such that
$$\Pr\left[\|\tilde{g} - g\|_\infty > \frac{8 \cdot 42\pi\delta}{r}\right] \le \rho,$$
with $O\left(\log\frac{n}{\rho}\right)$ queries to $U$ and $U^\dagger$ and gate complexity $O\left(n \log\frac{n}{\rho} \log\frac{B}{\delta} \log\log\frac{n}{\rho} \log\log\frac{B}{\delta}\right)$.

Proof. The algorithm is the same as in the less general Corollary 15 presented in Section 3.2, we just need to analyze it a bit more carefully. The main idea is still to implement an approximate version of the phase oracle $O : |x, 0, 0\rangle \mapsto e^{2\pi i \frac{M}{3B} f(x_0 + rx)} |x, 0, 0\rangle$, and then use Jordan's gradient computation algorithm. We approximate $O$ by first approximately computing $f$ using $U$, then applying²⁰ a controlled phase operation $cP$ acting as $cP : |y\rangle \mapsto e^{2\pi i \frac{M}{3B} y} |y\rangle$ (where $M = \frac{3B}{84\pi\delta}$ as in the proof of Corollary 15), and finally applying $U^\dagger$ to approximately uncompute $f$. We can assume without loss of generality that our unitary $\delta$-approximator is such that the probability of $|f(x) - y| > \delta$ is at most $\frac{1}{1200}$. If this is not the case, we can improve the success probability by querying $U$ a few times and taking the median of the results.

Let us define $F(x) := \{y \in Y : |f(x) - y| \le \delta\}$ as in Definition 28. Observe that
$$\left\| O|x, 0, 0\rangle - U^\dagger (I \otimes cP \otimes I) U |x, 0, 0\rangle \right\|^2 = \left\| \left(I \otimes \left(e^{2\pi i \frac{M}{3B} f(x_0 + rx)} I - cP\right) \otimes I\right) U |x, 0, 0\rangle \right\|^2 = \Big\| \sum_{y \in Y} \left(e^{2\pi i \frac{M}{3B} f(x_0 + rx)} - e^{2\pi i \frac{M}{3B} y}\right) \alpha_{x,y} |x, y, \psi_{x,y}\rangle \Big\|^2.$$
We bound the above quantity in two parts using the triangle inequality as follows:
$$\Big\| \sum_{y \in Y \setminus F(x)} \left(e^{2\pi i \frac{M}{3B} f(x_0 + rx)} - e^{2\pi i \frac{M}{3B} y}\right) \alpha_{x,y} |x, y, \psi_{x,y}\rangle \Big\|^2 \le \sum_{y \in Y \setminus F(x)} |2\alpha_{x,y}|^2 \le \frac{1}{300};$$
$$\Big\| \sum_{y \in F(x)} \left(e^{2\pi i \frac{M}{3B} f(x_0 + rx)} - e^{2\pi i \frac{M}{3B} y}\right) \alpha_{x,y} |x, y, \psi_{x,y}\rangle \Big\|^2 \le \text{[…]}$$
Thus for all $x \in G^n_m$ we have that $\left\| O|x, 0, 0\rangle - U^\dagger (I \otimes cP \otimes I) U |x, 0, 0\rangle \right\|$ […] (15).

We can assume without loss of generality that our approximate phase oracle does not change the value of the input register. Otherwise we can just copy $|x\rangle$ to another register, then apply our approximate phase oracle on the second copy, then (approximately) erase the second copy of $|x\rangle$ using mod 2 bitwise addition with the first copy. Under this assumption by (15) we get that
$$\left\| O|\psi\rangle - U^\dagger (I \otimes cP \otimes I) U |\psi\rangle \right\| < \frac{1}{16}, \quad \text{for any quantum state } |\psi\rangle = \sum_{x \in G^n_m} \alpha_x |x, 0, 0\rangle. \tag{16}$$
From now on the proof is the same as the proof of Corollary 15. In that proof we showed that if we use the phase oracle $O$ in Jordan's gradient computation algorithm, then we would get a gradient estimate where each individual coordinate has the required approximation quality with probability at least $\frac{2}{3}$. Equation (16) implies that if instead we use our approximate implementation of the phase oracle, $U^\dagger (I \otimes cP \otimes I) U$, then the outcome probability distribution changes by at most $\frac{1}{16}$ in total variation distance. So one run of Jordan's algorithm using this approximate phase oracle still outputs a vector $v \in \mathbb{R}^n$ such that […] As in the proof of Corollary 15, repeating the whole procedure $O\left(\log\frac{n}{\rho}\right)$ times, and taking the median of the resulting vectors coordinatewise, gives a gradient approximator $\tilde{g}$ of the desired quality. The gate complexity analysis follows from [GAW19, Theorem 21], noting that each controlled phase operation $cP$ can be implemented using $O\left(\log\frac{B}{\delta}\right)$ single-qubit phase gates.

Lemma 16. Let $S \subseteq \mathbb{R}^n$ be such that $S = -S$, and let $\mathrm{conv}(S)$ denote the convex hull of $S$. If $f : \mathrm{conv}(S) \to \mathbb{R}$ is a convex function, $f(0) = 0$, and $|f(s)| \le \delta$ for all $s \in S$, then $|f(s')| \le \delta$ for all $s' \in \mathrm{conv}(S)$.

Figure 2: Graphical example of the relation between $h(y)$ and the distance from $(y, 0)$ to the border in the $-e_n$ direction.

²⁰ If $y$ is a $b$-bit fixed-point binary number, then this can be implemented using $b$ single-qubit phase gates as follows: we can assume without loss of generality that $y = a_0 + a \cdot \sum_{j=1}^{b} y_j 2^j$ for some fixed $a_0, a \in \mathbb{R}$. Then $e^{2\pi i \frac{M}{3B} y} = e^{2\pi i \frac{M}{3B} a_0} \prod_{j=1}^{b} e^{2\pi i \frac{M}{3B} a y_j 2^j}$. The global phase is irrelevant, and the other phase factors can be implemented by using $b$ single-qubit phase gates, each acting as $|y_j\rangle \mapsto e^{2\pi i \frac{M}{3B} a y_j 2^j} |y_j\rangle$.
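The factorization of the controlled phase into single-bit phases, described in footnote 20, can be verified numerically. In this sketch `theta` stands in for the constant $\frac{M}{3B}$, and the function names and the sample values of `a0`, `a`, and the bits are hypothetical choices for illustration only.

```python
import cmath

def full_phase(theta, a0, a, bits):
    """e^{2 pi i theta y} for y = a0 + a * sum_j bits[j] * 2^j (j starting at 1)."""
    y = a0 + a * sum(bit * 2 ** j for j, bit in enumerate(bits, start=1))
    return cmath.exp(2j * cmath.pi * theta * y)

def product_of_bit_phases(theta, a0, a, bits):
    """The same phase, built as a global phase times one factor per bit."""
    phase = cmath.exp(2j * cmath.pi * theta * a0)   # global phase (irrelevant in a circuit)
    for j, bit in enumerate(bits, start=1):
        # Single-qubit phase gate on bit j: identity if the bit is 0.
        phase *= cmath.exp(2j * cmath.pi * theta * a * bit * 2 ** j)
    return phase

bits = [1, 0, 1, 1, 0]
lhs = full_phase(0.37, -1.0, 2 ** -6, bits)
rhs = product_of_bit_phases(0.37, -1.0, 2 ** -6, bits)
print(abs(lhs - rhs))
```

The two values agree up to floating-point error, since the exponent is linear in the bits of $y$.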
As described above the corollary, we first implement a phase oracle for f and then we apply Jordan's gradient computation algorithm (Lemma 14).With a single query to U and its inverse we can implement a phase oracle O that acts as O : |x → e 2πi M 3B f (x 0 +rx) |x , where M := 3B 84πδ , and 5 with O log n ρ queries to U and U † and gate complexity O n log n ρ log B δ loglog n ρ loglog B δ .Proof.