Quantum Speedup Based on Classical Decision Trees

Lin and Lin [LL16] have recently shown how, starting with a classical query algorithm (decision tree) for a function, we may find upper bounds on its quantum query complexity. More precisely, they have shown that given a decision tree for a function f : {0,1}^n → [m] whose input can be accessed via queries to its bits, and a guessing algorithm that predicts answers to the queries, there is a quantum query algorithm for f which makes at most O(√(GT)) quantum queries, where T is the depth of the decision tree and G is the maximum number of mistakes of the guessing algorithm. In this paper we give a simple proof of this result and generalize it to functions f : [ℓ]^n → [m] with non-binary input as well as output alphabets. Our main tools for this generalization are the non-binary span program, which has recently been developed for non-binary functions, and the dual adversary bound. As applications of our main result we present several quantum query upper bounds, some of which are new. In particular, we show that topological sorting of the vertices of a directed graph G can be done with O(n^{3/2}) quantum queries in the adjacency matrix model. Also, we show that the quantum query complexity of maximum bipartite matching is upper bounded by O(n^{3/4}√(m+n)) in the adjacency list model.


Introduction
The query complexity of a function f : [ℓ]^n → [m] is the minimum number of adaptive queries to the coordinates of its input required to compute the output of the function. In a quantum query algorithm, queries may be made in superposition, which sometimes improves the query complexity, e.g., in Grover's search algorithm [Gro].
Lin and Lin [LL16] have recently shown that, surprisingly, classical query algorithms may sometimes yield improved quantum query algorithms. They showed that given a classical query algorithm with query complexity T for some function f : {0,1}^n → [m], together with a guessing algorithm that at each step predicts the value of the queried bit and makes no more than G mistakes, the quantum query complexity of f is at most Q(f) = O(√(GT)). For instance, the trivial classical algorithm for the search problem, which queries the input bits one by one, has query complexity T = n, and the guessing algorithm which always predicts the value 0 makes at most G = 1 mistake (because making a mistake is equivalent to finding an input bit 1, which solves the search problem). Thus the quantum query complexity of the search problem is O(√(GT)) = O(√n), recovering Grover's result. There are two proofs of the above result in [LL16]. One of the proofs is based on the notion of bomb query complexity B(f). Lin and Lin show that there exists a bomb query algorithm that computes f using O(GT) queries, and that the bomb query complexity equals the square of the quantum query complexity, i.e., B(f) = Θ(Q(f)^2), which together give Q(f) = O(√(GT)). In the second proof, they design a quantum query algorithm with query complexity O(√(GT)) for f using Grover's search; in computing the function they use the predicted values of the queries instead of the real values, and use a modified version of Grover's search to find the mistakes of the guessing algorithm.
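To make the parameters concrete, here is a small simulation (our own illustration, not code from [LL16]) of the trivial classical search algorithm together with the all-zeros guesser; T counts the queries and G the wrong guesses.

```python
def search_with_guesser(x):
    """Scan the bits of x one by one, always guessing that the answer is 0.
    Returns (index of a 1 or None, #queries made, #wrong guesses)."""
    queries = mistakes = 0
    for j, bit in enumerate(x):
        queries += 1              # one classical query to x_j
        if bit != 0:              # the guess "0" was wrong: a mistake
            mistakes += 1
            return j, queries, mistakes   # any 1 solves the search problem
    return None, queries, mistakes

# Worst case over inputs: T = n queries and G <= 1 mistakes, so the
# Lin-Lin bound gives a quantum algorithm with O(sqrt(G*T)) = O(sqrt(n)) queries.
```

Here the guesser is the "always 0" predictor from the text; any input with a 1 forces exactly one mistake before the algorithm stops.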
Our results: In this paper we give a simple proof of the above result based on the method of non-binary span programs that has recently been developed by the authors [BT19]. Then, inspired by this proof, we generalize Lin and Lin's result to functions f : [ℓ]^n → [m] with non-binary input as well as non-binary output alphabets. Our proof of this generalization is based on the dual adversary bound, which is another equivalent characterization of the quantum query complexity [LMR+11].
As an application of our main result we show that given query access to the edges of a directed acyclic graph G in the adjacency matrix model, the vertices of G can be sorted with O(n^{3/2}) quantum queries to its edges. We also show that given a directed graph G and a vertex v ∈ V(G), the quantum query complexity of determining the length of the shortest directed cycle in G containing v is Θ(n^{3/2}). Moreover, we show that given an undirected graph G, a vertex v and some constant k > 0, the quantum query complexity of deciding whether G has a cycle of length k containing the vertex v is O(n^{3/2}). Furthermore, we show that some existing results on the quantum query complexity of graph-theoretic problems, such as directed st-connectivity, detecting bipartite graphs, finding strongly connected components, and deciding forests, can easily be derived from our results.
Our main result is also useful when dealing with graph problems in the adjacency list model. In this regard, we show that given query access to the adjacency list of an unweighted bipartite graph G, the quantum query complexity of finding a maximum bipartite matching in G is O(n^{3/4}√(m+n)), where m is the number of edges of the graph. To the authors' knowledge this is the first non-trivial upper bound for this problem.

Preliminaries
In this section we review the notions of the dual adversary bound and the non-binary span program that will be used in the proof of our main result. In this paper we use Dirac's ket-bra notation, so |v⟩ is a complex (column) vector whose conjugate transpose is denoted by ⟨v|. Then ⟨v|w⟩ is the inner product of the vectors |v⟩, |w⟩. For a matrix A, we denote by ‖A‖ the operator norm of A, i.e., the maximum singular value of A. We use [ℓ] to denote the ℓ-element set {0, . . . , ℓ − 1}. We also use the Kronecker delta symbol δ_{a,b}, which equals 1 if a = b and equals 0 otherwise.

Query algorithms
In the query model we deal with the problem of computing a function f : D_f → [m] with domain D_f ⊆ [ℓ]^n by querying coordinates of the input x = (x_1, . . . , x_n) ∈ D_f. In the classical setting a query algorithm asks the value of some coordinate of the input and, based on the answer to that query, decides what to do next: either ask another query or output the result. Such an algorithm can be modeled by a decision tree whose internal vertices are associated with queries, i.e., indices 1 ≤ j ≤ n, and whose edges correspond to answers to queries, i.e., elements of [ℓ]. At each vertex the algorithm queries the associated index, and then moves to the next vertex via the edge whose label equals the answer to that query. The algorithm ends once we reach a leaf of the tree; the leaves are labeled by elements of [m], the output set of the function. The query complexity of the algorithm is the maximum number of queries over all x ∈ D_f, which equals the height of the decision tree. A randomized classical query algorithm can similarly be modeled by a collection of decision trees, one of which is chosen at random.
In contrast, in quantum query algorithms a query can be made in superposition. Such a query to an input x can be modeled by the unitary operator O_x : |j, b⟩ ↦ |j, b + x_j mod ℓ⟩, where the first register contains the query index 1 ≤ j ≤ n, and the second register saves the value of x_j in a reversible manner. Therefore, a quantum query algorithm for computing f(x) is an alternation of the unitaries O_x and some U_i's that are independent of x (but depend on f itself). Indeed, a quantum query algorithm consists of a sequence of unitaries followed by a measurement which determines the outcome of the algorithm. We say that an algorithm computes f if for every x ∈ D_f ⊆ [ℓ]^n the algorithm outputs f(x) with probability at least 2/3. The query complexity of such an algorithm is the number of queries, i.e., the number of O_x's in the sequence of unitaries. Q(f) denotes the quantum query complexity of f, which is the minimum query complexity among all quantum algorithms that compute f.
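In the binary case this oracle is the familiar |j⟩|b⟩ ↦ |j⟩|b ⊕ x_j⟩; in general it adds x_j modulo ℓ. The sketch below (our own illustration, using a flattened index j·ℓ + b for the two registers) builds this permutation matrix and checks that it is unitary.

```python
import numpy as np

def query_oracle(x, ell):
    """Permutation matrix for O_x |j>|b> = |j>|b + x_j mod ell>, acting on
    C^n (tensor) C^ell.  A sketch of the standard query oracle."""
    n = len(x)
    O = np.zeros((n * ell, n * ell))
    for j in range(n):
        for b in range(ell):
            # basis state |j>|b> is mapped to |j>|b + x_j mod ell>
            O[j * ell + (b + x[j]) % ell, j * ell + b] = 1.0
    return O

O = query_oracle([2, 0, 1], ell=3)
assert np.allclose(O @ O.T, np.eye(9))   # a permutation matrix is unitary
```

Since O_x is a permutation of the computational basis, applying it twice with x and then −x mod ℓ recovers the input state, which is the reversibility mentioned above.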

Dual adversary bound
The generalized adversary bound [HLŠ07] gives a lower bound on the quantum query complexity of any function f : D_f → [m] with D_f ⊆ [ℓ]^n. This bound can be obtained via a semi-definite program (SDP) whose optimal value, based on the duality of SDPs, has been shown to be equal to that of the following SDP up to a factor of at most 2 [LMRŠ].
min max_{x∈D_f} max { Σ_{j=1}^n ‖|u_{xj}⟩‖² , Σ_{j=1}^n ‖|w_{xj}⟩‖² }
s.t. Σ_{j : x_j ≠ y_j} ⟨u_{xj}|w_{yj}⟩ = 1 − δ_{f(x),f(y)}, ∀x, y ∈ D_f. (1)

Here the optimization is over vectors |u_{xj}⟩, |w_{xj}⟩. This SDP is called the dual adversary bound and is proved by Lee et al. [LMR+11] to be an upper bound on the quantum query complexity of the function f as well. Thus, the above SDP characterizes the quantum query complexity of f up to a constant factor. Moreover, in order to design quantum query algorithms and quantum query complexity upper bounds, it is enough to find a feasible solution of the SDP (1). Function evaluation is a special case of a more general problem called state generation [Shi02, AMRR11]. In the state generation problem, the goal is to generate a state |ψ_x⟩ (which depends on x) up to a constant error, given query access to x ∈ D. That is, the quantum query algorithm is required to output some state ρ_x such that ‖ρ_x − |ψ_x⟩⟨ψ_x|‖_tr ≤ 0.1, where ‖·‖_tr denotes the trace distance. Of course, the function evaluation problem is the special case of the state generation problem in which |ψ_x⟩ = |f(x)⟩. It has been shown in [LMR+11] that a generalization of the SDP (1) characterizes the quantum query complexity of the state generation problem up to a constant factor. This generalized SDP, again called the dual adversary bound, is as follows:

min max_{x∈D} max { Σ_{j=1}^n ‖|u_{xj}⟩‖² , Σ_{j=1}^n ‖|w_{xj}⟩‖² }
s.t. Σ_{j : x_j ≠ y_j} ⟨u_{xj}|w_{yj}⟩ = 1 − ⟨ψ_x|ψ_y⟩, ∀x, y ∈ D, (2)

where again the optimization is over vectors |u_{xj}⟩, |w_{xj}⟩. Observe that this SDP depends only on the Gram matrix (⟨ψ_x|ψ_y⟩)_{x,y} of the target vectors. Moreover, letting |ψ_x⟩ = |f(x)⟩ for some function f, we recover (1).

Non-binary span program
The span program, introduced in [Rei09], is another algebraic tool that, similar to the dual adversary bound, characterizes the quantum query complexity of binary functions up to a constant factor. This model has been used for designing quantum query algorithms for binary decision functions by Reichardt and Špalek [RŠ12]. The notion of span program was generalized to functions with non-binary inputs in [IJ15]. Later, it was further generalized to arbitrary non-binary functions with non-binary input/output alphabets [BT19]. In this paper we use a special form of the non-binary span program of [BT19], called a non-binary span program with orthogonal inputs (NBSPwOI), which characterizes the quantum query complexity of any function f : D_f → [m] with D_f ⊆ [ℓ]^n up to a factor of √(ℓ−1). Here, since we will use non-binary span programs only for functions with binary inputs (ℓ = 2), we may focus on this special form.
An NBSPwOI P for f consists of a finite-dimensional inner product space V, target vectors |t_α⟩ ∈ V for α ∈ [m], and finite sets of input vectors I_{j,q} ⊆ V for every 1 ≤ j ≤ n and q ∈ [ℓ]. Then I ⊆ V is defined by I = ∪_{j,q} I_{j,q}, and for every x ∈ D_f the set of available vectors I(x) is defined by I(x) = ∪_{j=1}^n I_{j,x_j}. Indeed, when the j-th coordinate of x is equal to q (i.e., x_j = q), the vectors in I_{j,q} become available. We also let A be the d × |I| matrix consisting of all input vectors as its columns, where d = dim V.
We say that P evaluates the function f if for every x ∈ D_f, the target vector |t_α⟩ belongs to the span of the available vectors I(x) if and only if α = f(x). Even more, there should be two witnesses indicating this, namely a positive witness |w_x⟩ ∈ C^{|I|} and a negative witness |w̄_x⟩ ∈ V satisfying the following conditions:
• A|w_x⟩ = |t_{f(x)}⟩, and the coordinates of |w_x⟩ associated to unavailable vectors are zero;
• ⟨w̄_x|v⟩ = 0 for every available vector |v⟩ ∈ I(x), and ⟨w̄_x|t_α⟩ = 1 for every α ≠ f(x).
Let the positive and negative complexities of P, together with the collections w and w̄ of positive and negative witnesses, be

wsize⁺(P, w) = max_{x∈D_f} ‖|w_x⟩‖²,   wsize⁻(P, w̄) = max_{x∈D_f} ‖A†|w̄_x⟩‖².

Then the complexity of (P, w, w̄) is

wsize(P, w, w̄) = √( wsize⁺(P, w) · wsize⁻(P, w̄) ).

It is shown in [BT19] that for any NBSPwOI evaluating the function f, its complexity wsize(P, w, w̄) is an upper bound on Q(f). Furthermore, there always exists an associated NBSPwOI whose complexity is bounded by O(√(ℓ−1) Q(f)). Thus, NBSPwOIs characterize the quantum query complexity of all functions up to a factor of √(ℓ−1).

From decision trees to span programs
In this section we first give a simple proof of the main result of [LL16] based on span programs. Later, getting intuition from this proof, we generalize this result for non-binary functions.
Recall that a classical query algorithm for a function f : {0,1}^n → [m] can be modeled by a binary decision tree T with internal vertices indexed by elements of {1, . . . , n}, edges indexed by {0, 1}, and leaves indexed by elements of [m]. The depth of the decision tree, which we denote by T, is the classical query complexity of this decision tree. In [LL16] it is assumed that there is a further algorithm that predicts the values of the queried bits. That is, at each internal vertex of T it makes a guess for the answer to the associated query. This guess, of course, may depend on the answers to the previous queries. Then it is proven that if for every x ∈ D_f the number of mistakes of the guessing algorithm is at most G, then the quantum query complexity of f is O(√(GT)). We can visualize the guessing algorithm in the decision tree by coloring its edges. For each internal vertex of the decision tree, there are two outgoing edges indexed by 0 and 1, one of which is chosen by the guessing algorithm. We color the chosen one black, and the other one red. We call such a coloring of the edges of the decision tree a guessing-coloring (hereafter, G-coloring). Now once we make a query at an internal vertex, its answer tells us which edge we should take, the black one or the red one. If it is black it means that the guessing algorithm made a correct guess, and if it is red it means that it made a mistake. Therefore, the number of mistakes of the guessing algorithm for every x ∈ D_f equals the number of red edges in the path from the root to the leaf of the tree associated to x.
Here we summarize the notion of G-coloring.
Definition 1 (G-coloring). A G-coloring of a decision tree T is a coloring of its edges by two colors black and red, in such a way that any vertex of T has at most one outgoing edge with black color.
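A G-coloring as in Definition 1 is easy to represent explicitly. The following sketch (our own illustration, with a hypothetical tuple encoding of the tree) follows the path of an input x and counts the red edges it traverses, i.e., the mistakes of the guesser.

```python
# A binary decision tree with a G-coloring, encoded with nested tuples
# (a hypothetical encoding chosen for illustration):
#   internal vertex: ('query', j, guess, child0, child1)  with guess in {0, 1}
#   leaf:            ('leaf', output)
# The edge taken on answer a is black iff a == guess, red otherwise.

def evaluate(tree, x):
    """Follow the path of input x; return (output, #queries, #red edges)."""
    queries = red = 0
    node = tree
    while node[0] == 'query':
        _, j, guess, child0, child1 = node
        queries += 1
        answer = x[j]
        if answer != guess:          # we traverse a red edge: a wrong guess
            red += 1
        node = child1 if answer == 1 else child0
    return node[1], queries, red

# Decision tree for the OR of two bits, guessing 0 everywhere:
tree = ('query', 0, 0,
        ('query', 1, 0, ('leaf', 0), ('leaf', 1)),
        ('leaf', 1))
```

On this tree every root-to-leaf path contains at most one red edge, matching the G = 1 guesser for search discussed in the introduction.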
We can now state the result of [LL16] based on decision trees and the notion of G-coloring.
Theorem 2 (Lin and Lin [LL16]). Assume that we have a decision tree T for a function f : {0,1}^n → [m] whose depth is T. Furthermore, assume that for a G-coloring of the edges of T, the number of red edges in each path from the root to the leaves of T is at most G. Then there exists a quantum query algorithm computing the function f with query complexity O(√(GT)).
We remark that the result of [LL16] also works for randomized algorithms. Nevertheless, here to present our main ideas we first consider deterministic decision trees. Later, randomized query algorithms will be considered as well.
To prove this theorem we design an NBSPwOI for f with complexity O( √ GT ). To present this span program first we need to develop some notations. Let V (T ) be the vertex set of T . Then for every internal vertex v ∈ V (T ), its associated index is denoted by J(v), i.e., J(v) is the index 1 ≤ j ≤ n that is queried by the classical algorithm at node v. The two outgoing edges of v are indexed by elements of {0, 1} and connect v to two other vertices. We denote these vertices by N (v, 0) and N (v, 1). That is, N (v, q), for q ∈ {0, 1}, is the next vertex that is reached from v after following the outgoing edge with label q. We also represent the G-coloring of edges of T by a function C(v, q) ∈ {black, red} where v is an internal vertex, q ∈ {0, 1} and C(v, q) is the color of the outgoing edge of v with label q.
Proof. For every x ∈ D_f there is an associated leaf of the tree T that is reached once we follow the edges of the tree with labels x_j starting from the root. In order to find f(x) it suffices to find this associated leaf, because this is what the classical query algorithm does; once we find the leaf associated to x, we find the path that the classical query algorithm would take and then find f(x). Thus in order to compute f, we may compute another function f̃ which, given x, outputs its associated leaf of T. To prove the upper bound O(√(GT)) on the quantum query complexity, it then suffices to design an NBSPwOI for f̃ with this complexity.
The NBSPwOI is the following:
• the vector space V has the orthonormal basis {|v⟩ : v ∈ V(T)} indexed by the vertices of T;
• the input vectors are I_{j,q} = { √(W_{C(v,q)}) (|v⟩ − |N(v,q)⟩) : v ∈ V(T), J(v) = j }, where W_black and W_red are positive real numbers to be determined;
• the target vectors are indexed by the leaves u of the tree: |t_u⟩ = |r⟩ − |u⟩, where r ∈ V(T) is the root of the tree.
For every vertex v of T we denote by P_v the (unique) path from the root r to the vertex v. Then for every x ∈ D_f there exists a path P_x = P_{f̃(x)} from the root of the decision tree to the leaf f̃(x). Thus the target vector |t_{f̃(x)}⟩ equals the telescoping sum

|t_{f̃(x)}⟩ = |r⟩ − |f̃(x)⟩ = Σ_{v ∈ P_x, v ≠ f̃(x)} ( |v⟩ − |N(v, x_{J(v)})⟩ ),

where each summand is proportional to an input vector that is available for x. This yields a positive witness |w_x⟩ whose coordinate on the input vector associated to an edge of P_x with color C equals 1/√(W_C). Then, since by assumption the number of red edges along the path P_x is at most G and the number of all edges is at most T, the positive complexity is bounded by ‖|w_x⟩‖² ≤ T/W_black + G/W_red. We let the negative witness for x be

|w̄_x⟩ = Σ_{v ∈ P_x} |v⟩.

It is easy to verify that |w̄_x⟩ is orthogonal to all available vectors, and that ⟨w̄_x|t_u⟩ = ⟨w̄_x|r⟩ = 1 for all u ≠ f̃(x). Thus |w̄_x⟩ is a valid negative witness. Moreover, an input vector of the form √(W_{C(v,q)}) (|v⟩ − |N(v,q)⟩) contributes to the negative witness size only if its corresponding edge {v, N(v,q)} leaves the path P_x, i.e., the edge and the path have only the vertex v in common. In this case the contribution equals W_{C(v,q)}, the weight of that edge. The number of such red (black) edges equals the number of black (red) edges in P_x, which is bounded by T (G). Therefore, the negative witness size is bounded by ‖A†|w̄_x⟩‖² ≤ W_red T + W_black G. Choosing W_black = 1 and W_red = G/T, the positive and negative complexities are bounded by 2T and 2G, respectively, and hence wsize(P, w, w̄) ≤ √(2T · 2G) = O(√(GT)).
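As a sanity check of this construction, the following sketch (our own worked example, not from the paper) builds the span program for the decision tree of the OR of two bits with the all-zeros guesser, and numerically verifies the span condition and both witness conditions for every input.

```python
import numpy as np

# Vertices: root r (queries x_0), vertex a (queries x_1 after x_0 = 0),
# leaves u1 (output 1), u00 (output 0), u01 (output 1).
idx = {'r': 0, 'a': 1, 'u1': 2, 'u00': 3, 'u01': 4}
dim = len(idx)
T, G = 2, 1
W = {'black': 1.0, 'red': G / T}          # weights W_black = 1, W_red = G/T

def e(v):
    vec = np.zeros(dim)
    vec[idx[v]] = 1.0
    return vec

edges = [  # (j, q, color, v, N(v, q)); guessing 0 makes the 1-edges red
    (0, 0, 'black', 'r', 'a'), (0, 1, 'red', 'r', 'u1'),
    (1, 0, 'black', 'a', 'u00'), (1, 1, 'red', 'a', 'u01'),
]
inputs = [(j, q, np.sqrt(W[c]) * (e(v) - e(n))) for j, q, c, v, n in edges]
targets = {u: e('r') - e(u) for u in ('u1', 'u00', 'u01')}

def leaf_and_path(x):
    if x[0] == 1:
        return 'u1', ['r', 'u1']
    if x[1] == 1:
        return 'u01', ['r', 'a', 'u01']
    return 'u00', ['r', 'a', 'u00']

def in_span(vecs, t):
    A = np.column_stack(vecs)
    resid = t - A @ np.linalg.lstsq(A, t, rcond=None)[0]
    return np.linalg.norm(resid) < 1e-9

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    leaf, path = leaf_and_path(x)
    avail = [vec for j, q, vec in inputs if q == x[j]]
    assert in_span(avail, targets[leaf])          # target of reached leaf spanned
    wbar = sum(e(v) for v in path)                # negative witness
    assert all(abs(wbar @ vec) < 1e-9 for vec in avail)
    assert all(abs(wbar @ targets[u] - 1) < 1e-9 for u in targets if u != leaf)
```

Since the negative witness is orthogonal to all available vectors but has inner product 1 with the other targets, those targets cannot lie in the available span, exactly as the definition requires.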

Main result: generalization to the non-binary case
This section contains our main result, which is a generalization of Theorem 2 to functions f : D_f → [m] with D_f ⊆ [ℓ]^n. In this case, a classical query algorithm corresponds to a decision tree whose internal vertices have out-degree ℓ (instead of 2). Moreover, a G-coloring can be defined similarly based on a guessing algorithm. Yet, we are interested in a further generalization of the notion of decision tree, which we explain by an example. Consider the following trivial algorithm for finding the minimum of a list of numbers in [ℓ]: we keep a candidate minimum, and as we query the numbers in the list one by one, we update it once we reach a smaller number. In this algorithm, the possible answers to a query are of two types: numbers that are greater than or equal to the current candidate minimum, and those that are smaller. Now assuming that the answer to a query is of the first type, what we do next is independent of its exact value (since we simply ignore it and query the next index). Considering the associated decision tree T, for each vertex v we have a candidate minimum, and the outgoing edges of v are labeled by different numbers in [ℓ]. Then by the above discussion, the subtrees of T hanging below the outgoing edges whose labels are greater than or equal to the current candidate minimum are identical. Thus we can identify those edges and their associated subtrees. In this case the outgoing edges of v are no longer labeled by elements of [ℓ], but by certain subsets of [ℓ] that form a partition. Indeed, there is an outgoing edge whose label is the subset of numbers greater than or equal to the current candidate minimum, and an outgoing edge for each smaller number. Motivated by the above example of minimum finding, we generalize the notion of decision tree T for a function f : D_f → [m] with non-binary input alphabet (D_f ⊆ [ℓ]^n). As before, each internal vertex v of T corresponds to a query index 1 ≤ J(v) ≤ n.
Each outgoing edge of this vertex is labeled by a subset of [ℓ], and we assume that these subsets form a partition of [ℓ]. For q ∈ [ℓ], we denote by Q_v(q) the part of this partition containing q; see Figure 1 for an example of a decision tree. Now given a decision tree T as above, the corresponding classical algorithm works as follows. We start with the root r of the tree and query J(r). Then x_{J(r)} ∈ [ℓ] corresponds to the outgoing edge of r with label Q_r(x_{J(r)}). We take that edge and move to the next vertex N(r, Q_r(x_{J(r)})). We continue until we reach a leaf of the tree, which determines the value of f(x).
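To make the partitioned-answer model concrete, the following sketch (our own illustration) simulates minimum finding in this model: each query answer is collapsed to either the black part {q : q ≥ m_v} or a red singleton {q} with q < m_v.

```python
def find_min_partitioned(x, ell):
    """Simulate the generalized decision tree for minimum finding.
    At each vertex the answers in [ell] are partitioned into the guessed
    (black) part {q : q >= current_min} and red singletons {q} for
    q < current_min.  Returns (minimum, #red edges taken)."""
    current_min = ell            # sentinel larger than any possible value
    red = 0
    for xj in x:                 # query the coordinates one by one
        if xj < current_min:     # we follow a red edge: the guess was wrong
            red += 1
            current_min = xj     # update the candidate minimum
        # else: the answer falls in the black part; its exact value is ignored
    return current_min, red

# The number of red edges equals the number of prefix minima of the list:
m, g = find_min_partitioned([5, 3, 4, 1, 2], ell=8)
# here m == 1 and g == 3 (updates at 5, 3 and 1)
```

Note that the traversal never distinguishes between answers inside the black part, which is exactly why the identical subtrees can be merged.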
The notion of G-coloring can also be generalized similarly. Recall that a G-coloring comes from a guessing algorithm that at each step predicts the answer to the queried index. In our generalized decision tree, whose edges are labeled by subsets of [ℓ], we assume that the guessing algorithm chooses one of these subsets as its guess. Rephrasing this in terms of colors, we assume that for each internal vertex v of T, one of its outgoing edges is colored black (meaning that its label is the predicted answer) and its other outgoing edges are colored red. We denote the color of the outgoing edge of v with label Q_v(q) by C(v, Q_v(q)). Here is a summary of the notions of generalized decision tree and G-coloring explained above.

Definition 3 (Generalized decision tree and G-coloring). A generalized decision tree T over the input alphabet [ℓ] is a rooted tree in which every internal vertex v is associated with a query index 1 ≤ J(v) ≤ n, and the outgoing edges of v are labeled by subsets of [ℓ] that form a partition of [ℓ]. We say that T decides a function f : D_f → [m], with D_f ⊆ [ℓ]^n, if for every x ∈ D_f, by starting from the root of T and following edges labeled by Q_v(x_{J(v)}), we reach a leaf labeled by f(x). As in Definition 1, a G-coloring of a generalized decision tree T is a coloring of its edges by two colors, black and red, in such a way that any vertex of T has at most one outgoing edge with black color.
We also consider randomized classical query algorithms. In this case, for each value ζ of the outcomes of some coin tosses, we have a (deterministic) generalized decision tree T_ζ as above. We also assume that each of these decision trees T_ζ is equipped with a guessing algorithm, which itself may be randomized. Nevertheless, we may assume with no loss of generality that ζ includes the randomness of the guessing algorithm as well. Therefore, for any ζ we have a generalized decision tree with a G-coloring as before. We assume that the classical randomized query algorithm outputs the correct answer f(x) with high probability:

Pr_ζ [ f_ζ(x) = f(x) ] ≥ 0.9, ∀x ∈ D_f, (5)

where f_ζ denotes the function computed by T_ζ. The complexity of such a randomized query algorithm is given by the expectation of the number of queries over the random choice of ζ.
We can now state our generalization of Theorem 2.

Theorem 4. (i) Let T be a generalized decision tree that decides a function f : D_f → [m] with D_f ⊆ [ℓ]^n, equipped with a G-coloring. Let T be the depth of T and let G be the maximum number of red edges in a path from the root to a leaf of T. Then the quantum query complexity of f is O(√(GT)).

(ii) Let {T_ζ : ζ} be a set of generalized decision trees corresponding to a randomized classical query algorithm evaluating f with bounded error as in (5). Moreover, suppose that each T_ζ is equipped with a G-coloring. Let P^ζ_x be the path from the root to the leaf of T_ζ associated to x ∈ D_f. Let T^ζ_x be the length of the path P^ζ_x, and let G^ζ_x be the number of red edges in this path. Define T = max_{x∈D_f} E_ζ[T^ζ_x] and G = max_{x∈D_f} E_ζ[G^ζ_x], where the expectation is over the random choice of ζ. Then the quantum query complexity of f is O(√(GT)).

The span program in the proof of Theorem 2 can easily be adapted to a proof of the above theorem, yet in the complexity of the resulting span program we see an extra factor of √(ℓ−1), i.e., we get the upper bound O(√((ℓ−1)GT)) on the quantum query complexity. To remove this undesirable factor, getting ideas from the span program in the proof of Theorem 2, we directly construct a feasible solution of the dual adversary SDP (1). Indeed, our starting point for proving Theorem 4 is the proof of Theorem 2 based on span programs. Then, getting intuition from this proof, we design a feasible solution of the dual adversary SDP with the desired objective value.
Proof. (i) Let V_j(T) be the set of vertices of T associated with query index j, i.e., V_j(T) = J^{−1}(j). Also let P_x be the path from the root r to the leaf of T associated to x ∈ D_f. We can assume with no loss of generality that V_j(T) ∩ P_x contains at most one vertex, since otherwise in computing f(x) we would be querying index j more than once.
To construct the feasible solution of the dual adversary SDP we will need two sets of vectors {|µ_q⟩ : q ∈ [ℓ]} and {|ν_q⟩ : q ∈ [ℓ]} in C^ℓ with the properties that ‖|µ_q⟩‖² = ‖|ν_q⟩‖² = 2(ℓ−1)/ℓ ≤ 2 for all q, and ⟨µ_q|ν_{q'}⟩ = 1 − δ_{q,q'} [LMRŠ]. Also we use sets of vectors {|µ̃_α⟩ : α ∈ [m]} and {|ν̃_α⟩ : α ∈ [m]} in C^m defined similarly, with the properties that ‖|µ̃_α⟩‖² = ‖|ν̃_α⟩‖² = 2(m−1)/m ≤ 2 for all α, and ⟨µ̃_α|ν̃_β⟩ = 1 − δ_{α,β}. Now define vectors |u_{xj}⟩ and |w_{xj}⟩ in the vector space C^{V(T)} ⊗ C^{{black,red}} ⊗ C^{2^{[ℓ]}} ⊗ C^m as follows: if there is a (unique) vertex v ∈ P_x ∩ V_j(T), then |u_{xj}⟩ is defined in terms of the label and color of the outgoing edge of v with label Q_v(x_j), scaled by the square root of the corresponding weight W_{C(v,Q_v(x_j))}, while |w_{xj}⟩ carries the inverse weights; if P_x ∩ V_j(T) is empty, both vectors are zero. We claim that these vectors form a feasible solution of the SDP (1). For every x, y ∈ D_f, the terms of Σ_{j : x_j ≠ y_j} ⟨u_{xj}|w_{yj}⟩ coming from the vertex at which the paths P_x and P_y diverge give the required value. Moreover, for any other index j we have ⟨u_{xj}|w_{yj}⟩ = 0, since for such j's either one of |u_{xj}⟩, |w_{yj}⟩ is zero, or these vectors correspond to different vertices, or they correspond to different labels or colors. Therefore, the vectors |u_{xj}⟩ and |w_{xj}⟩ form a feasible solution of the dual adversary SDP. Now we compute the objective value. By assumption there are at most T black edges and at most G red edges in P_x. Also, the norm-squared of the |µ_q⟩'s and |µ̃_α⟩'s is bounded by 2. Therefore,

Σ_{j=1}^n ‖|u_{xj}⟩‖² ≤ 4(W_black T + W_red G).

Also, in computing Σ_{j=1}^n ‖|w_{xj}⟩‖², for every vertex v ∈ P_x, if C(v, Q_v(x_{J(v)})) = black we get a term of 4/W_red, and if C(v, Q_v(x_{J(v)})) = red we get a contribution of 4(1/W_black + 1/W_red). Now, having a bound on the number of black and red edges in P_x, we find that

Σ_{j=1}^n ‖|w_{xj}⟩‖² ≤ 4(T/W_red + G(1/W_black + 1/W_red)).

Therefore, if we let W_black = 1 and W_red = T/G, then (after a uniform rescaling |u_{xj}⟩ → λ|u_{xj}⟩, |w_{xj}⟩ → λ^{−1}|w_{xj}⟩ that balances the two sums) the objective value of the SDP (1) will be O(√(GT)).
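The vectors |µ_q⟩, |ν_q⟩ can be obtained from any factorization of the matrix J − I (all-ones minus identity), whose (q, q′) entry is 1 − δ_{q,q′}. A minimal sketch, using our own eigendecomposition-based construction rather than necessarily the one of [LMRŠ]:

```python
import numpy as np

def mu_nu_vectors(ell):
    """Return matrices M, N whose columns |mu_q>, |nu_q> satisfy
    <mu_q|nu_q'> = 1 - delta_{q,q'} with squared norms 2(ell-1)/ell."""
    J_minus_I = np.ones((ell, ell)) - np.eye(ell)
    # Symmetric eigendecomposition J - I = Q diag(lam) Q^T; splitting
    # |lam|^(1/2) between the two factors balances the column norms.
    lam, Q = np.linalg.eigh(J_minus_I)
    M = np.sqrt(np.abs(lam))[:, None] * Q.T
    N = (np.sign(lam) * np.sqrt(np.abs(lam)))[:, None] * Q.T
    return M, N                      # M^T N == J - I by construction

ell = 5
M, N = mu_nu_vectors(ell)
assert np.allclose(M.T @ N, np.ones((ell, ell)) - np.eye(ell))
assert np.allclose((M * M).sum(axis=0), 2 * (ell - 1) / ell)   # <= 2
```

The column norms come out to exactly 2(ℓ−1)/ℓ because J − I has one eigenvalue ℓ−1 (uniform eigenvector) and ℓ−1 eigenvalues −1, and each basis vector has weight 1/ℓ on the uniform eigenvector.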
(ii) Let f_ζ : D_f → [m] be the function that is computed by the decision tree T_ζ. Then by assumption (5), for every x ∈ D_f we have Pr_ζ[f_ζ(x) = f(x)] ≥ 0.9. On the other hand, by part (i), for every ζ there is a feasible solution |u^ζ_{xj}⟩, |w^ζ_{xj}⟩ of the dual adversary SDP for f_ζ. Let us define

|u_{xj}⟩ = (1/√K) Σ_ζ |ζ⟩ ⊗ |u^ζ_{xj}⟩   and   |w_{xj}⟩ = (1/√K) Σ_ζ |ζ⟩ ⊗ |w^ζ_{xj}⟩,

where K is the number of possible values that ζ takes. Then we have Σ_{j : x_j ≠ y_j} ⟨u_{xj}|w_{yj}⟩ = (1/K) Σ_ζ (1 − δ_{f_ζ(x),f_ζ(y)}). Now define

|ψ_x⟩ = (1/√K) Σ_ζ |ζ⟩ ⊗ |f_ζ(x)⟩,

and consider the state generation problem for these vectors. Observe that ⟨ψ_x|ψ_y⟩ = (1/K) Σ_ζ δ_{f_ζ(x),f_ζ(y)}, so the vectors |u_{xj}⟩, |w_{xj}⟩ form a feasible solution of the SDP (2) for this state generation problem. Moreover, measuring (an approximation of) |ψ_x⟩ in the second register yields f_ζ(x) for a uniformly random ζ, which by (5) equals f(x) with probability at least 0.9. Accounting for the constant error of the state generation, we conclude that there is a quantum query algorithm which makes O(M) quantum queries and outputs f(x) with probability at least 0.8, where M is the objective value of the above feasible solution of the dual adversary bound. Thus we only need to bound M. We compute

Σ_{j=1}^n ‖|u_{xj}⟩‖² = (1/K) Σ_ζ Σ_{j=1}^n ‖|u^ζ_{xj}⟩‖² ≤ 4 E_ζ[W_black T^ζ_x + W_red G^ζ_x] ≤ 4(W_black T + W_red G),

and similarly for Σ_{j=1}^n ‖|w_{xj}⟩‖². Then, as before, letting W_black = 1 and W_red = T/G, we find that the objective value of this feasible solution is bounded by M = O(√(GT)). We are done.
In the proof of Theorem 4 we assigned two different weights to the edges of a decision tree based on their colors: the weight of any red edge is W_red and the weight of any black edge is W_black. One may suggest that by assigning different weights to the edges of T we may get better bounds. That is, for any internal vertex v of T, we may choose two weights W_{v,black}, W_{v,red} and assign them to the outgoing edges of v with the corresponding colors. Then the proof of Theorem 4 can be adapted to get a bound of the form O(max_{x,y} √(M⁺_x M⁻_y)) on the quantum query complexity, where M⁺_x and M⁻_y denote the resulting bounds on Σ_j ‖|u_{xj}⟩‖² and Σ_j ‖|w_{yj}⟩‖², respectively. A simple application of the Cauchy–Schwarz inequality shows that rescaling all the weights by a common factor changes M⁺_x and M⁻_y by reciprocal factors and hence does not change this bound; as a result, with no loss of generality we may normalize the weights. Nevertheless, we still have the freedom to choose different weights for different vertices of the decision tree T. These weights could depend on some parameter of the state of the algorithm (decision tree) that is updated as we proceed. Moreover, they could depend on the guessing algorithm, e.g., on the number of red edges we have seen so far. In the following theorem we analyze the latter option, and leave further investigation of this idea for future work.
Theorem 5. Let {T_ζ : ζ} be a set of generalized decision trees corresponding to a randomized classical query algorithm evaluating f with bounded error as in (5). Moreover, suppose that each T_ζ is equipped with a G-coloring. Let P^ζ_x be the path from the root to the leaf of T_ζ associated to x ∈ D_f. Let G^ζ_x be the number of red edges in P^ζ_x, and for 1 ≤ g ≤ G^ζ_x, let T^ζ_{g,x} be the number of black edges in P^ζ_x after the g-th red edge and before the next red one. Also let T^ζ_{0,x} be the number of black edges before the first red edge in P^ζ_x, and let T_g = max_{x∈D_f} E_ζ[T^ζ_{g,x}], where the expectation is over the random choice of ζ. Then the quantum query complexity of f is O(Σ_{g≥0} √(T_g)).

Proof. The proof is similar to the proof of Theorem 4 except that we pick different weights for the edges of the decision trees. Using the notation introduced before, for any choice of ζ and its associated decision tree T_ζ, we define the vectors |u^ζ_{xj}⟩ and |w^ζ_{xj}⟩ as in the proof of Theorem 4, except that the outgoing edges of a vertex v now receive the weights W_{g(v),black} and W_{g(v),red} according to their colors, where g(v) is the number of red edges in the path from the root of T_ζ to v, and W_{g,black}, W_{g,red}, for any g ≥ 0, are positive weights to be determined. As before, these vectors form a feasible solution of the SDP (1) for the function f_ζ. Then we define the averaged vectors |u_{xj}⟩, |w_{xj}⟩ and the states |ψ_x⟩ as in the proof of part (ii) of Theorem 4. As before, we obtain a feasible solution of the SDP (2) whose objective value is an upper bound on the quantum query complexity of f. We estimate the objective value as follows: letting W_{g,black} = 1/√(T_g) and W_{g,red} = √(T_g), we obtain Σ_{j=1}^n ‖|u_{xj}⟩‖² = O(Σ_{g≥0} √(T_g)). We similarly obtain the same upper bound on Σ_{j=1}^n ‖|w_{xj}⟩‖². Then the quantum query complexity of f is O(Σ_{g≥0} √(T_g)), as claimed.

Applications
We can use our main result, Theorem 4, to simplify the proof of some known quantum query complexity bounds as well as to derive new bounds. We start with some simple examples.

(i) [counting] The quantum query complexity of finding all input indices with values equal to a given q ∈ [ℓ] is O(√(rn)), where r = |{j : x_j = q}|.
(ii) [k-threshold] The quantum query complexity of deciding whether |{j : x_j = q}| ≥ k is O(√(kn)).

It is shown that the quantum query complexity of counting equals Θ(√(rn)) [BHT]. Also, it is well known that the k-threshold problem has quantum query complexity O(√(kn)).
Proof. (i) In order to use Theorem 4 we first need a classical query algorithm. Suppose that we start from the first index and query all the indices one by one. We then output the set of indices j with x_j = q. Next we need a G-coloring. To this end, observe that the algorithm is ignorant of the exact value of an index x_j once it makes sure that x_j ≠ q. Thus in the associated decision tree T we can unify all outgoing edges of a vertex with labels q' ≠ q. That is, in T any vertex has two outgoing edges, labeled by {q} and [ℓ] \ {q}. Now we color all edges with label {q} red and all edges with label [ℓ] \ {q} black. In this coloring there are at most r red edges in any path from the root to the leaves: G = r. The depth of the decision tree is T = n. As a result the quantum query complexity of counting is O(√(rn)).
(ii) The proof is similar to that of part (i). In the classical algorithm we query indices one by one until we find k indices j with x_j = q. Then in T we unify the edges with labels q' ≠ q and color them black, and color the edges with label {q} red. As the algorithm stops once it finds k indices with value q, the number of red edges in any path in T from the root to the leaves is at most G = k. Also the depth of the tree is T = n. Therefore the quantum query complexity of the threshold problem is O(√(kn)).

Next, we turn to the problem of finding the minimum of a list of numbers.

Proposition 6. (i) The quantum query complexity of finding the minimum of a list of n numbers in [ℓ] is O(√(n log n)).
(ii) The quantum query complexity of finding the k smallest numbers of such a list is O(√(kn log n)).

First, we remark that the bounds in this proposition are not optimal. Yet, we would like to present these results since they show how randomization (part (ii) of Theorem 4) may help to improve upper bounds on the quantum query complexity.
Second, observe that a list of numbers may have several minima, so the problems in this proposition are not really function problems. To turn them into functions we may assume that our goal is to find the minimum number in the list whose index is also minimum. In other words, we consider a new order "≺" such that x_i ≺ x_j if x_i < x_j, or if x_i = x_j and i < j. Now the minimum in this order is unique and we may ask to find it.
Proof. (i) Consider the randomized classical algorithm that queries all indices one by one in a random order. The algorithm keeps a candidate minimum at each step, and updates it once it reaches a smaller number. Observe that this algorithm is ignorant of the exact answer to a query once it makes sure that it is not smaller than the current candidate minimum. Thus in the associated decision tree (for any choice of random order ζ), at any internal vertex v we can unify the outgoing edges with labels in {q : q ≥ m_v}, where m_v is the candidate minimum at vertex v. Thus in T_ζ any internal vertex v has an outgoing edge with label {q : q ≥ m_v} and an outgoing edge for every q < m_v. The former edge is colored black and the latter edges are colored red. The depth of T_ζ equals T = n for any ζ. However, for a given x, G^ζ_x depends on ζ, so we should compute G = max_x E_ζ[G^ζ_x]. We claim that G = O(log n). Intuitively speaking, the expected number of x_j's that are smaller than the first queried element is about n/2, and the guessing algorithm makes no mistakes when querying the larger x_j's. Thus, after the first query, in expectation half of the x_j's become irrelevant in computing G. Repeating this argument, we obtain G = O(log n). Below we present a more precise argument for this claim. We can assume with no loss of generality that x_1 < · · · < x_n, since at the beginning of the algorithm we apply a random permutation. If in the random permutation ζ = (ζ(1), . . . , ζ(n)) the first element is n, i.e., ζ(1) = n, then G^ζ_n = G^{ζ'}_{n−1} + 1, where ζ' = (ζ(2), . . . , ζ(n)). Otherwise, if ζ(1) ≠ n, then G^ζ_n = G^{ζ'}_{n−1}, where ζ' is the same order as ζ from which n is removed. We conclude that, letting G_n = E_ζ[G^ζ_n], we have G_n = G_{n−1} + 1/n.
Using G_1 = 1 we obtain G_n = 1 + 1/2 + · · · + 1/n = O(log n). As a result, G = O(log n), and by Theorem 4 the quantum query complexity of finding the minimum is bounded by O(√(n log n)).
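The recursion G_n = G_{n−1} + 1/n says that E[G_n] equals the n-th harmonic number; a quick Monte Carlo sanity check of this claim (a sketch, all names ours):

```python
import random

def mistakes(x, order):
    """Count the guessing mistakes: one each time a new minimum is found."""
    g, current = 0, None
    for i in order:
        if current is None or x[i] < current:
            g += 1          # the guess "not smaller than the candidate" fails
            current = x[i]
    return g

def expected_mistakes(n, trials=5000, seed=1):
    """Estimate E[G_n] over a uniformly random query order."""
    rng = random.Random(seed)
    x = list(range(n))      # wlog x_1 < ... < x_n
    total = 0
    for _ in range(trials):
        order = list(range(n))
        rng.shuffle(order)
        total += mistakes(x, order)
    return total / trials

harmonic = sum(1.0 / k for k in range(1, 101))
print(expected_mistakes(100), harmonic)  # both close to H_100 ≈ 5.19
```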
(ii) The proof is similar to that of part (i). Again we read the numbers in a random order and keep a list of k numbers as our candidate for S, updating it whenever we reach a number that is smaller than the largest number in the list. The associated decision tree and its G-coloring are as before, and again T = n. Also, by ideas similar to those in the proof of part (i), it can be shown that G_n = G_{n−1} + k/n, because the largest element causes a mistake iff it appears among the first k positions of the random permutation, which happens with probability k/n. Therefore, G_n = k + k(1/(k+1) + · · · + 1/n) = O(k log n). We conclude that the quantum query complexity of finding the k smallest numbers is bounded by O(√(kn log n)).
Motivated by Proposition 6 we can state the following general upper bound on the quantum query complexity of functions.

Corollary 8. Let f : [ℓ]^n → [m] be a function computed by a classical decision tree of depth T, and for q ∈ [ℓ] let r_q(x) denote the number of indices i with x_i ≠ q. Then the quantum query complexity of f is O(√(gT)), where g = min_{q∈[ℓ]} max_{x∈D_f} r_q(x).

Proof. We prove this corollary using Theorem 4. Given the classical algorithm for f, for a G-coloring of the edges of the associated decision tree, color every edge of the decision tree with label q_0 black and the rest of the edges red, where q_0 is such that g = max_{x∈D_f} r_{q_0}(x). Since each x ∈ D_f contains at most g indices whose values differ from q_0, every path from the root to a leaf of the decision tree contains at most G = g red edges. Then the quantum query complexity of f is O(√(GT)) = O(√(gT)).

Graph properties in the adjacency matrix model
In this subsection and the following one we use Theorem 4 to prove quantum query complexity upper bounds on some graph theoretic problems. In this subsection, we assume that the graph is given in the adjacency matrix model, by which we mean that the queries are from the entries of the adjacency matrix of the graph. That is, given vertices u, v of the graph, we may ask whether there is an edge between u and v or not. Sometimes we assume that the underlying graph is directed in which case we ask whether there is a directed edge from u to v. Inspired by the ideas in [LL16], we make use of the well-known Breadth First Search algorithm (BFS, see Algorithm 1) as our starting point for designing classical algorithms for some graph theoretic problems. The point of the BFS algorithm is that it returns a spanning tree (forest), with at most n − 1 edges, of the underlying graph. Thus if we always guess that there is no edge between two queried vertices, we make at most n − 1 mistakes.
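The following Python sketch renders this BFS primitive in the adjacency matrix model, instrumented with the two quantities we care about: the number of queries (at most n²) and the number of mistakes of the always-guess-no-edge strategy (at most n − 1, one per tree edge). Variable names are ours.

```python
from collections import deque

def bfs_forest(adj):
    """BFS in the adjacency matrix model.

    adj[u][v] == 1 iff {u, v} is an edge.  We guess 'no edge' for every
    query; each mistake corresponds to adding a tree edge, so G <= n - 1.
    """
    n = len(adj)
    queries = mistakes = 0
    parent = [None] * n
    seen = [False] * n
    for root in range(n):                  # a forest: restart at each root
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if seen[v]:
                    continue               # the answer is irrelevant here
                queries += 1               # one query to the matrix entry
                if adj[u][v]:
                    mistakes += 1          # the guess '0' was wrong
                    seen[v] = True
                    parent[v] = u
                    queue.append(v)
    return parent, queries, mistakes

# A path 0-1-2 plus an isolated vertex 3.
adj = [[0,1,0,0],[1,0,1,0],[0,1,0,0],[0,0,0,0]]
parent, T, G = bfs_forest(adj)
print(parent, G)   # G = 2: the two tree edges that were discovered
```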

Algorithm 1 BFS(G): breadth first search algorithm on graph G
1: Let L be a list of unprocessed vertices, initialized to V(G), and let Q be a first-in first-out queue, initially empty.
2: while L is non-empty do
3:    Remove a vertex v from L and add it to Q.
4:    while Q is non-empty do
5:       Remove the first vertex u of Q.
6:       for every vertex w ∈ L do
7:          Query the pair (u, w).
8:          if there is an edge between u and w then
9:             Add the edge {u, w} to the spanning forest, remove w from L and add it to Q.
10: return the spanning forest.

Proposition 9. Let G be a graph on n vertices given in the adjacency matrix model. Then:

(i) [bipartiteness] The quantum query complexity of deciding whether G is bipartite is O(n^{3/2}).

(ii) [cycle detection] The quantum query complexity of deciding whether G is a forest or has a cycle is O(n^{3/2}).
(iii) [directed st-connectivity] The quantum query complexity of finding a shortest path (a path consisting of the fewest edges) between two vertices s and t in G is O(n^{3/2}). This holds for both directed and undirected graphs.
(iv) [smallest cycle containing a vertex] The quantum query complexity of finding the length of the smallest directed cycle containing a given vertex v in a directed graph G is Θ(n^{3/2}).
(v) [k-cycle containing a vertex] The quantum query complexity of deciding whether G has a cycle of length k, for a fixed k, containing a given vertex v is O((2k)^{k−1} n^{3/2}).
The problem of bipartiteness was first shown in [Āri15] to have quantum query complexity O(n^{3/2}), which was shown to be tight in [Zha05]. An algorithm for the problem of cycle detection with O(n^{3/2}) queries is proposed in [CMB]; it works by reducing the problem to the st-connectivity problem, and this upper bound is known to be tight [CK]. The directed st-connectivity problem was first shown to have query complexity Θ(n^{3/2}) in [DHHM06]. There exists a quantum query algorithm for deciding whether G contains a cycle of length less than k containing a given vertex v with query complexity O(n√k) [CMB]. For a list of related algorithms on cycle detection consult [Cir06].
We would like to remark that the space complexity of all BFS/DFS-based quantum query algorithms in this subsection and the next one is linear in the size of the input graph. This is because our algorithms are based on feasible solutions of the dual adversary SDP that are obtained from a generalized decision tree. The point is that the space complexity of such an algorithm equals the logarithm of the dimension of the vectors in the feasible solution of the dual adversary SDP, which itself is of the order of the size of the decision tree; since the decision tree has exponential size, the space complexity is linear.
Proof. (i) A graph G is bipartite iff its vertices can be properly colored with two colors blue and green (such that no two adjacent vertices have the same color).
Here is a classical algorithm to solve bipartiteness. We run the BFS algorithm (Algorithm 1) that outputs a spanning forest S of G. Then we color every vertex of G with odd depth in S blue, and every vertex of G with even depth in S green. After this coloring, we search for an edge between two vertices with the same color in G. If no such edge exists, then G is bipartite.
In order to use Theorem 4, in the associated decision tree T of the above algorithm, color every outgoing edge of T with label 1 red, and the rest of the edges black. The depth of the decision tree is T ≤ n², as the total number of possible queries (possible edges) of G is n(n − 1)/2. Also, by the above coloring of the edges of T, we see at most n red edges in every path from the root to a leaf of T: at most n − 1 red edges while we build the spanning forest S, and at most one red edge while we search for an edge of G between vertices whose depths have the same parity. Thus G ≤ n and the quantum query complexity of bipartiteness is at most O(√(GT)) = O(n^{3/2}).
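A sketch of the classical bipartiteness algorithm just described (BFS 2-coloring followed by a scan for a monochromatic edge); with the always-guess-no-edge strategy, the mistakes are at most n − 1 in the first phase and 1 in the second:

```python
from collections import deque

def is_bipartite(adj):
    """BFS 2-coloring by depth parity, then search for an edge between
    two vertices of the same color; adj is the adjacency matrix."""
    n = len(adj)
    color = [None] * n                     # depth parity in the BFS forest
    for root in range(n):
        if color[root] is not None:
            continue
        color[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if color[v] is None and adj[u][v]:
                    color[v] = 1 - color[u]
                    queue.append(v)
    # second phase: search for a monochromatic edge
    for u in range(n):
        for v in range(u + 1, n):
            if adj[u][v] and color[u] == color[v]:
                return False               # first red edge: halt
    return True

square = [[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]]   # 4-cycle: bipartite
triangle = [[0,1,1],[1,0,1],[1,1,0]]                  # odd cycle: not
print(is_bipartite(square), is_bipartite(triangle))   # True False
```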
(ii) In a classical algorithm for this problem we first build a BFS forest and then search for an edge of the graph that does not belong to it. If such an edge exists it must belong to a cycle in G. In order to use Theorem 4, in the associated decision tree T we, as before, color every edge with label 0 black and every edge with label 1 red. The depth of the decision tree is T ≤ n², and with this coloring every path from the root to a leaf of the decision tree contains at most G = n red edges. Therefore, the quantum query complexity of the cycle detection problem is O(n^{3/2}).
(iii) Again we run the BFS algorithm on G starting from vertex s to build a subtree S of G with root s. Then a shortest path from s to t, if it exists, belongs to S, and can be found once we have S. The depth of the associated decision tree is T = n². For the G-coloring, as before, we color every edge with label 0 black and the other edges red to get G = n. Then the quantum query complexity of directed st-connectivity is O(n^{3/2}).

(iv) In a classical algorithm for this problem we may run the BFS algorithm starting from vertex v. In parallel, whenever we reach a new vertex u we query whether there is an edge from u to v. Finding such an edge corresponds to a smallest cycle containing v.
As in previous examples, for a G-coloring of the associated decision tree we color every edge with label 0 black and the other edges red; then we have G = n and T = n². Therefore, the quantum query complexity of finding a smallest cycle containing v is O(n^{3/2}). To prove the optimality of this bound we reduce the problem of directed st-connectivity, which has query complexity Ω(n^{3/2}), to this problem. Assume that we are given a graph G and two distinguished vertices s, t ∈ V(G), and we want to decide whether s is connected to t by a directed path or not. To solve this problem we build an auxiliary graph H from G as follows. Observe that if G has a cycle of length k containing v, then with probability at least of all cycles of H are multiples of k. Thus, the aforementioned cycle of H, if it exists, is the smallest possible cycle, and we can decide its existence using the algorithm of part (iv). We can decide the existence of such a cycle with high probability by repeating the above algorithm O((2k)^{k−1}) times.
For the next set of examples we use the well-known classical algorithm Depth First Search (DFS). This algorithm builds a spanning forest of a given graph G. It is similar to the BFS algorithm, but instead of using a queue, which is a first-in first-out list, it uses a stack, which is a first-in last-out list. This algorithm can also be implemented recursively (see Algorithm 2).

Proposition 10. Let G be a graph on n vertices given in the adjacency matrix model. Then:

(i) [topological sort] The quantum query complexity of topologically sorting the vertices of a directed acyclic graph G is O(n^{3/2}).

(ii) [connected components] The quantum query complexity of finding the connected components of G is O(n^{3/2}).

(iii) [strongly connected components] The quantum query complexity of finding the strongly connected components of a directed graph G is O(n^{3/2}). Note that two vertices u, v ∈ V belong to the same strongly connected component iff there exists a directed path from u to v and a directed path from v to u in G.
The problem of topological sort is an important problem in large networks and job scheduling. There are several classical algorithms for this problem. The first is by Kahn [Kah62]: at each step we add all vertices that have no incoming edges to the sorted list, and then eliminate them from the graph; we continue this process until all vertices have been added to the sorted list. Another algorithm for this problem, which we use in this proposition, is based on the DFS algorithm and was first stated by Tarjan [Tar76]. Note that in these classical algorithms one needs to read the entire input to discover the structure of the graph, so their query complexity is O(n²). To the author's knowledge this proposition gives the first non-trivial quantum query complexity upper bound for the topological sort problem. The problem of finding (strongly) connected components of a (directed) graph was first shown to have query complexity Θ(n^{3/2}) in [DHHM06].
Proof. (i) As a classical algorithm for this problem, run DFS and return the vertices in reverse order of their finishing times. For a G-coloring of the associated decision tree T, color every edge with label 0 black and every other edge red. Then, as before, there are at most G = n red edges in every path from the root to a leaf of T. Also the depth of the decision tree is T = n². Thus we obtain the bound of O(√(GT)) = O(n^{3/2}) on the quantum query complexity of topological sort.
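The DFS-based classical algorithm of part (i) can be sketched as follows (recursive DFS on the adjacency matrix, output in reverse finishing order); all names are ours:

```python
def topological_sort(adj):
    """DFS-based topological sort: output vertices in reverse finishing
    order.  adj is the adjacency matrix of a directed acyclic graph."""
    n = len(adj)
    seen = [False] * n
    finish = []                            # vertices in finishing order

    def dfs(u):
        seen[u] = True
        for v in range(n):
            if adj[u][v] and not seen[v]:  # one matrix query per pair
                dfs(v)
        finish.append(u)                   # u finishes after its successors

    for u in range(n):
        if not seen[u]:
            dfs(u)
    return finish[::-1]                    # reverse finishing time

# Edges 0->1, 0->2, 1->3, 2->3.
adj = [[0,1,1,0],[0,0,0,1],[0,0,0,1],[0,0,0,0]]
print(topological_sort(adj))   # [0, 2, 1, 3]: every edge goes left to right
```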
(ii) We again use the DFS algorithm on G and whenever the stack becomes empty a new connected component has been found. The G-coloring of the associated decision tree is as in part (i), and the bound of O(n 3/2 ) is derived similarly.
(iii) As a classical algorithm for this problem we use two DFS calls. In the first one we run the DFS algorithm on the reverse graph G^R, whose adjacency matrix is the transpose of the adjacency matrix of G, i.e., (u, v) is an edge of G^R iff (v, u) is an edge of G. Observe that every query to G^R is equivalent to a query to G. In the second call, DFS is run on the graph G, processing the vertices in the reverse finishing-time order of the first DFS run. Here we use the fact that if we start the DFS somewhere in a sink component then we traverse exactly that component. In the resulting DFS forest, the vertices of every tree form a strongly connected component. For a G-coloring of the decision tree, we color every edge with label 0 black and every other edge red, so that G ≤ 2n. The depth of the decision tree is T = n². Therefore, the quantum query complexity of this problem is O(n^{3/2}).
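The two-pass algorithm of part (iii) can be sketched as follows; the first DFS runs on G^R (each query to G^R is the transposed query to G), and the second runs on G in reverse finishing order, so that each DFS tree of the second pass is exactly one strongly connected component. Names are ours.

```python
def strongly_connected_components(adj):
    """Two-pass SCC computation, as in the proof of part (iii)."""
    n = len(adj)

    def dfs(u, graph, seen, out):
        seen[u] = True
        for v in range(n):
            if graph[u][v] and not seen[v]:
                dfs(v, graph, seen, out)
        out.append(u)                      # record finishing order

    # First pass: DFS on G^R; a query (u, v) in G^R is (v, u) in G.
    reverse = [[adj[v][u] for v in range(n)] for u in range(n)]
    seen, finish = [False] * n, []
    for u in range(n):
        if not seen[u]:
            dfs(u, reverse, seen, finish)

    # Second pass: DFS on G in reverse finishing order of the first pass.
    seen, components = [False] * n, []
    for u in reversed(finish):
        if not seen[u]:
            comp = []
            dfs(u, adj, seen, comp)
            components.append(sorted(comp))
    return components

# 0 -> 1 -> 2 -> 0 is a cycle; vertex 3 hangs off it via 2 -> 3.
adj = [[0,1,0,0],[0,0,1,0],[1,0,0,1],[0,0,0,0]]
print(strongly_connected_components(adj))   # [[3], [0, 1, 2]]
```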
The following corollary is a simple consequence of Corollary 8.
Corollary 11. The quantum query complexity of every graph property of a general graph in the adjacency matrix model is O(n√|E(G)|), which is faster than the trivial algorithm when |E(G)| = o(n²). In particular, every sparse graph property in the adjacency matrix model has quantum query complexity O(n^{3/2}).
The fact that any sparse graph property (in particular, any minor-closed graph property) has quantum query complexity O(n^{3/2}) was proven in [CK].

Graph properties in the adjacency list model
In this subsection we present some bounds on the quantum query complexity of some graph properties when the underlying graph is given in the adjacency list model. Let us first describe what we mean by this model.
In the adjacency list model we assume that the graph is given by an array of size n(n − 1), which for simplicity we think of as a matrix of size n × (n − 1). The j-th row of this matrix is a list of the neighbors of the j-th vertex v_j of the graph. Assume that v_j has degree d_{v_j}. Then the first d_{v_j} coordinates of the j-th row contain the indices of the neighbors of v_j (in some order), and the last n − 1 − d_{v_j} coordinates are filled with a nil symbol. See Figure 2 for an example. A query in the adjacency list model corresponds to a pair (v_j, i) with i ≤ n − 1. If i ≤ d_{v_j}, the output of this query is the i-th neighbor of v_j in G; if i > d_{v_j}, the output is nil. This model can be defined for directed graphs similarly; the only difference is that the j-th row of the matrix contains the vertices that can be reached from v_j by a directed edge.
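A minimal sketch of the query table of the adjacency list model (the names and the choice of None as the nil symbol are ours):

```python
def adjacency_list_oracle(neighbors, n):
    """Build the n x (n-1) query table of the adjacency list model.

    neighbors[j] is the neighbor list of vertex j; entries past the
    degree of j are filled with the nil symbol (here: None)."""
    table = []
    for j in range(n):
        row = list(neighbors[j]) + [None] * (n - 1 - len(neighbors[j]))
        table.append(row)
    return table

def query(table, v, i):
    """Answer the query (v, i): the i-th neighbor of v (1-indexed), or nil."""
    return table[v][i - 1]

# A triangle on vertices 0, 1, 2 plus an isolated vertex 3.
table = adjacency_list_oracle({0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}, 4)
print(query(table, 0, 1), query(table, 0, 3))   # 1 and nil (None)
```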
In the following we will use the BFS algorithm in the adjacency list model (see Algorithm 3) as a primitive to use Theorem 4. In the decision tree T associated to this BFS algorithm, each node (query) corresponds to a pair (v, i). The set of possible answers to such a query is the vertex set of G, which we partition as follows. We let W(v, i) be the set of vertices that have been added to the BFS tree before querying (v, i). The point is that the BFS algorithm is ignorant of the exact answer to the query (v, i) once it makes sure that the answer belongs to W(v, i) (see Figure 2 for an example). Thus in the decision tree T we identify the outgoing edges of (v, i) with labels in W(v, i); all the other outgoing edges remain untouched. Now the G-coloring of T is as follows: we color the outgoing edge of (v, i) with label W(v, i) black, and the rest of the outgoing edges red. We note that there are n vertices to be added to the BFS tree one by one, and we face a red edge once we add a new vertex or read a nil. Then in total we see at most G = O(n) red edges in every path from the root to a leaf of T. Also the total number of queries in the BFS algorithm equals the number of edges of G, denoted m = |E(G)|, plus n; this is because, as we do not know the degrees of the vertices, we stop querying the neighbors of a vertex only after seeing a nil symbol, which adds an extra query for every vertex. Thus T = m + n, and the quantum query complexity of finding the BFS tree in the adjacency list model is O(√(GT)) = O(√((m + n)n)).

Having query access to the adjacency list of a directed graph G, it has been proved in [DHHM06] that finding a minimum spanning tree of G has quantum query complexity O(√(mn)). Using minimum spanning trees one can show that checking directed st-connectivity and graph bipartiteness have quantum query complexity O(√(mn)) in the adjacency list model. Lin and Lin [LL16] proved the upper bound of O(n^{7/4}) for the problem of maximum bipartite matching in the adjacency matrix model.
Here using their ideas we prove the first non-trivial upper bound for this problem in the adjacency list model.
Proof. (i) To find a shortest path we run the BFS algorithm in the adjacency list model starting from the vertex s. Then, if s and t are connected, the resulting spanning forest connects them through a shortest path. As discussed before, the quantum query complexity of finding this BFS spanning forest is O(√((m + n)n)). Thus a shortest path between s and t can be found with O(√((m + n)n)) quantum queries.
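The BFS primitive in the adjacency list model, together with the two counters analyzed above, can be sketched as follows; the guess for each query (v, i) is "some already-discovered vertex", so mistakes are charged only for new vertices and nils (names are ours):

```python
from collections import deque

def bfs_adjacency_list(table, n):
    """BFS in the adjacency list model; table is the n x (n-1) query table
    with None as the nil symbol.  Returns the BFS forest and the counts
    T (queries) and G (mistakes: a new vertex or a nil)."""
    queries = mistakes = 0
    seen = [False] * n
    parent = [None] * n
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for i in range(1, n):          # query (v, 1), (v, 2), ...
                queries += 1
                answer = table[v][i - 1]
                if answer is None:
                    mistakes += 1          # red edge: the row has ended
                    break
                if not seen[answer]:
                    mistakes += 1          # red edge: a new vertex
                    seen[answer] = True
                    parent[answer] = v
                    queue.append(answer)
    return parent, queries, mistakes

# Directed path 0 -> 1 -> 2: m = 2 edges, n = 3 vertices.
table = [[1, None], [2, None], [None, None]]
parent, T, G = bfs_adjacency_list(table, 3)
print(parent, T, G)   # T = m + n = 5 queries, G = 5 <= 2n mistakes
```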
(ii) In the classical algorithm for this problem we start by finding a spanning forest of G by running the BFS Algorithm 3. We then color the vertices of G using the resulting spanning forest S with two colors, blue and green: every vertex of G with even depth in S is colored blue, and every vertex with odd depth in S is colored green. Then we search for two adjacent vertices in G with the same color. If we find such an edge, the graph is not bipartite; otherwise it is bipartite. The G-coloring of the associated decision tree T is as follows. In the first part, where we run the BFS algorithm, the G-coloring is as before. In the second part, where we search for an edge between two vertices of the same color, we partition the set of possible answers (vertices of G) into two parts: the set of blue vertices and the set of green vertices. As we query (v, i), i.e., the i-th neighbor of v in G, the two outgoing edges associated with this query, labeled by the sets of blue and green vertices, are colored as follows: if v is blue, the outgoing edge of blue vertices is colored red and the other one black; if v is green, the outgoing edge of green vertices is colored red and the other one black. Observe that in the second part of the algorithm, once we see a red edge of T the algorithm halts (and G is not bipartite). Thus in total we see at most G = n red edges in any path from the root to a leaf of T. On the other hand, the depth of the decision tree is T = m + n. Therefore, the quantum query complexity of this problem is O(√((m + n)n)).
(iii) The classical algorithm for this problem iteratively increases the size of a matching M using augmenting paths: paths whose two end edges are not in M and which alternate between edges of the graph that belong to M and edges that do not. Swapping the edges of such a path between being in M and not being in M increases the size of the matching by one. However, instead of finding just one augmenting path in each iteration, the algorithm finds a maximal set of shortest vertex-disjoint augmenting paths. After only O(√n) iterations the maximum matching is found. Since all queries to the input are made inside calls to the BFS algorithm, the G-coloring of the associated decision tree is as for the BFS algorithm. There are O(√n) calls to the BFS algorithm (Line 2 in Algorithm 4 repeats O(√n) times), so we have G = n√n, while the depth of the decision tree is T = m + n, where the n extra queries are for the nils. Therefore, the quantum query complexity of this problem is O(n^{3/4}√(m + n)).
(iv), (v) The algorithms are similar to those of Proposition 10 and the G-coloring is as above, so we skip the details.

Concluding remarks
In this paper we generalized a result of [LL16], which is a method for designing quantum query algorithms based on classical ones. Our generalization of [LL16] is twofold: first, we allow the input alphabet of the function to be non-binary; second, we allow the outgoing edges of a vertex of a decision tree to be indexed by subsets in a partition of the input alphabet set. These two generalizations make it possible to use this method, in particular, for graph properties in the adjacency list model. Our proof of this generalization is based on span programs in the non-binary case as well as the dual adversary bound.
Let us now review the different approaches to proving Lin and Lin's results in [LL16] as well as Theorems 4 and 5: • The first idea in [LL16] is to use the notion of bomb query complexity, which we did not discuss here. It is an interesting question whether this idea can be extended to prove our generalized results (Theorems 4 and 5).
• The second idea in [LL16] is to use a modified version of Grover's search to find the mistakes of the guessing algorithm. However, a naive application of Grover's search here results in an extra logarithmic factor for error reduction. It is shown in [LL16] that for functions with binary inputs this undesired factor can be eliminated using properties of the so-called γ_2 norm. It seems plausible that the first part of Theorem 4 is provable by the same technique. However, it is not clear to us whether the second part of Theorem 4 or Theorem 5 is achievable along the same path.
• The third idea is to use the notion of non-binary span programs, as we did in our proof of Theorem 2. The idea is to use an "st-connectivity type" span program (taken from [BR]) in order to get from the root of a decision tree to some leaf. However, in order not to end up with the trivial upper bound of T (the depth of the decision tree) on the quantum query complexity, we equipped the edges of the decision tree with weights chosen based on a G-coloring. Incorporating these weights into the span program, we obtained the desired result.
• The last idea is to use the dual adversary bound. This approach is essentially the same as the span program approach, but with the advantage that it does not incur the undesirable extra factor of √(ℓ − 1) explained before. Compared to the first two methods, we believe that the span program and dual adversary approaches are more advantageous since they leave the choice of weights free. For proving Theorem 4 the weights we chose were among two possible choices. In Theorem 5 we then showed how a larger set of weights may further improve the upper bound on the quantum query complexity. Thus, a possible direction for extending our results is to further investigate other possible choices of the weights.
One may suggest that our generalized non-binary version of the result of [LL16] can be obtained simply by representing the non-binary inputs of f : [ℓ]^n → [m] by binary strings, simulating a single non-binary query by log(ℓ) binary ones, and then using the result of [LL16] in the binary case. Even ignoring the extra log(ℓ) factor incurred by this method, we argue that this approach does not work. First, in our notion of generalized decision tree we allow identifying some edges of the decision tree and labeling its edges with subsets of [ℓ]. This is missing in the notion of decision tree in [LL16]. Identification of edges is a necessary ingredient of our results, especially in the examples of graph properties in the adjacency list model. To elaborate on the second limitation of this approach, consider the example of minimum finding explained in Proposition 7. Suppose that ℓ = 8 and at some stage of the algorithm our candidate for the minimum is 6, which equals (1, 1, 0) in binary representation. Then we read the first bit of the next number in the list and find it to be 1. This means that the next number in the list is one of 4, 5, 6 or 7. In the algorithm and its associated G-coloring, there is a difference between 4, 5 and 6, 7, since the first two are smaller than 6. Indeed, in our proposed G-coloring the edges with labels 6 and 7 are merged into a single black edge, while the edges with labels 4 and 5 are colored red. Therefore, to convert this coloring to a G-coloring of the binary decision tree whose edges are labeled by single bits, we have no choice but to color the edge with label 1 red. Then the parameter G of the new G-coloring not only scales by a factor of log ℓ, but also increases by something like T − G because of such extra red edges.
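The arithmetic in this example is easy to check with a small sketch (the helper function is ours):

```python
def covered_by_prefix(prefix_bits, width=3):
    """Numbers of the given bit-width whose binary encoding starts with
    the given prefix: the answers still possible after reading the prefix."""
    free = width - len(prefix_bits)
    base = int("".join(map(str, prefix_bits)), 2) << free
    return list(range(base, base + (1 << free)))

# After one binary query (first bit = 1), the algorithm only knows the
# number lies in {4, 5, 6, 7}; the G-coloring cannot yet separate the
# red labels 4, 5 from the black labels 6, 7.
print(covered_by_prefix([1]))      # [4, 5, 6, 7]
print(covered_by_prefix([1, 1]))   # [6, 7]
```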
In summary, in order to use the result of [LL16] in the binary case to prove our generalized result in the non-binary case, we would need to convert a G-coloring of a generalized non-binary decision tree to a G-coloring of a binary decision tree. It is not clear how this can be done in general without drastically weakening the bound on the parameter G.
Our results give bounds on the space complexity of our algorithms as well. The point is that the space complexity of a quantum algorithm based on the dual adversary bound is bounded by the logarithm of the dimension of the vectors in the feasible solution of the dual adversary SDP. In our proofs the dimension of such feasible solutions is of the order of the size of the decision tree. Thus the space complexity of our algorithms equals the logarithm of the size of the corresponding decision tree.
In particular, since in our examples (especially those concerning graph properties) the sizes of the decision trees are exponential, the space complexity of our quantum algorithms is linear. Prior to our work, designing a span-program-based quantum query algorithm for directed graphs was not an easy task; we eased the process of designing such algorithms by relating them to classical decision trees. Compared to span programs for undirected graphs, however, the size of these span programs for directed graphs is exponential. It would be interesting to see whether the space complexity of such quantum algorithms can be decreased to logarithmic.