A distribution testing oracle separation between QMA and QCMA

It is a long-standing open question in quantum complexity theory whether the definition of non-deterministic quantum computation requires quantum witnesses (QMA) or if classical witnesses suffice (QCMA). We make progress on this question by constructing a randomized classical oracle separating the respective computational complexity classes. Previous separations [Aaronson-Kuperberg (CCC'07), Fefferman-Kimmel (MFCS'18)] required a quantum unitary oracle. The separating problem is deciding whether a distribution supported on regular undirected graphs either consists of multiple connected components (YES instances) or consists of one expanding connected component (NO instances), where the graph is given in an adjacency-list format by the oracle. Therefore, the oracle is a distribution over n-bit boolean functions.


Introduction
There are two natural quantum analogs of the computational complexity class NP. The first is the class QMA, in which a quantum polynomial-time decision algorithm is given access to a poly(n)-qubit quantum state as a witness for the statement. This class is captured by the QMA-complete local Hamiltonian problem [18], in which the quantum witness can be interpreted as the ground state of the local Hamiltonian. The second is the class QCMA, in which the quantum polynomial-time decision algorithm is instead given access to a poly(n)-bit classical string. While it is easy to prove that QCMA ⊆ QMA, as the quantum witness state can be immediately measured to yield a classical witness string, the question of whether QCMA ?= QMA, first posed by Aharonov and Naveh [4], remains unanswered. If QCMA = QMA, then every local Hamiltonian would have an efficient classical witness of its ground energy; morally, this can be thought of as an efficient classical description of its ground state. The relevance of local Hamiltonians to condensed matter physics makes this a central open question in quantum complexity theory [2]. In the setting of bounded-depth quantum-classical circuits, [8] introduced a related notion called a "stochastic oracle"; the main difference between this and our model is that a stochastic oracle resamples an instance every time it is queried.
Comparison with previous oracle separations between QMA and QCMA Figure 1 summarizes our work in relation to previous oracle separations. In terms of results, we take a further step towards the standard oracle model: all that remains is to remove the randomness from our oracle. In terms of techniques, we combine the use of counting arguments and the adversary method from previous works with a BQP lower bound for a similar graph problem, due to [6]. This lower bound was shown using the polynomial method. We view the judicious combination of these lower bound techniques, as simple as it may seem, as one of the conceptual contributions of this paper.

Intuition for hardness
The expander distinguishing problem is a natural candidate for a separation between QMA and QCMA because it is an "oracular" version of the sparse Hamiltonian problem, which is complete for QMA [10]. To see this, we recall some facts from spectral graph theory. The top eigenvalue of the normalized adjacency matrix A of a regular graph is always 1, and the uniform superposition over vertices is always an associated eigenvector. If the graph is an expander (the NO case of our problem), the second eigenvalue is bounded away from 1, but if the graph is disconnected (the YES case of our problem), then the second eigenvalue is exactly 1. Thus, our oracle problem is exactly the problem of estimating the minimum eigenvalue of I − A (a sparse matrix for a constant-degree graph) on the subspace orthogonal to the uniform superposition state. Viewing I − A as a sparse Hamiltonian, we obtain the connection between our problem and the sparse Hamiltonian problem.
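These spectral facts are straightforward to verify numerically. The following sketch (our own illustration, not code from the paper) builds small d-regular graphs as unions of random perfect matchings, in the spirit of the constructions used later, and compares the second eigenvalue of the normalized adjacency matrix in the connected and disconnected cases:

```python
# Second eigenvalue of the normalized adjacency matrix: bounded away from 1
# for a (random, whp expanding) connected regular graph; exactly 1 when the
# graph is disconnected.
import numpy as np

rng = np.random.default_rng(0)

def normalized_adjacency(n, d, components=1):
    """Union of d random perfect matchings on each of `components`
    disjoint blocks of n vertices; returns the normalized adjacency matrix."""
    N = n * components
    A = np.zeros((N, N))
    for c in range(components):
        for _ in range(d):
            perm = rng.permutation(np.arange(c * n, (c + 1) * n))
            for i in range(0, n, 2):        # pair up vertices: one matching
                A[perm[i], perm[i + 1]] += 1
                A[perm[i + 1], perm[i]] += 1
    return A / d

evals_conn = np.sort(np.linalg.eigvalsh(normalized_adjacency(200, 4)))[::-1]
evals_disc = np.sort(np.linalg.eigvalsh(normalized_adjacency(100, 4, components=2)))[::-1]

print("second eigenvalue, connected:   ", round(evals_conn[1], 3))  # < 1 whp
print("second eigenvalue, disconnected:", round(evals_disc[1], 3))  # exactly 1
```

The top eigenvalue is 1 in both cases (the uniform superposition); only the second eigenvalue distinguishes the two cases, which is exactly the quantity the oracle problem asks about.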
One reason to show oracle separations between two classes is to provide a barrier against attempts to collapse the classes in the "real" world. We interpret our results as confirming the intuition that any QCMA protocol for the sparse Hamiltonian problem must use more than just black-box access to entries of the Hamiltonian: it must use some nontrivial properties of the ground states of these Hamiltonians. In this sense, it emulates the original quantum adversary lower bound of [9], which showed that any BQP algorithm for solving NP-complete problems must rely on some inherent structure of the NP-complete problem, as BQP algorithms cannot solve unstructured search efficiently.
Naturalness of the randomized oracle model Some care must be taken whenever one proves a separation in a "nonstandard" oracle model; see for instance the "trivial" example in [1] of a randomized oracle separating MA_1 from MA. We believe that our randomized oracle model is natural for several reasons. Firstly, as mentioned above, randomization was used in the quantum oracle of [13] for essentially the same reason: to impose a restriction on the witnesses received from the prover. Secondly, it is consistent with our knowledge that our oracle separates QMA from QCMA even when the randomness is removed (and indeed we conjecture this is the case, as described below). Thirdly, the randomization still gives the prover access to substantial information about the graph: in particular, the prover knows the full connected component structure of the graph. As we show, this information is enough for the prover to give a quantum witness state that, in the YES case, convinces the verifier with certainty. Our result shows that even given full knowledge of the component structure, the prover cannot construct a convincing classical witness; we believe this sheds light on how a QMA witness can be more powerful than a QCMA witness.

Overview of proof techniques
Quantum witnesses and containment in oracular QMA A quantum witness for any YES instance graph is any eigenvector |ξ⟩ of eigenvalue 1 that is orthogonal to the uniform superposition over vertices. The verification procedure is simple: project the witness onto the subspace orthogonal to the uniform superposition over vertices, and then perform one step of a random walk along the graph by querying the oracle for the adjacency matrix in superposition. Verify that the state after the walk step equals |ξ⟩. This is equivalent to a 1-bit phase estimation of the eigenvalue. If a graph is a NO instance, then there does not exist any vector orthogonal to the uniform superposition (the unique eigenvector of eigenvalue 1) that would pass this test.
Whenever the graph has a connected component S ⊊ V, an eigenvector of eigenvalue 1 orthogonal to the uniform superposition exists. When |S| ≪ N, this eigenvector is very close to |S⟩, the uniform superposition over basis vectors x ∈ S. Notice that this state depends only on the connected component S and not on the specific edges of the graph. Furthermore, the state |S′⟩ for any subset S′ that approximates S forms a witness that is accepted with high probability.
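Both claims can be checked directly on a small example. In this sketch (our own illustration, using an assumed matching-based graph), a 20-vertex component S in a 200-vertex graph satisfies A|S⟩ = |S⟩ exactly, and the part of |S⟩ orthogonal to the uniform superposition has squared norm 1 − |S|/N:

```python
# |S> is an exact eigenvalue-1 eigenvector when S is a connected component of
# a regular graph, and it is (1 - |S|/N)-close to the orthogonal complement of
# the uniform superposition.
import numpy as np

rng = np.random.default_rng(1)

def add_matchings(A, verts, d, rng):
    """Add d random perfect matchings on the (even-sized) block `verts`."""
    for _ in range(d):
        perm = rng.permutation(verts)
        for i in range(0, len(verts), 2):
            A[perm[i], perm[i + 1]] += 1
            A[perm[i + 1], perm[i]] += 1

d, n_small, n_large = 4, 20, 180
N = n_small + n_large
A = np.zeros((N, N))
add_matchings(A, np.arange(n_small), d, rng)        # the component S
add_matchings(A, np.arange(n_small, N), d, rng)     # the rest of the graph
A /= d

S = np.zeros(N); S[:n_small] = 1 / np.sqrt(n_small) # subset state |S>
u = np.ones(N) / np.sqrt(N)                         # uniform superposition

assert np.allclose(A @ S, S)                        # exact eigenvalue-1 vector
perp = S - (u @ S) * u                              # project out |u>
print("squared norm orthogonal to |u>:", perp @ perp)  # equals 1 - |S|/N
```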

Lower bound on classical witnesses
The difficulty in this problem lies in proving a lower bound on the ability of classical witnesses to distinguish YES and NO instances. To prove a lower bound, we argue that any quantum algorithm with access to a polynomial-length classical witness must make an exponential number of (quantum) queries to the adjacency list of the graph in order to distinguish YES and NO instances. This, in turn, lower bounds the time complexity of any QCMA algorithm distinguishing YES and NO instances, but is actually slightly stronger since we do not consider the computational complexity of the algorithm between queries.
Proving lower bounds when classical witnesses are involved is difficult because the witness could be based on any property of the graph. For example, the classical witness could describe cycles, triangles, etc. contained in the graph; while it isn't obvious why such a witness would be helpful, proving that any such witness is insufficient is a significant challenge. One way to circumvent this difficulty is to first show a lower bound assuming some structure on the witness, and then "remove the training wheels" by showing that the assumption holds for any good classical witness.
Lower bound against "subset witnesses" One structure we can assume is that the witness depends only on the set of vertices contained in the connected component S. This is certainly the case for the quantum witness state in eq. (24). Our result shows that any polynomial-length witness depending only on the vertices in S requires an exponential query complexity to distinguish YES and NO instance graphs.
The starting point for this statement is the exponential query lower bound in the absence of a witness (i.e. for BQP) for the expander distinguishing problem, proven by Ambainis, Childs, and Liu [6] using the polynomial method. In [6], the authors define two distributions over constant-degree regular colored graphs: the first is a distribution P_1 over random graphs which, with overwhelming probability, have second normalized eigenvalue at most 1 − ϵ_0. The second is a distribution P_ℓ over random graphs which, with overwhelming probability, have ℓ connected components. Since almost all graphs in P_1 are NO graphs and all graphs in P_ℓ are YES graphs, any algorithm distinguishing YES and NO instances must be able to distinguish the two distributions. We first show that a comparable query lower bound still holds even when the algorithm is given a witness consisting of polynomially many random points F from any one connected component.
Next, we show that if there were a QCMA algorithm whose optimal witness depends only on the set of vertices S in one of the connected components, then, by a counting argument, there must exist a combinatorial sunflower of subsets S that correspond to the same witness string. A sunflower, in this context, is a collection of subsets such that each subset contains a core F ⊂ V and every vertex of V \ F occurs in a small fraction of the subsets. This implies that there exists a BQP algorithm which distinguishes YES instances corresponding to the sunflower from all NO instances. Next, we show, using an adversary bound [5], that a quantum query algorithm cannot distinguish the distribution of YES instances corresponding to the sunflower from the uniform distribution over YES instances such that the core F is contained in a connected component (the ideal sunflower).
This indistinguishability, along with the previous polynomial-method-based lower bound, proves that any QCMA algorithm for the expander distinguishing problem whose witness depends only on the vertices in the connected component must make an exponential number of queries to the graph.
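The pigeonhole flavor of the counting step can be illustrated with a toy computation (ours, with arbitrary parameters, not the paper's): when there are far more candidate sets than witness strings, some witness string is shared by a large family of sets, and in a random such family every individual vertex appears only rarely, which is exactly the sunflower structure (here with an empty core):

```python
# Toy counting argument: exponentially many sets, polynomially many witnesses
# -> a large same-witness family in which every vertex is a rare "petal".
import random
from collections import Counter

random.seed(0)
V, set_size, num_sets, num_witnesses = 10_000, 100, 50_000, 64

families = {}
for _ in range(num_sets):
    S = frozenset(random.sample(range(V), set_size))
    w = random.randrange(num_witnesses)   # stand-in for the "optimal witness" map
    families.setdefault(w, []).append(S)

# Pigeonhole: some witness string is shared by at least num_sets/num_witnesses sets.
family = max(families.values(), key=len)
counts = Counter(v for S in family for v in S)
core = {v for v, c in counts.items() if c > len(family) // 2}  # frequent vertices
max_freq = max(counts.values()) / len(family)

print(f"family size {len(family)}, core size {len(core)}, "
      f"max vertex frequency {max_freq:.3f}")
```

In the actual proof the sets are connected components and the witness map is adversarial rather than random, but the same dichotomy drives the argument: either a vertex is frequent enough to join the core F, or it occurs in a small fraction of the family.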
Removing the restriction over witnesses Our proof, thus far, has required the restriction that the witness depend only on the vertices in the connected component. In some sense, this argues that there is an oracle separation between QMA and QCMA if the prover is restricted to being "near-sighted": it cannot see the intricacies of the edge structure of the graph, but can notice the separate connected components of the graph. If the near-sighted prover is capable of sending quantum states as witnesses, then she can still aid a verifier in deciding the expander distinguishing problem, whereas if she can only send classical witnesses, then she cannot.
It now remains to remove the restriction that the witness can only depend on the vertices in the connected component. We do this by introducing randomness into the oracle, precisely designed to "blind" the prover to the local structure of the graph. In the standard oracle setting, the verifier and prover both get access to an oracle x ∈ {0, 1}^N, and the prover provides either a quantum witness |ξ(x)⟩ ∈ (C^2)^{⊗poly(n)} or a classical witness ξ(x) ∈ {0, 1}^{poly(n)}. The verifier then runs an efficient quantum algorithm V_x which takes as input |ξ(x)⟩ (or ξ(x), respectively) and consists of quantum oracle gates applying the unitary defined as the linear extension of |i⟩ ↦ (−1)^{x_i} |i⟩. We now modify this setup slightly. Instead of a single oracle x, we consider a distribution B over oracles. The prover constructs a quantum witness |ξ(B)⟩ (or a classical witness ξ(B), respectively) based on the distribution B. The verifier then samples a classical oracle x ← B from the distribution, and runs the verification procedure V_x, which takes as input |ξ(B)⟩ (or ξ(B), respectively) and applies quantum oracle gates corresponding to x.
The success probability of the verifier is taken over the distribution B and the randomness in the verification procedure. From our previous observations, graphs with the same connected component S have the same ideal witness state (given in eq. (24)). So, if the distribution B is supported on all graphs with the same connected component S, then the witness state from eq. (24) suffices. Furthermore, in the case of the classical witness system, the witness can only depend on S, and the previously stated lower bound applies. This motivates the oracle problem of distinguishing distributions, marked either YES or NO, over 2^n-bit strings (or equivalently n-bit boolean functions).

Statement of the result
Theorem 1. For every sufficiently large integer n that is a multiple of 200, there exist distributions over 100-regular 100-colored graphs on N = 2^n vertices, labeled either YES or NO, such that:
• Each YES distribution is entirely supported on YES instances of the expander distinguishing problem and, likewise, each NO distribution is entirely supported on NO instances of the expander distinguishing problem.
• There exists a poly(n)-time quantum algorithm V_q taking a witness state |ξ⟩ as input and making O(1) queries to the quantum oracle such that:
1. For every YES distribution B, there exists a quantum witness |ξ⟩ ∈ (C^2)^{⊗n} such that
2. For every NO distribution B, for all quantum witnesses |ξ⟩ ∈ (C^2)^{⊗n},
• Any quantum algorithm V_c accepting a classical witness of length q(n) and satisfying the following two criteria either requires q(n) to be exponential or must make an exponential number of queries to the oracle.
1. For every YES distribution B, there exists a classical witness ξ ∈ {0, 1}^{q(n)} such that
2. For every NO distribution B, for all classical witnesses ξ ∈ {0, 1}^{q(n)},
Although our main theorem is formulated as a query lower bound, it can be converted to a separation between the relativized classes QMA and QCMA via a standard diagonalization argument. Similarly, it was pointed out to us [14] that it proves a separation between the relativized classes BQP/qpoly and BQP/poly, following the technique of [3].

Implications and future directions
There are several future questions raised by this work that we find interesting:

Oracle and communication separations
The most natural question is, of course, whether the oracle's randomness can be removed to obtain a separation in the standard model. We conjecture that our problem yields such a separation, but a new technique seems necessary to prove it. See Section 9 for more details on the technical barriers to derandomizing our construction. Another natural question is to show a communication complexity separation between QMA and QCMA. This has been shown for one-way communication by Klauck and Podder [19], but their problem does not yield a separation for two-way communication. Could our query separation be lifted to the communication world by use of an appropriate gadget?
The class QMA(2) is another relative of QMA which is perhaps even more enigmatic than QCMA. In QMA(2), the witness state is promised to be unentangled between the first and second halves of the qubits. We do not even know of a quantum (unitary) oracle separation between QMA(2) and QMA, nor do we have a natural candidate problem. Could we at least formulate such a candidate by considering "oracular" versions of QMA(2)-complete problems, in analogy to what we do in this work for QCMA?

Search-to-decision
In [16], Irani, Natarajan, Nirkhe, Rao, and Yuen studied the complexity of generating a witness to a QMA problem (equivalently, generating a ground state of a local Hamiltonian) when given oracle access to a QMA oracle. This paradigm, called search-to-decision, is commonplace in classical complexity theory (for example, P, NP, MA, etc. all have search-to-decision reductions), and yet [16] gives evidence that QMA likely does not exhibit a search-to-decision reduction. They prove this by exhibiting an oracle relative to which QMA search-to-decision reductions are provably impossible. The oracle used is identical to that used by Aaronson and Kuperberg [3] to separate QMA and QCMA. [16] acknowledges this non-coincidence and conjectures that any oracle separating QMA and QCMA yields a QMA search-to-decision impossibility result. Similar to the reasons why the gold standard for an oracle separation between QMA and QCMA is an n-bit boolean function, the ideal oracle for proving QMA search-to-decision impossibility is also an n-bit boolean function. Does the oracle presented here also yield a search-to-decision impossibility?
Implications for Quantum PCPs The quantum PCP conjecture [4] is one of the biggest open questions in quantum complexity theory. In a recent panel [25] on the quantum PCP conjecture and the NLTS theorem [7], an interesting question was posed: can MA or QCMA (lower or upper) bounds be placed on the complexity of the promise-gapped local Hamiltonian problem? We recommend [24] for an introduction to the subject. Because the oracle presented in this result corresponds to a sparse Hamiltonian, with the problem of deciding whether the second eigenvalue of the Hamiltonian is 1 or < 1 − α/d = 1 − Ω(1), one might wonder if this provides oracular evidence that quantum PCPs are at least QCMA-hard. Unfortunately, to the best of our knowledge, this is not a reasonable conclusion. While we give evidence that the promise-gapped sparse Hamiltonian problem is likely QCMA-hard, the reduction from the sparse Hamiltonian problem to the local Hamiltonian problem does not imply that the promise-gapped local Hamiltonian problem is likely QCMA-hard. The only algorithm known for checking a witness for the sparse Hamiltonian problem is Hamiltonian simulation on the witness, which is not a local algorithm.

Connections to Stoquastic Hamiltonians
Since the oracles studied in this work correspond to the adjacency lists of graphs, they can be viewed as sparse access to a Hamiltonian H which is the Laplacian of a graph (recall that if the adjacency matrix is A, then the normalized Laplacian is I − A/d). Such Hamiltonians have a special structure not present in general Hamiltonians: they are stoquastic, meaning that the off-diagonal entries are nonpositive. The local Hamiltonian (LH) problem for stoquastic Hamiltonians is significantly easier than the general LH problem, and in some cases is even contained in MA, as shown by Bravyi and Terhal [11]. It is worth examining why this is not in tension with our result; in particular, why this does not imply that our oracle problem is contained in oracular MA.
• Crucially, the MA containment for stoquastic LH holds only for the ground state: this is because of the Perron-Frobenius theorem, which implies that ground states of such Hamiltonians have nonnegative coefficients. However, in our case, we want the first excited state: the state of minimum energy for H restricted to the subspace orthogonal to the uniform superposition. It was shown by [17] that all excited-state energies of a stoquastic Hamiltonian are QMA-hard to calculate.
• The MA containment also uses the locality of the Hamiltonian, which in turn imposes a strong structure on the adjacency matrix of the graph. The random graphs we consider will not have this structure. (While [12] showed an AM algorithm for calculating the ground energy of stoquastic sparse Hamiltonians, again this does not apply to higher excited states.)
• At an intuitive level, in graph language, the LH problem for stoquastic Hamiltonians is to find a component of the graph where the average value of some potential function (given by the diagonal entries of H) is minimized. An MA verifier can solve this by executing a random walk, given the right starting point by Merlin. In contrast, our problem is to determine whether the graph as a whole is connected, a global property which an MA verifier cannot determine.

Organization of the paper
The remainder of the paper is the proof of Theorem 1. The proof is divided into smaller components, and these intermediate results are joined together in Section 8. In Section 3, we state some basic definitions and formally define the expander distinguishing problem.
In Section 4, we describe the distributions over graphs that constitute YES and NO instances. In Section 5, we prove that there is an efficient QMA algorithm for the expander distinguishing problem; in particular, there is a single quantum witness that serves all the graphs in each of the YES distributions. In Section 6, we use the adversary method and counting arguments to prove that any QCMA algorithm for the expander distinguishing problem on the constructed distributions implies a BQP algorithm for distinguishing YES instances with a connected component corresponding to an ideal sunflower from a generic NO instance. In Section 7, we argue using the polynomial method that such an algorithm is impossible without an exponential query complexity. In Section 9, we present some concluding remarks about our construction and its relation to other notions of computational complexity. Appendices A and B consist of omitted proofs.

Notation and quantum information basics
We will assume that the reader is familiar with the basics of quantum computing and quantum information. We will use N def= 2^n throughout this paper, and we will only consider graphs on N vertices. The adjacency list of a d-regular d-colored graph on N vertices takes dnN bits to describe. For any m, we abbreviate the set of integers {1, 2, . . . , m} as [m]. For a set A ⊆ [N], we use |A⟩ to denote the state (1/√|A|) Σ_{j∈A} |j⟩, the subset state corresponding to A. Unless otherwise specified, we assume ∥·∥ is the Euclidean norm ∥·∥_2 for a vector, and the spectral norm (the largest singular value) for a matrix.

Expander graphs
Definition 2. A graph G is a spectral α-expander (equivalently, is α-expanding) if the second highest eigenvalue λ_2 of the normalized adjacency matrix of G satisfies λ_2 ≤ 1 − α. We say that a connected component S of the graph is α-expanding if the graph restricted to the vertices of S is α-expanding.

Lemma 3. Let G be a d-regular α-expander. Consider the random walk that starts in any distribution over the vertices and, at each time step, stays in place with probability 1/2 and moves along a uniformly random edge of the graph with probability 1/2. Then for any vertex v, after ℓ steps, the probability Pr[v] that the walk is at v satisfies |Pr[v] − 1/N| ≤ (1 − α/2)^ℓ. In particular, when ℓ = O(c log N/α), we can make the right-hand side 1/N^c.
Proof. Let the normalized adjacency matrix of G be A, and let A′ = (1/2)(I + A) be the transition matrix of the random walk. If G is a d-regular α-expander, then (1/N)1 is the unique eigenvector of A (and A′) with eigenvalue 1. Moreover, since ∥A∥ ≤ 1, all eigenvalues of A′ are nonnegative. Since G is an α-expander, the second eigenvalue of A is at most 1 − α, and thus the second eigenvalue of A′ is at most 1 − α/2.
Let u be a vector representing a probability distribution over the vertices of G (i.e. u ∈ R^N_+ with ∥u∥_1 = 1), and let 1 be the N-dimensional all-ones vector. The statement we wish to prove is equivalent to ∥(A′)^ℓ u − (1/N)1∥_∞ ≤ (1 − α/2)^ℓ. Write u = (1/N)1 + δ. From the condition 1 = ∥u∥_1 = Σ_i u_i = 1 + Σ_i δ_i, it holds that ⟨δ, 1⟩ = 0. Then ∥(A′)^ℓ u − (1/N)1∥_∞ ≤ ∥(A′)^ℓ δ∥_2 ≤ (1 − α/2)^ℓ ∥δ∥_2 ≤ (1 − α/2)^ℓ. Setting this quantity equal to 1/N^c and solving for ℓ yields ℓ = O(c log N/α).
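The mixing rate claimed in Lemma 3 can be checked numerically. The following sketch (our own, on an assumed matching-based expander) runs the lazy walk A′ = (I + A)/2 from a single starting vertex and compares the deviation from uniform against the bound (1 − α/2)^ℓ:

```python
# Numerical check of Lemma 3: the lazy walk converges to uniform at rate
# (1 - alpha/2)^l, where alpha is the spectral gap of A.
import numpy as np

rng = np.random.default_rng(2)
N, d = 256, 4
A = np.zeros((N, N))
for _ in range(d):                     # union of d random perfect matchings
    perm = rng.permutation(N)
    for i in range(0, N, 2):
        A[perm[i], perm[i + 1]] += 1
        A[perm[i + 1], perm[i]] += 1
A /= d
Aprime = (np.eye(N) + A) / 2           # lazy walk transition matrix

lam2 = np.sort(np.linalg.eigvalsh(A))[-2]
alpha = 1 - lam2                       # spectral gap: A is an alpha-expander

u = np.zeros(N); u[0] = 1.0            # start concentrated on one vertex
for ell in [10, 40, 160]:
    p = np.linalg.matrix_power(Aprime, ell) @ u
    dev = np.max(np.abs(p - 1 / N))    # worst-case deviation from uniform
    bound = (1 - alpha / 2) ** ell
    print(f"l={ell:4d}  max |Pr[v]-1/N| = {dev:.2e}  <=  (1-a/2)^l = {bound:.2e}")
```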

Non-deterministic oracle problems
Likewise for QCMA^O. This justifies removing the constant ϵ from the definition. We now define the same problem for oracles that are distributions over n-bit boolean functions.

Definition 5 (Random classical oracles).
A random oracle R is a distribution over classical oracles {O}. We say an oracle decision problem L_R is in QMA^R(ϵ) if there exists a uniform family of quantum circuits A^O such that:
1. For every YES instance R, there exists a quantum state |ξ⟩ of poly(n) qubits such that
2. For every NO instance R, for all quantum states |ξ⟩ of poly(n) qubits,
QCMA^R(c, s) is defined similarly, except the state |ξ⟩ is promised to be classical.
Ideally, we would define the classes QMA^R and QCMA^R as QMA^R(1/3) and QCMA^R(1/3), respectively. However, the parallel repetition argument for boolean function oracles cannot be extended to distributions over boolean functions. This is because the error ϵ of an algorithm is an expectation of the algorithm's success probability over the distribution. It is possible that the algorithm errs with probability ϵ on every instance in the distribution, or it is possible that the algorithm succeeds with zero error on a 1 − ϵ fraction of the distribution and fails on the remaining ϵ fraction. In the first case, the success probability of the algorithm can be improved with parallel repetition, while in the second case it cannot.
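This dichotomy is a one-line calculation. The toy computation below (ours) contrasts the two error models at the same average error ϵ = 0.2: per-instance error is driven down exponentially by majority vote, while an always-failing ϵ-fraction of the distribution is untouched by repetition:

```python
# Why amplification fails for distribution oracles: two error models with the
# same average error eps behave differently under k-fold majority vote.
#  (a) every instance errs with probability eps  -> majority vote amplifies
#  (b) an eps-fraction of instances always errs  -> repetition cannot help
from math import comb

def majority_error(per_run_err, k):
    """Probability that more than half of k independent runs err."""
    return sum(comb(k, i) * per_run_err**i * (1 - per_run_err)**(k - i)
               for i in range(k // 2 + 1, k + 1))

eps, k = 0.2, 51
err_a = majority_error(eps, k)   # model (a): exponentially small
err_b = eps                      # model (b): stuck at eps forever
print(f"after {k} repetitions: model (a) error {err_a:.2e}, model (b) error {err_b}")
```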

Graph oracles
exists in G and is colored with color κ.

Definition 7 (Adjacency graph oracles). Let G be a d-colored d-regular undirected graph. The graph G = (V, E) can be described by an adjacency function G : [N] × [d] → [N], where the output G(j, κ) is the neighbor of j along the edge colored with κ. Quantum access to the function G is provided by the oracle unitary O_G : |j, κ, z⟩ ↦ |j, κ, z ⊕ G(j, κ)⟩. We call the function G the adjacency graph oracle corresponding to G.
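As a classical function, an adjacency graph oracle in this sense can be sketched as follows (our own illustration, with a matching per color): G(j, κ) returns the κ-colored neighbor of j, and valid coloring makes each color an involution, the property that later lets a verifier uncompute its oracle queries:

```python
# A d-colored d-regular graph stored as one perfect matching per color,
# exposed through the query interface G(j, kappa) of Definition 7.
import random

random.seed(3)
N, d = 16, 3

matchings = []                     # matchings[kappa][j] = kappa-colored neighbor of j
for _ in range(d):
    verts = list(range(N)); random.shuffle(verts)
    m = {}
    for i in range(0, N, 2):
        m[verts[i]], m[verts[i + 1]] = verts[i + 1], verts[i]
    matchings.append(m)

def G(j, kappa):
    """Adjacency graph oracle: neighbor of vertex j along the kappa-colored edge."""
    return matchings[kappa][j]

# Valid coloring: following the same color twice returns to the start,
# i.e. G(G(j, kappa), kappa) = j.
assert all(G(G(j, kappa), kappa) == j for j in range(N) for kappa in range(d))
print("valid", d, "colored", d, "regular graph oracle on", N, "vertices")
```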

Definition 8 (Expander distinguishing problem). The (α, ζ)-expander distinguishing problem is a promise oracle language where the input is an oracle G for a d-colored d-regular undirected graph G on N vertices. The problem is to distinguish between the following two cases, promised that one holds:
• YES: the graph G has a connected component on at most ζ vertices.
• NO: the graph G is an α-expander.
In this paper, we will think of α as a constant and ζ ∼ N^{9/10}. To simplify notation, since the oracles considered in this result always correspond to graphs G, we write the algorithm as A^G rather than A^O.

Random distributions over graphs with many connected components
In this subsection, we describe distributions over graphs which, with high probability, consist of ℓ connected components. It should not be surprising that the distribution is almost identical to the distribution used by Ambainis, Childs, and Liu [6] in their proof that the expander distinguishing problem requires an exponential number of quantum queries for any quantum query algorithm in the absence of a proof. This is because we will reduce any QCMA algorithm to an efficient query algorithm for some expander distinguishing problem.
The lower bound in [6] is crucially a lower bound on the polynomial degree of any polynomial that distinguishes two graph distributions. From there, it is not too much work to argue that these graph distributions are very close to YES and NO instances as prescribed by the expander distinguishing problem; therefore, any algorithm solving the expander distinguishing problem must be able to distinguish these two graph distributions. Our first goal is to amplify the argument of [6] to a more restricted class of graphs.
The graphs of [6] The goal of the construction is a distribution which depends on an integer ℓ and a subset F ⊂ V. The integer ℓ will roughly correspond to the number of connected components (henceforth denoted C_1, . . . , C_ℓ) in the graph, and we insist that F ⊂ C_1. Every v ∈ V \ F appears in each subset C_i with equal probability 1/ℓ. The actual construction will be slightly more complicated than this but, morally, this is what we hope to achieve from the distribution.

Formal construction
Let N be an integer, and for an integer M ≥ N, an integer ℓ dividing M, and a subset F ⊂ V, define the distribution P_{M,ℓ}(F) over graphs on N vertices as follows:

1. Let V′ be a set of M vertices partitioned into ℓ subsets V_1, . . . , V_ℓ of size M/ℓ each. On each subset V_k, create a random colored subgraph G′ by randomly choosing d perfect matchings (each with a different color 1, . . . , d) and taking their union.

2. To construct the graph G on N vertices, we choose an injective map ι : V → V′. First, pick a function k : V → [ℓ] uniformly at random, conditioned on k(j) = 1 for each j ∈ F. Then let ι(j) be a random vertex from V_{k(j)}, chosen without replacement to ensure injectivity. If all vertices of V_{k(j)} have already been selected, output the graph on N vertices with no edges (i.e. abort).

3. Induce a graph G on V from G′ and the map ι: an edge (j_1, j_2, κ) is placed in G whenever (ι(j_1), ι(j_2), κ) is an edge of G′.

4. For a vertex j and a color κ, if the previously induced edges did not introduce a κ-colored edge at j, then add the self-loop edge (j, j, κ).

5. The distribution over graphs G is henceforth called P_{M,ℓ}(F); when F = ∅, we write it as P_{M,ℓ}.
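The construction above can be sketched as a sampler (our own reading of the steps; details the text leaves implicit, such as the partition of V′ into equal blocks and the 0-indexing of the block containing F, are assumptions flagged in the comments):

```python
# Sketch of one draw from P_{M,l}(F): d matchings per block of V', then a
# random injective embedding of V = [N] with F forced into the first block.
import random

def sample_P(N, M, ell, d, F, rng):
    assert M % ell == 0 and (M // ell) % 2 == 0
    block = M // ell
    # Step 1 (assumed partition): V' = [M] split into l blocks of size M/l;
    # on each block, take the union of d random perfect matchings.
    edges = {}                               # edges[(v, kappa)] = neighbor in G'
    for k in range(ell):
        verts = list(range(k * block, (k + 1) * block))
        for kappa in range(d):
            rng.shuffle(verts)
            for i in range(0, block, 2):
                edges[(verts[i], kappa)] = verts[i + 1]
                edges[(verts[i + 1], kappa)] = verts[i]
    # Step 2: choose k : V -> [l] with k(j) fixed on F (block 0 here), then
    # embed injectively by drawing slots without replacement.
    k_of = {j: (0 if j in F else rng.randrange(ell)) for j in range(N)}
    free = [list(range(k * block, (k + 1) * block)) for k in range(ell)]
    for slots in free:
        rng.shuffle(slots)
    iota = {}
    for j in range(N):
        if not free[k_of[j]]:
            return None                      # abort: output the empty graph
        iota[j] = free[k_of[j]].pop()
    # Steps 3-4: pull edges back along iota; missing colors become self-loops.
    inv = {v: j for j, v in iota.items()}
    return {(j, kappa): inv.get(edges[(iota[j], kappa)], j)
            for j in range(N) for kappa in range(d)}

rng = random.Random(4)
N, M, ell, d, F = 64, 128, 4, 3, {0, 1, 2}
G = None
while G is None:                             # retry on the (rare) abort
    G = sample_P(N, M, ell, d, F, rng)
print("sampled a", d, "regular colored graph on", N, "vertices,", ell, "blocks")
```

The resulting dictionary is exactly an adjacency graph oracle in the sense of Definition 7: each (vertex, color) pair maps to a neighbor, and each color is an involution.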

Setting of constants
The lower bounds we prove for the QCMA algorithm are by no means tight (even up to constants). We make no attempt to optimize the choice of constants, as our only goal is to prove an exponential lower bound on the size of any classical witness or the number of queries required to solve the expander distinguishing problem. For this reason, we pick the following constants. Chosen constants The degree of the graph G′ is set to d = 100. We set ℓ = N^{1/10}, γ = N^{−1/10}, and M = (1 + γ)N.

Induced constants
In Definition 8, we defined the (α, ζ)-expander distinguishing problem. We will only consider α = 1/(2 · 10^8) (which is a consequence of Lemma 10 and the chosen constants). Conventions Typically, we will assume (for the purposes of contradiction) that |F| ≤ N^{1/100}, but as that is a term we wish to bound, we explicitly state it each time. Any time a set S is described, it will be of size ζ, but we will also state this.

Concentration bounds for random distributions over graphs
We will need the following concentration lemma about the generated distributions. The lemma proves that P_{M,1} is approximately a NO instance and that P_{M,ℓ}(F) is approximately a YES instance. Overall, this lemma proves that any algorithm solving the expander distinguishing problem must do very well at identifying the distribution P_{M,1} as a NO instance and the distributions P_{M,ℓ}(F) as YES instances. The proof of this lemma is provided in Appendix A.

Likewise, the probability that a graph drawn from the distribution P
Note that being expanding necessarily implies connectivity. Note that when F = ∅ or ℓ = 1, there are simpler proofs with tighter bounds, but the bound proven here for the general statement is sufficient for our result.
The second concentration lemma that we will use states that P_{M,ℓ} is approximately equal to first sampling a set F of size ≤ N^{1/100} and then sampling a graph from P_{M,ℓ}(F). The proof of this lemma is also provided in Appendix A.
Lemma 11. Let m ≤ N^{1/100}. Let D_1 be the distribution on pairs (G, F) obtained by sampling G ∼ P_{M,ℓ}, choosing a uniformly random vertex v ∈ G, and then choosing F to be a uniformly random subset of size m of the connected component of G containing v. Let D_2 be the distribution on pairs (G, F) obtained by first choosing F to be a uniformly random subset of V of size m, and then sampling G ∼ P_{M,ℓ}(F). Then these distributions are close in statistical distance:

QMA protocol
In this section, we show that the expander distinguishing problem (over a fixed graph, i.e., no distribution) can be solved with a polynomial number of queries (indeed, with just two queries) if a quantum witness is provided. Our algorithm has the added benefit of being time-efficient, so we have shown that this problem is contained in QMA^G. In Section 8, we prove that there still exists a QMA protocol if we consider distribution oracles.

Lemma 12.
There is a QMA G protocol A QMA that solves the (α, ζ)-expander distinguishing problem with the following properties: 1. Query complexity: the algorithm makes two queries to G.

2. Completeness:
In the YES case, there exists a witness state that the verifier accepts with certainty.
3. Soundness: In the NO case, no witness state is accepted with probability greater than 1 − α/4.

4. Nice witnesses: for every connected component S, the subset state |S⟩ (the uniform superposition over the vertices of S) is accepted with probability at least 1 − |S|/N. In particular, since there exists a connected component of size at most ζ, there is a state of this form that is accepted with probability at least 1 − ζ/N.
Proof. First, we note the following fact: given access to the adjacency graph oracle for a d-regular graph, we can implement the unitary U_walk defined by U_walk |j⟩|κ⟩ = |G(j, κ)⟩|κ⟩, where G(j, κ) is the κ-th neighbor of j (which is guaranteed to exist by the d-regularity condition). To see this, prepare an ancilla in the |0⟩ state, query the oracle to produce |j⟩|κ⟩|G(j, κ)⟩, and then query again to erase the first register. Here, we have used the fact that for a validly colored graph, if G(j_1, κ) = j_2, then G(j_2, κ) = j_1. Moreover, using controlled queries to G, we can also implement the controlled version of U_walk. Let us also define the state |0_d⟩ := (1/√d) Σ_κ |κ⟩, the uniform superposition over colors, and |0_V⟩, the uniform superposition over all vertices. Let |ψ⟩ be the witness state received from the prover. The verifier performs the following operation: 1. First, the verifier prepares a control ancilla in the state |+⟩ and the color register in the state |0_d⟩, so that the total state at this point is |+⟩|ψ⟩|0_d⟩. Now, the verifier applies U_walk to the witness and color registers, controlled on the control register.
2. Next, the verifier measures the control register in the {|+⟩, |−⟩} basis, and the color register using the two-outcome measurement {M^C_0, M^C_1}, where M^C_0 = |0_d⟩⟨0_d|. The verifier proceeds to the next step if and only if the control measurement yields + and the color measurement yields 0. Otherwise, it rejects.

3. Finally, it performs the two-outcome measurement {M^V_0, M^V_1} on the witness register, where M^V_0 = |0_V⟩⟨0_V| is the projector onto the uniform superposition over all vertices. If the outcome 0 is obtained, it rejects. Otherwise, it accepts.
The associated quantum circuit for this verifier is given in Figure 3.

Query complexity. From the description of the algorithm, it is clear that only one query to U_walk, and thus two queries to G, are made.

Analysis
To analyze the verification algorithm, let |ψ⟩ be the witness. The state after the controlled gate is (1/√2)(|0⟩|ψ⟩|0_d⟩ + |1⟩ U_walk |ψ⟩|0_d⟩). If we apply the projector of the third register onto |0_d⟩, the resulting un-normalized state is (1/√2)(|0⟩|ψ⟩ + |1⟩ A|ψ⟩), where A is the normalized adjacency matrix of the graph. The probability of measuring the control register as |+⟩ on this un-normalized state is equal to the probability that the control register and the color both yield accepting outcomes, and can be calculated to be ‖(I + A)|ψ⟩‖²/4 ≤ (1 + ⟨ψ|A|ψ⟩)/2, where in the penultimate step we used that A is a real symmetric matrix, and in the last step we used that A has operator norm at most 1.

Completeness
In the YES case, let S be a connected component and T be its complement. Then the prover will send the state |ξ_S⟩ := √(|T|/(N|S|)) Σ_{j∈S} |j⟩ − √(|S|/(N|T|)) Σ_{j∈T} |j⟩. It is easy to check that |ξ_S⟩ is a +1 eigenvector of A and that it is orthogonal to |0_V⟩. Therefore, by eq. (23d), it is accepted by the verifier with probability 1.
Soundness. Assume that for some NO case graph, the witness |ψ⟩ is accepted with probability 1 − γ for γ < α/4. Note that the second eigenvalue λ satisfies |λ| ≤ 1, so δ = 1 − λ ≤ 2, and hence 1 − γ > 1/2. Since step 2 of the verifier must accept with at least this probability, by eq. (23e), ⟨ψ|A|ψ⟩ ≥ 1 − 2γ. Since the NO case graph has a single connected component and a unique eigenvector of eigenvalue 1, M^V_0 is the projector onto the 1-eigenspace of A. Furthermore, since the graph is α-expanding, every other eigenvector has eigenvalue at most 1 − α.
Solving this inequality tells us that |⟨0_V|ψ⟩|² ≥ 1 − 2γ/α. Therefore, the measurement of the witness register in step 3 of the verifier will reject with probability at least 1 − 2γ/α > 1/2, causing the algorithm to reject with probability greater than 1/2, a contradiction. Therefore, γ ≥ α/4.
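The analysis above can be sanity-checked numerically. The sketch below (our own illustration with hypothetical helper names, not from the paper) computes the step-2 acceptance probability ‖(I + A)|ψ⟩‖²/4 for a toy YES-style graph (two disjoint triangles) and a toy NO-style expander (the complete graph K4), and checks the bound (1 + ⟨ψ|A|ψ⟩)/2:

```python
import numpy as np

def step2_accept_prob(A, psi):
    # Pr[control measures |+> and color measures |0_d>] = ||(I + A) psi||^2 / 4
    v = psi + A @ psi
    return 0.25 * float(v @ v)

# YES-style graph: two disjoint triangles (2-regular, so A = adjacency / 2).
A = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[a, b] = A[b, a] = 0.5

# Witness uniform on component S = {0,1,2} minus uniform on its complement:
# orthogonal to the all-ones vector and a +1 eigenvector of A.
xi = np.array([1, 1, 1, -1, -1, -1]) / np.sqrt(6)
print(step2_accept_prob(A, xi))     # the eigenvector witness passes step 2 with certainty

# NO-style graph: complete graph K4 (an expander). Any witness orthogonal to
# the uniform vector has Rayleigh quotient at most the second eigenvalue.
A4 = (np.ones((4, 4)) - np.eye(4)) / 3
psi = np.array([1, -1, 0, 0]) / np.sqrt(2)
print(step2_accept_prob(A4, psi))   # strictly below 1, as the soundness argument requires
```

The expander case illustrates why step 2 alone already penalizes NO instances: the best a prover can do is governed by the second eigenvalue of A.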
Nice witnesses. Grilo, Kerenidis, and Sikora noted in [15] that the witness for a QMA problem can always be assumed to be a subset state. We show that the same property holds for our oracular witnesses. The overlap between the ideal witness and |S⟩ is √(1 − |S|/N), which bounds the trace distance between these states. Observe that the entire operation of the verifier given G can be modeled by a single two-outcome measurement {M^G_0, M^G_1}, where 0 corresponds to rejection and 1 to acceptance.
Therefore, by a standard trace distance fact [23, Equation 9.22], if G is a YES instance, the probability that |S⟩ is accepted is at least the acceptance probability of the ideal witness minus this trace distance.

6 Adversary method

In this section, we use the adversary method of Ambainis to argue that any successful QCMA algorithm implies a BQP algorithm for distinguishing the distributions P_{M,ℓ}(F) and P_{M,1} for |F| ≤ N^{1/100}. In Section 6.1 we state the adversary method result, and in the following sections we prove the statement.

Ambainis' proof of the adversary method
The adversary method of Ambainis [5] is a convenient way of arguing lower bounds on the query complexity of oracular quantum algorithms. The adversary method lower bounds the complexity of any algorithm which (with high probability) computes f(a) for a function f : {0,1}^N → {0,1}. The quantum algorithm is allowed access to a ∈ {0,1}^N through an oracle gate O which applies the transform |i⟩ ↦ (−1)^{a_i} |i⟩ for i ∈ {0,1}^n, extended linearly (here N = 2^n). In doing so, the adversary method is a convenient way of producing BQP (query) lower bounds.
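As a concrete illustration of the oracle model (our notation, not the paper's), the phase oracle is simply a diagonal ±1 unitary determined by the string a:

```python
import numpy as np

def phase_oracle(a):
    """Diagonal unitary O with O|i> = (-1)^{a_i} |i> for a bit string a."""
    return np.diag([(-1.0) ** bit for bit in a])

a = [0, 1, 1, 0]                 # oracle string of length N = 4
O = phase_oracle(a)
state = np.full(4, 0.5)          # uniform superposition over the 4 indices
print(O @ state)                 # amplitudes flipped exactly where a_i = 1
```

Note that O is its own inverse, which is why query counts for phase oracles and their inverses coincide in this model.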
To use it in our distributional setting, we make two modifications to the adversary bound. The first is to relax the notion of correctness. The lower bound of Ambainis applies to any algorithm which, for each a ∈ {0,1}^N, outputs f(a) correctly with probability 1 − ϵ for ϵ < 1/2. We instead consider an average-case notion of success. By Markov's inequality, this implies that most a are (with high probability) correctly identified. The second modification is to restrict the set of locations at which the algorithm is allowed to query the oracle. The reason for this is somewhat subtle. Essentially, the original lower bound of Ambainis was designed for decision problems with deterministic oracles, and relies on constructing a relation between two disjoint sets of oracle instances, one consisting only of YES instances and the other only of NO instances. However, in our setting, we are interested in distinguishing two distributions over oracles that may have overlapping support. In order to define disjoint YES and NO sets of instances even when the distributions overlap, we add to each oracle string a a set of flag bits b that indicate which of the two distributions the string a was sampled from. Naturally, any reasonable model cannot permit the algorithm to query the flag bits: otherwise, it would be easy to distinguish even two statistically close distributions with few queries.
More formally, we consider a generalization where the oracle string is a tuple (a, b) ∈ {0,1}^N × {0,1}^M and f : {0,1}^{N+M} → {0,1}, but the algorithm can only query positions of a. In this model, with the average-case notion of success defined above, we obtain the following adversary lower bound for distributions:

Corollary 14. Let X and Y be two subsets of {0,1}^{N+M} satisfying the three conditions listed in Theorem 13. Then, any query algorithm (1 − δ)-distinguishing the uniform distributions over X and Y must use the number of queries given in eq. (34) with ϵ = 2δ.
The proofs of both statements are presented in Appendix B.

Setup from QCMA algorithm
In this subsection, we show that if there is a QCMA algorithm for solving the expander distinguishing problem, then there exists a sunflower ❀ (defined below) of YES instances which all correspond to the same optimal witness wt⋆. If we hardcode wt⋆ into the QCMA algorithm, we obtain a quantum query algorithm that, with no access to a prover, accepts instances corresponding to ❀ and rejects all NO instances.

For all
We call the set F the core of the sunflower.
YES instances corresponding to subsets. For each subset S of size ζ, define B_S to be the restriction of the distribution P_{M,ℓ} to graphs in S◁. The intuition is that the witness |ξ_S⟩ from eq. (24) will be a good witness for B_S, since the connected components of P_{M,ℓ} have sizes concentrated around z.
There is a small complication, which we address now: the distribution P_{M,ℓ} is not a uniform distribution over a set of graphs. To rectify this, we can always assume that the oracle corresponding to a graph G sampled from P_{M,ℓ} consists of a queryable component corresponding to the adjacency list of G and a non-queryable component corresponding to the random coins r_G that were flipped in order to generate G according to P_{M,ℓ}. We will also define B_S as the restriction of the extended oracle. Therefore, both P_{M,ℓ} and B_S are uniform distributions over their supports.
Lastly, the distributions B_S are not exactly YES distributions, since their support is not entirely on YES graphs of the expander distinguishing problem. However, similarly to Lemma 10, we will show that B_S is almost entirely supported on YES graphs. Therefore, it suffices to use B_S as a proxy for YES instances until the very end, where we handle this subtlety.

QCMA algorithm implies a quantum low-query algorithm for some sunflower ❀

Lemma 18. For some ϵ > 0, assume there exists a k-query non-deterministic quantum algorithm which takes a q-length classical witness, accepts every distribution B_S for subsets S of size ζ with probability 1 − ϵ, and accepts any NO distribution B_NO with probability at most ϵ. Then for µ > 0, there exists a (µ, ζ, 2q/(µ log ℓ))-sunflower ❀ and a k-query quantum algorithm that accepts every distribution B_S for S ∈ ❀ with probability 1 − ϵ and accepts any NO distribution B_NO with probability at most ϵ.

Corollary 16. For every graph
Proof. Assume such a non-deterministic algorithm A_QCMA exists. Let the optimal witness (for algorithm A_QCMA) for oracle B_S be wt(B_S); since the oracles are in bijection with subsets S of size ζ, we can think of wt as a function from subsets to witness strings. Formally, this means that for every S, there exists a k(n)-query quantum algorithm A_QCMA and a witness wt(S) such that the acceptance guarantees above hold. Let wt⋆ ∈ {0,1}^q be the most popular witness (the one associated with the largest number of subsets S) and let Σ := wt^{−1}(wt⋆). By a counting argument, the size |Σ| is at least 2^{−q} (N choose ζ). Notice that since |S| = ζ, for a uniformly random S of size ζ, Pr_S[j ∈ S] = ζ/N for all j ∈ V. Ideally, if S were instead sampled from Σ, we would like that Pr_{S∈Σ}[j ∈ S] ∼ ζ/N for all j ∈ V. Of course, this is too good to be true, as Σ could be, for example, the set of all S containing 0^n. Instead, we build a sunflower ❀ ⊂ Σ with the following greedy strategy inspired by [13]. Starting from ❀ = Σ, whenever there exists a j such that Pr_{S∈❀}[j ∈ S] ≥ (ζ/N)^{1−µ}, we restrict ❀ ← ❀ ∩ {S : j ∈ S}. By construction, after each restriction, the size of ❀ is at least its size before the restriction multiplied by (ζ/N)^{1−µ}. If we continue this process and each time add j to a set F, the size of ❀ stays bounded from below. On the other hand, after selecting the set F, each element S ∈ ❀ necessarily contains F, which bounds the size of ❀ from above. Combining eq. (41) and eq. (43), we get that |F| ≤ 2q(n)/(µ log ℓ). The end result is that ❀ is a (µ, ζ, 2q(n)/(µ log ℓ))-sunflower. Let A be the algorithm A_QCMA with the witness wt⋆ hard-coded into the algorithm. Since ❀ ⊂ Σ, for every S ∈ ❀, A accepts B_S with probability 1 − ϵ. And since the algorithm A_QCMA accepts every NO distribution B_NO with probability at most ϵ irrespective of the proof, A does the same.
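The greedy restriction above can be sketched as a plain search for "heavy" vertices. The following minimal illustration (hypothetical helper name; Σ is given explicitly as a list of subsets) repeatedly restricts to sets containing any vertex that appears in at least a (ζ/N)^{1−µ} fraction of the current collection:

```python
from itertools import combinations

def greedy_sunflower(sigma, N, zeta, mu):
    """Greedily restrict sigma: while some vertex j outside the core appears in
    at least a (zeta/N)^(1-mu) fraction of the sets, keep only sets containing j."""
    core, flower = set(), list(sigma)
    threshold = (zeta / N) ** (1 - mu)
    while True:
        counts = {}
        for S in flower:
            for j in S:
                if j not in core:
                    counts[j] = counts.get(j, 0) + 1
        if not counts:
            return core, flower
        j = max(counts, key=counts.get)          # heaviest candidate vertex
        if counts[j] < threshold * len(flower):  # no heavy vertex remains
            return core, flower
        core.add(j)
        flower = [S for S in flower if j in S]

# Toy example: all 3-subsets of [6] that contain vertex 0.
sigma = [S for S in combinations(range(6), 3) if 0 in S]
core, flower = greedy_sunflower(sigma, N=6, zeta=3, mu=0.5)
print(core, len(flower))   # the greedy process discovers the planted core {0}
```

Each iteration shrinks the collection by at most the stated factor, which is what drives the bound on the core size |F| in the proof.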

Query lower bound for distinguishing sunflowers and fixed distributions
Let F := {S ⊆ V : F ⊆ S, |S| = ζ}; this is the ideal sunflower with core F. We will show by an adversary bound that the sunflower ❀ and the ideal sunflower F are indistinguishable to quantum query algorithms making few queries. Consider the distribution H_❀ defined by sampling an S ∈ ❀ and then sampling a graph from B_S. Similarly, define the distribution H_F, but by first sampling an S ∈ F. We want to show that any quantum query algorithm requires exponentially many queries to distinguish H_❀ and H_F. The main result of this subsection is the following lemma.

A warmup lemma for distinguishing graphs
The main challenge in proving Lemma 19 is the complicated structure inherent in graphs. However, if we instead work directly with the sets S, the problem is much simpler, and was already solved in [13, Lemma 11]. They showed that given membership query access (equivalently, the indicator function for the set), it requires exponentially many quantum queries to distinguish a sample from ❀ from a sample from F. We will work up to the result we wish to prove by gradually adding more structure to the objects being queried until we reach graphs. We will start by working with permutations that map the set S to a known set, and show that any algorithm with query access to the permutation and its inverse requires exponentially many queries.
To be precise, let U = [ζ]. Let Π_❀ be the set of all permutation-inverse pairs (π, π^{−1}) such that π(S) = U for some S ∈ ❀. Similarly, define Π_F. We shall abuse notation and also use Π_❀ and Π_F to refer to the uniform distributions over these sets of permutations. We first claim that no quantum query algorithm can distinguish the distributions Π_❀ and Π_F without an exponential number of queries. Note that the algorithm is allowed to query both the permutation and its inverse.

Lemma 20. Any quantum query algorithm (1 − δ)-distinguishing the distributions Π_❀ and Π_F, where ❀ is a (µ, ζ, t)-sunflower and F is the corresponding core, requires exponentially many queries.

Proof. We will use the adapted adversary bound (Theorem 13 and Corollary 14) proved in Appendix B. To do so, we need to construct a relation R ⊂ Π_❀ × Π_F. To build this relation, for every pair (S_x, S_y) ∈ ❀ × F, pick permutations χ_xy and ψ_xy such that: 1. χ_xy(S_x) = U and ψ_xy(S_y) = U; 2. for all j ∈ (S_x ∩ S_y) ∪ (V \ (S_x ∪ S_y)), χ_xy(j) = ψ_xy(j).
Such permutations are easy to find by picking a permutation χ_xy and then choosing the unique ψ_xy that satisfies the constraints. Notice that every permutation mapping S_x to U can be expressed as τ ∘ χ_xy for some τ with τ(U) = U. Likewise, every permutation mapping S_y to U can be expressed as τ ∘ ψ_xy for τ(U) = U. Construct the relation R by adding all pairs defined by the same τ, as in eq. (49). Consider an element (π, π^{−1}) ∈ Π_❀ such that π(S_x) = U. For any S_y ∈ F, write π = τ ∘ χ_xy. Then (τ ∘ ψ_xy, ψ^{−1}_xy ∘ τ^{−1}) ∈ Π_F forms a neighbor of (π, π^{−1}) along R. Therefore, the degree m of every element of Π_❀ is |F| and, analogously, the degree m′ of every element of Π_F is |❀|.
Case (A1): A simple upper bound for ℓ_{x,j} is |F|. To bound ℓ_{y,j}, notice that since j ∉ S_y, in order for σ_{x′}(j) ≠ σ_y(j) we must have j ∈ S_{x′} \ S_y. The number of such x′ is equal to the number of sets S_{x′} in ❀ that contain the point σ_y(j). Since F ⊂ S_y, it follows that the point σ_y(j) is not contained in F. Therefore, since ❀ is a sunflower, the number of S_{x′} ∈ ❀ satisfying the condition is at most (ζ/N)^{1−µ} |❀|.

Case (A2): A simple upper bound for ℓ_{y,j} is |❀|. To bound ℓ_{x,j}, notice that since j ∉ S_x, in order for σ_{y′}(j) ≠ σ_x(j) we must have j ∈ S_{y′} \ S_x. Since F ⊂ S_x, the number of such y′ is bounded analogously.

Case (B): By construction, in order for j′ := σ_x^{−1}(j) ≠ σ_y^{−1}(j), either (B1) j′ ∈ S_x \ S_y or (B2) j′ ∈ S_y \ S_x.
Since this exhausts all cases, the adversary quantities are all bounded, and by direct application of Corollary 14, the algorithm must use the stated number of queries.

A short corollary of Lemma 20 is a similar query lower bound for distinguishing distributions over graphs. Let G be a distribution over graphs with a connected component on U = [ζ]. Let G_❀ be the distribution over graphs formed by sampling a permutation pair (π, π^{−1}) from Π_❀, sampling a graph G from G, and outputting the graph π^{−1}(G). By construction, G_❀ is a distribution over graphs with a connected component on S for some S ∈ ❀. Likewise, define the distribution G_F.

Corollary 21. For δ < 1/4, any quantum query algorithm (1 − δ)-distinguishing the distributions G_❀ and G_F, where ❀ is a (µ, ζ, t)-sunflower and F is the corresponding core, requires the same number of queries as in Lemma 20, up to a factor of two.

Proof. Any algorithm A for distinguishing G_❀ and G_F can be used as a subroutine in a (not necessarily time-efficient) algorithm A′ for distinguishing Π_❀ and Π_F with twice as many queries.
To motivate the algorithm, observe that if G is a random graph drawn from G, then the graph π^{−1}(G) is a random graph drawn from G_❀ or G_F, depending on whether π ∈ Π_❀ or Π_F. Moreover, a graph oracle query to the graph π^{−1}(G) can be performed using two oracle queries to (π, π^{−1}). Now we can specify the algorithm A′: first, it samples a graph G from G; this step is not time-efficient, but it makes no oracle queries. Next, it runs A on the graph π^{−1}(G), simulating oracle queries to this graph as described above. It answers according to the outcome of A. If A successfully distinguishes G_❀ and G_F, then A′ distinguishes Π_❀ and Π_F with the same probability, using twice as many oracle queries, as claimed. Thus, by Lemma 20, we obtain the claimed query bound for A.
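The query simulation in this reduction can be made concrete: a neighbor query (j, κ) on the relabeled graph π^{−1}(G) unpacks into one query to π and one to π^{−1}, while G itself is known classically to A′. A minimal sketch (hypothetical data layout and helper name):

```python
def simulated_neighbor(j, kappa, pi, pi_inv, G_neighbors, log):
    """Answer a neighbor query (j, kappa) on the relabeled graph pi^{-1}(G)
    using one oracle query to pi and one to pi^{-1}."""
    log.append(("pi", j))
    u = pi[j]                      # oracle query to pi
    v = G_neighbors[u][kappa]      # free: G was sampled by the algorithm itself
    log.append(("pi_inv", v))
    return pi_inv[v]               # oracle query to pi^{-1}

# 4-cycle with a valid 2-edge-coloring: color 0 pairs (0,1),(2,3); color 1 pairs (0,3),(1,2).
G_neighbors = {0: [1, 3], 1: [0, 2], 2: [3, 1], 3: [2, 0]}
pi = [2, 0, 3, 1]                  # the permutation, as a lookup table
pi_inv = [1, 3, 0, 2]              # its inverse
log = []
print(simulated_neighbor(0, 0, pi, pi_inv, G_neighbors, log), len(log))
```

Because the coloring is valid, following the same color twice returns to the starting vertex, so the simulated oracle inherits the involution property the verifier relies on.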

Improving to more general permutations
While Lemma 20 and Corollary 21 are simple enough to prove, they are insufficient for proving indistinguishability of the graph distributions H_❀ and H_F defined at the start of this section. This is because, unlike the distribution G_❀, the distribution H_❀ cannot be defined in terms of independently sampling a graph G and a set S. For one, the sizes of the connected components in H_❀ do not exactly equal z; instead, they concentrate tightly around z. It was precisely the independence of the graphs and sets that made Corollary 21 easy to prove.
To fix the argument, we prove the following variations of Lemma 20 and Corollary 21. For a sunflower ❀ with core F and any k ≤ ζ, let Π^{(k)}_❀ be the distribution formed by the following procedure:

1. Sample a set S from ❀.

2. Sample a uniformly random subset C ⊂ S of size k.

3. Output a uniformly random permutation pair (π, π^{−1}) such that π(C) = [k].

Define the distribution Π^{(k)}_F similarly, where we change the first step to sampling from F.

Lemma 22. Any quantum query algorithm (1 − δ)-distinguishing the distributions Π^{(k)}_❀ and Π^{(k)}_F, where ❀ is a (µ, ζ, t)-sunflower and F is the corresponding core, requires the same number of queries as in Lemma 20.

Proof. This proof is identical to that of Lemma 20, except we use U = [k]. Note that the listed bound has no dependence on k; this is because k ≤ ζ, and we state here the weaker bound in terms of ζ.
Likewise, a short corollary of Lemma 22 is the following. Construct the distribution G^{(k)}_❀ by the following procedure:

1. Sample a permutation pair (π, π^{−1}) from Π^{(k)}_❀ and a graph G, and output the extended oracle (π^{−1}(G), r_G), where r_G denotes the random coin flips that would have generated G when sampling according to P_{M,ℓ}. The oracle is divided into a queryable component (the adjacency list of π^{−1}(G)) and an un-queryable component (r_G).
Corollary 23. For δ < 1/4, any quantum query algorithm (1 − δ)-distinguishing the distributions G^{(k)}_❀ and G^{(k)}_F, where ❀ is a (µ, ζ, t)-sunflower and F is the corresponding core, requires the same number of queries.

Proof. The corollary follows from Lemma 22 via a reduction from permutations to graphs, exactly as in the proof of Corollary 21 from Lemma 20.

Completing the proof
Proof of Lemma 19. Notice that the union over k of the supports of G^{(k)}_❀ is equal to the support of H_❀ as described in the statement of Lemma 19; likewise for H_F. Since H_❀ and H_F are uniform distributions over their supports, by Corollary 14, using the relation R that we have constructed, the distributions are indistinguishable without the stated number of queries.

Statistical indistinguishability between random distributions
The final step of this section is to show that no algorithm can distinguish the distributions H_F and P_{M,ℓ}(F) with more than negligible probability. This is because these distributions are statistically close, which can be proven by a Chernoff tail bound.

Lemma 24. The statistical distance between H_F and P_{M,ℓ}(F) is O(N^{−3}).
Proof. Notice that the distribution H_F is equivalent to sampling a graph from P_{M,ℓ}(F) conditioned on its consisting of ℓ connected components, each of size in [(1 − γ)z, (1 + γ)z]. By Lemma 10, with all but O(N^{−3}) probability, a graph from P_{M,ℓ}(F) satisfies this condition. Therefore, the statistical distance between these distributions is bounded by O(N^{−3}).
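The proof uses the general fact that conditioning a distribution on an event E moves it by exactly Pr[¬E] in total variation distance. A quick numerical check of this fact (toy numbers, our own illustration):

```python
import numpy as np

p = np.array([0.5, 0.3, 0.15, 0.05])        # toy distribution over four "graphs"
good = np.array([True, True, True, False])  # event E: the structural condition holds

q = np.where(good, p, 0.0)
q = q / q.sum()                             # p conditioned on E

tv = 0.5 * np.abs(p - q).sum()
print(tv, p[~good].sum())                   # TV distance equals the excluded mass
```

So when the conditioning event holds with probability 1 − O(N^{−3}), the conditioned and unconditioned distributions are O(N^{−3})-close, as the lemma asserts.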

Polynomial method lower bound
In this section, we prove that no quantum query algorithm can distinguish the graph distributions P_{M,1} and P_{M,ℓ}(F) with few queries. When F = ∅, this is equivalent to the problem studied by [6] in their quantum query lower bound:

Theorem 25 (Restatement of Theorem 2 of [6]). For any sufficiently small constant ϵ_1 > 0, any deterministic quantum query algorithm A distinguishing the distributions P_{M,1} and P_{M,ℓ} for any 1 < ℓ < N^{1/4} with probability ϵ_1, i.e., satisfying

|E_{G←P_{M,1}}[Pr[A^G accepts]] − E_{G←P_{M,ℓ}}[Pr[A^G accepts]]| ≥ ϵ_1,

must make at least Ω(N^{1/4}/log N) queries. Here the Ω notation hides a dependence on ϵ_1.
The proof of that result is very technical and builds on the polynomial method. Fortunately, we can prove our query lower bound via a reduction to the [6] result. The reduction requires taking a short random walk, which mixes well by the expander mixing lemma.
Lemma 26. Suppose there exists some F_0 and a q_1-query quantum algorithm that ϵ_1-distinguishes the distributions P_{M,1} = P_{M,1}(F_0) and P_{M,ℓ}(F_0) for ℓ > 1. Then there exists a q_2-query quantum algorithm that ϵ_2-distinguishes the distributions P_{M,1} and P_{M,ℓ}, with q_2 = q_1 + O(N^{0.03}) and ϵ_2 only slightly smaller than ϵ_1. Intuitively, what this lemma says is that the set of points F_0 (which lie in the same connected component) is not a helpful witness. Concretely, such a witness is negligibly more helpful than no witness at all. This is because, in the case of P_{M,1} or P_{M,ℓ}, the connected components are expanding, and therefore the verifier can easily select a random subset of points from a single connected component without any assistance from the prover. This can be shown via an application of the expander mixing lemma. Therefore, if a query algorithm exists for distinguishing P_{M,1} and P_{M,ℓ}(F), it can be used as a subroutine for distinguishing P_{M,1} and P_{M,ℓ} without any witness.
Furthermore, due to Ambainis, Childs, and Liu [6], we know Theorem 25, i.e., that distinguishing the distributions without witnesses has a query lower bound. Therefore, the problem has a query lower bound even when a set of points F from a single connected component is provided:

Corollary 27. For any F_0 with |F_0| ≤ N^{1/100}, any sufficiently small constant ϵ_1, and any ℓ with 1 < ℓ < N^{1/4}, any quantum query algorithm that ϵ_1-distinguishes P_{M,1} and P_{M,ℓ}(F_0) must make Ω(N^{1/4}/log N) queries.
Proof of Corollary 27. Suppose an algorithm making q = o(N^{1/4}/log N) queries existed. Then by Lemma 26, there exists an algorithm making q′ = q + O(N^{3/100}) = o(N^{1/4}/log N) queries that distinguishes between P_{M,1} and P_{M,ℓ} as well. However, this is impossible by Theorem 25.
The remainder of this section is the proof of Lemma 26.
Let A_0 be the hypothesized algorithm making q_1 queries to ϵ_1-distinguish P_{M,1} and P_{M,ℓ}(F_0). We first claim that for any F with |F| = |F_0|, there exists an algorithm A_1 that, given as classical input a list of all the vertices in F, and as oracle input an oracle G, where G is a sample from either P_{M,1} or P_{M,ℓ}(F), can ϵ_1-distinguish between these two cases using q_1 queries to G. The algorithm A_1 is as follows: 1. Given F, compute a permutation π on V that maps F to F_0. (This step is not efficient in terms of runtime, but makes no queries to the oracle G.) 2. Run A_0 with every query to G replaced by a query to π(G). Return the answer given by A_0.
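Step 1 of A_1 only needs some bijection on V carrying F onto F_0; any completion of the pairing works. A minimal sketch (hypothetical helper name):

```python
def extend_to_permutation(F, F0, N):
    """Return a bijection pi on range(N) with pi(F) = F0, assuming |F| = |F0|."""
    pi = {}
    for f, f0 in zip(sorted(F), sorted(F0)):
        pi[f] = f0                              # pair up F with F0
    rest_src = [v for v in range(N) if v not in F]
    rest_dst = [v for v in range(N) if v not in set(F0)]
    for s, d in zip(rest_src, rest_dst):
        pi[s] = d                               # complete to a full bijection
    return pi

pi = extend_to_permutation({1, 2, 3}, {5, 6, 7}, N=8)
print(sorted(pi[v] for v in {1, 2, 3}))         # F lands exactly on F0
```

Since π is applied only to query indices and answers, relabeling costs no extra oracle queries, which is exactly why A_1 has the same query complexity as A_0.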
The correctness of the algorithm follows from the fact that π maps the distribution P_{M,ℓ}(F) exactly to P_{M,ℓ}(F_0), so the distinguishing advantage is preserved. As this holds for all such F, we next show that the input F can be removed from the algorithm: given just access to G, it is possible to compute a suitable F without making too many queries to the oracle. Specifically, we define the algorithm A_2 to distinguish between P_{M,1} and P_{M,ℓ} given only oracle access to G.
1. For a choice of t to be defined later, construct a set F_1 by starting at a random vertex v_0 and taking a (100t · N^{1/100})-step random walk along the graph, as described in Lemma 3. If |F_1| ≥ |F|, pick the first |F| points of F_1 as the set F′. If not, output 0 (i.e., abort).
2. Run A 1 on input F ′ with oracle access to G.
We will argue that for an appropriately chosen t, this algorithm achieves the success probability and query complexity claimed in the theorem.To do so, we will argue in two stages.
1. First, we argue that the distribution of F′ chosen by the random walk is very close to that of F′ chosen uniformly at random from subsets of a connected component of G. This analysis uses the expander mixing lemma.
2. Second, we argue that the distribution over pairs (G, F′) obtained after the first step of A_2 is statistically indistinguishable from the distribution over pairs (G, F) sampled by first choosing a uniformly random F ⊆ V and then choosing a random G ← P_{M,ℓ}(F). This will make use of Lemma 11, shown in Appendix A. By eq. (58), the algorithm A_1 can ϵ_1-distinguish inputs distributed in this manner, and thus the second step of A_2 can ϵ_2-distinguish inputs of P_{M,1} and P_{M,ℓ} for ϵ_2 just slightly smaller than ϵ_1.
In our analysis, we will denote probabilities over the distribution of (G, F′) generated by A_2 by Pr_{A_2}[·].

From random walk sampling to uniform sampling
Henceforth, define expander walk sampling as the following sampling procedure: select a uniformly random vertex as the initial vertex v_1, then take t steps of a lazy random walk (as defined in the expander mixing lemma, Lemma 3) to choose v_2, and so forth. In each case, the graph and the integer t will be clear from context.
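Expander walk sampling can be sketched directly. The following minimal illustration (hypothetical helper name; the lazy walk stays put with probability 1/2) records r vertices spaced t lazy steps apart and, on a disconnected graph, never leaves the starting component:

```python
import random

def expander_walk_samples(neighbors, r, t, rng):
    """Pick a uniformly random start, then record r vertices spaced
    t lazy-random-walk steps apart."""
    v = rng.choice(range(len(neighbors)))
    samples = []
    for _ in range(r):
        for _ in range(t):
            if rng.random() < 0.5:       # lazy step: move with probability 1/2
                v = rng.choice(neighbors[v])
        samples.append(v)
    return samples

# Two disjoint triangles: the walk is confined to its starting component.
neighbors = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]
rng = random.Random(0)
samples = expander_walk_samples(neighbors, r=8, t=5, rng=rng)
print(samples)
```

On an expanding component, a modest t already makes consecutive samples nearly independent, which is the property the claims below quantify.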
We start by showing a sequence of claims that establish that if the expander walk sampling procedure for generating F ′ starts in a connected component C of G with size |C| = K and expansion α, then the distribution over sets F ′ generated by the random walk is close to uniformly sampling points from C. Our main result here will be Claim 30.
Claim 28. Let δ = (1 − α/2)^t and let r be a natural number with rKδ < 1. Then for any sequence of r vertices v_1, . . ., v_r, the probability Pr_{A_2}[v_1, . . ., v_r] is close to the corresponding uniform iid sampling probability, up to multiplicative factors controlled by rKδ. Proof. The proof is by direct calculation and application of the expander mixing lemma. First we show one side of the bound.
By hypothesis, Kδ < 1. Using this, we obtain the needed estimate. Substituting this bound into Equation (60c), we obtain the upper bound. Now for the other side: again applying the expander mixing lemma, we obtain the lower bound, where in the last step we have used Bernoulli's inequality and the assumption that Kδ < 1.
The following claim will be used to bound the probability that the expander walk sampling procedure aborts, by instead bounding the probability that iid sampling fails to generate enough distinct points.

Claim 29. The probability that T = 100|F| iid uniform samples from a set of size K ≥ N^{0.5} contain fewer than |F| distinct points is at most exp(−T/16).

Proof. In our setting, K ≥ (1 − γ)z ≥ N^{0.5} and T = 100|F| = 100N^{1/100}, so for sufficiently large N we have T ≤ K/2. So the expected number of distinct vertices that are sampled is at least T/2. Moreover, each event X_v is independent, so the number of sampled vertices concentrates well around its mean. Applying a Chernoff bound with ϵ = 0.5 yields the claimed bound.

We now combine these two claims and apply them to our setting. Define the event F′ ← G to be the event that F′ is the set of vertices selected from the graph G. Define the distribution Pr_unif[·] corresponding to first choosing a connected component C with probability proportional to |C|, taking r uniform iid samples from C, and setting F′ to be the first |F| distinct sampled points. Likewise, define the distribution Pr_{A_2} corresponding to choosing a random vertex v in G, taking C to be the connected component containing v, taking r samples according to an expander random walk in C initialized at v with t steps between samples, and then setting F′ to be the first |F| distinct sampled points. Set K to be the maximum size of a connected component in G and let δ = (1 − α/2)^t. Then we have the following distance bound between the distributions.

Claim 30. Suppose r, K, δ are such that rKδ ≤ 10/11. Then for any F′, the probabilities Pr_{A_2}[F′] and Pr_unif[F′] are close, and the probability of aborting is at most 10rKδ + exp(−T/16).
Proof. For a sequence v_1, . . ., v_r of vertices, let X_{v_1,...,v_r→F′} be the event that the sequence yields the set F′. Summing over sequences, we may pass from Pr_{A_2} to Pr_unif, where we have used Claim 28 to bound the difference in probabilities of each sequence of vertices. Now, to bound the abort probability, let X_{v_1,...,v_r→abort} be the event that v_1, . . ., v_r contains fewer than |F| distinct vertices.
where we have used Claim 28 to replace Pr_{A_2}[·] by Pr_unif[·], and then Claim 29 to bound the abort probability of uniform sampling.
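The expectation computation behind Claim 29 is a standard coupon-collector calculation: with T iid uniform samples from a set of size K, the expected number of distinct values is K(1 − (1 − 1/K)^T), which is ≈ T whenever T ≪ K. A toy numeric check of this formula (our own illustration):

```python
def expected_distinct(K, T):
    """E[# distinct values among T iid uniform samples from a set of size K]."""
    return K * (1.0 - (1.0 - 1.0 / K) ** T)

# When T << K, almost every sample is fresh, so nearly T distinct points appear;
# in the paper's regime, K >= N^0.5 dwarfs T = 100 * N^(1/100).
print(expected_distinct(10**6, 100))
```

So drawing T = 100|F| samples comfortably produces |F| distinct points in expectation, and the Chernoff bound turns this into the exp(−T/16) failure probability.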

From Pr_{A_2} to Pr_{F then G}

We will now proceed to the main argument, showing that the pairs (G, F′) sampled by A_2 are distributed close to the distribution expected by A_1.

G has expanding components with high probability
To start off, first note that by Lemma 10, with probability at least 1 − O(N^{−3}), a graph drawn from P_{M,ℓ}(F), for any F of size ≤ N^{1/100}, will consist of ℓ connected components which are α-expanders and have size between (1 − γ)z and (1 + γ)z. Since ϵ_1 is a constant, for sufficiently large N we can restrict to the situation that the graph is of this form and account for this factor at the end. Henceforth, set K_0 = (1 + γ)z; we are guaranteed that every component has size at most K_0.
Relating the probabilities. In the case that each connected component is an α-expander, observe that the probability of every valid pair (G, F′) is approximately a constant p, independent of G and F′. Also recall that the event F′ ← G is the event that F′ is the set of vertices selected from G. Moreover, recall the distributions D_1 and D_2 from Lemma 11, and notice that D_2 is exactly the distribution Pr_{F then G} defined above. We define p to be this common value. We will now start with the Pr_{A_2} distribution and bound its distance from Pr_{F then G}.
We may now bound the total variational distance between the two sides.
A total distance bound of O(N^{−9/200}) can be achieved for a suitable choice of parameters, so setting t = Θ(N^{0.02}) is sufficient. Given this choice of t, let us now calculate the chance that the sampling of F′ aborts. By Claim 30, this is at most 10rKδ + exp(−T/16) = O(N^{−9/200}) + exp(−100N^{0.01}/16) = O(N^{−9/200}). Thus, the total error probability of A_2 equals the error probability of A_1 up to an additive O(N^{−9/200}), as claimed in the theorem. The total query complexity, assuming no abort, can be calculated as follows. Recall that t was chosen to be Θ(N^{0.02}).
The total number of additional queries over A_1 is the number of steps in the walk, which is 100N^{1/100} · t ≤ O(N^{0.03}). Thus, this algorithm has total query complexity q_1 + O(N^{0.03}) and distinguishes with probability ϵ_2, as claimed.
8 Wrapping up the proof of Theorem 1

First, we need to note that the distributions B_S and P_{M,1}, which we used as proxies for YES and NO instances, are not fully supported on YES and NO instance graphs, respectively. However, they are very close. For every S ⊂ [N] of size ζ, let B̃_S be the restriction of the distribution B_S (defined in Section 6.2) to graphs with ℓ connected components, each consisting of between (1 − γ)z and (1 + γ)z vertices. By Corollary 17, the statistical distance between B_S and B̃_S is O(N^{−3}). The YES instances for Theorem 1 are the {B̃_S}.
We consider a single NO instance P̃_{M,1}, where P̃_{M,1} is the restriction of P_{M,1} to graphs which are α-expanders. The statistical distance between these two distributions is O(N^{−3}) by Lemma 10. Furthermore, we can verify that the supports of P̃_{M,1} and B̃_S are far apart in Hamming distance. Consider graphs G_1 and G_ℓ from either support, respectively, and a connected component C of G_ℓ. In the graph G_ℓ, all the edges on C stay within C, but since G_1 is an α-expander and also a 1/10²-edge expander (see the proof of Lemma 10), in G_1 at least a 1/10⁴ fraction of the edges emanating from C leave C. As this holds for every component C, and since |C| ≪ N/2, the Hamming distance between the adjacency lists of G_1 and G_ℓ is Ω(N). As this holds for all graphs G_1 and G_ℓ, the Hamming distance bound between the supports holds.

QMA algorithm
For completeness: from Corollary 17, we know that the algorithm A_QMA with witness state |S⟩ accepts the distribution B_S with probability at least 1 − O(N^{−1/20}). For soundness: from Lemma 10, we know that a graph from P_{M,1} is a 1/(2·10⁸)-expander with probability at least 1 − O(N^{−3}). Therefore, by Lemma 12, the algorithm A_QMA accepts with probability bounded away from 1 by a constant. By parallel repetition 9·10⁶ = O(1) times, we obtain a quantum algorithm with soundness ≤ 0.01.

QCMA algorithm
We argue now that any QCMA algorithm with completeness 0.99 and soundness 0.01 either requires an exponentially long proof or an exponential number of quantum queries. This is done by arguing that any algorithm with a short proof and few queries cannot have such a large completeness-soundness gap. Assume, therefore, that there exists a QCMA algorithm with a q ≤ n · N

9 Concluding remarks

9.1 Relation to the Fefferman and Kimmel [13] construction

One can think of the result stated in this work as applying the QCMA lower-bounding techniques developed by Fefferman and Kimmel [13] to the expander distinguishing problem originally studied by Ambainis, Childs, and Liu [6].
At a high level, in the in-place permutation oracle QMA vs. QCMA separation of [13], the goal was to distinguish permutations π : [N] → [N] for which π^{-1}([√N]) is mostly (a 2/3 fraction) supported on odd numbers from those for which it is mostly supported on even numbers. The original idea in Fefferman and Kimmel was that if the oracle π were provided as a classical oracle (an Nn-bit list [π(1), π(2), …, π(N)]), then the subset state |ξ_ideal⟩ = |π^{-1}([√N])⟩ would be a good quantum witness: by measuring the last qubit of a witness |ξ⟩, the verifier can decide whether the set π^{-1}([√N]) is supported mostly on odd or on even numbers. What remains is to verify that the provided witness |ξ⟩ is indeed |ξ_ideal⟩; the hope would be to use the oracle for π to do so, as the state |[√N]⟩ can be easily verified by measuring in the Hadamard basis. However, due to the index-erasure problem, a classical oracle for verifying that |ξ⟩ = |ξ_ideal⟩ would need to allow implementation of both π and π^{-1}. But if the oracle π^{-1} is provided, then there is a BQP algorithm for this problem: simply pick a random j ∈ [√N] and check whether π^{-1}(j) is odd or even. The solution in [13] was to define the oracle instead as an "in-place oracle" for π, meaning the unitary Σ_j |π(j)⟩⟨j|. Then the verifier can verify that |ξ⟩ = |ξ_ideal⟩, and yet the BQP algorithm no longer applies.
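The BQP attack enabled by π^{-1} is in fact purely classical; a minimal sketch follows, with an illustrative permutation of our own choosing (the function names and parameters are ours, not from [13]).

```python
import math
import random

def distinguish(pi_inv, N, samples=50):
    """Decide whether pi^{-1}([sqrt(N)]) is mostly odd or mostly even,
    using only classical queries to the inverse permutation."""
    root = math.isqrt(N)
    # Sample random j in [sqrt(N)] and tally the parity of pi^{-1}(j).
    odd = sum(pi_inv(random.randrange(1, root + 1)) % 2 for _ in range(samples))
    return "odd" if odd > samples / 2 else "even"

# Illustrative permutation on [16]: the odd numbers 1, 3, 5, 7 map onto [4],
# so pi^{-1}([4]) = {1, 3, 5, 7} is entirely odd.
N = 16
pi = {1: 1, 3: 2, 5: 3, 7: 4}
leftovers = [x for x in range(1, N + 1) if x not in pi]
for x, y in zip(leftovers, range(5, N + 1)):
    pi[x] = y
pi_inv = {v: k for k, v in pi.items()}
print(distinguish(lambda j: pi_inv[j], N))  # prints "odd"
```

The in-place oracle removes exactly this capability: it implements π forward only, so the sampling step above has nothing to query.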
Fefferman and Kimmel had to make one more modification to prove a QMA and QCMA oracle separation: they considered distributions over in-place oracles which all map to the same ideal quantum witness |ξ_ideal⟩. This was because it seems beyond current techniques to prove classical lower bounds without forcing a large structured set of permutations to all share the same witness; otherwise, for all we know, there might be a mathematical fact about permutations which yields a short classical certificate for any individual permutation. So the oracle is defined as a distribution over unitaries, i.e. a completely positive trace-preserving (CPTP) map.
Notice that this work takes much inspiration from [13]: the quantum witnesses in both our work and [13] are subset states, and we also consider distributions over oracles with the same (or similar) ideal quantum witness. This is because we are unsure how to prove that there is no property of a specific regular graph which yields a short classical witness. We elaborate on why such an impossibility result is hard to prove in the next subsection. What our result principally improves is that the underlying oracle can be a classical string instead of a unitary.

Difficulties in proving stronger statements
Recall that our QMA upper bound does not require the setup of distributions over oracles -it was only included to prove the QCMA lower bound.How much harder is it (or is it even possible) to prove a QCMA lower bound without considering distributions?
As pointed out to us by William Kretschmer [21], if one considers average-case algorithms instead of worst-case algorithms, then this problem is in RNP^G, the average-case analog of NP^G. This is because the average-case version of the expander distinguishing problem is to distinguish the distributions P_{M,ℓ} and P_{M,1}, and there is a simple randomized algorithm for this problem with a classical witness. Recall that for a d-regular random graph, the expected number of triangles in a connected component is Θ(d³), independent of the number of vertices in the component. A similar analysis can be done for P_{M,ℓ} and P_{M,1} to show that a random graph from P_{M,ℓ} has Θ(ℓd³) triangles, whereas a random graph from P_{M,1} has Θ(d³) triangles. Therefore, a classical witness for the statement that the graph is (with high probability) drawn from P_{M,ℓ} (instead of P_{M,1}) is a list of 100 · Θ(d³) triangles from the graph. This witness is easily verifiable and correctly distinguishes with high probability.
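A sketch of how such a triangle-list witness could be verified classically; the witness format, threshold handling, and toy graph below are our own illustrative choices, not the paper's exact protocol.

```python
def verify_triangle_witness(adj, witness, threshold):
    """Accept iff the witness lists at least `threshold` distinct triangles,
    each of which is genuinely present in the graph given by `adj`
    (a dict mapping each vertex to its set of neighbours)."""
    seen = set()
    for (u, v, w) in witness:
        tri = frozenset((u, v, w))
        if len(tri) != 3 or tri in seen:
            continue  # degenerate or duplicate claim
        if v in adj[u] and w in adj[u] and w in adj[v]:
            seen.add(tri)  # all three edges checked against the oracle
    return len(seen) >= threshold

# Illustrative graph: two disjoint triangles.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
print(verify_triangle_witness(adj, [(0, 1, 2), (3, 4, 5)], threshold=2))  # prints True
print(verify_triangle_witness(adj, [(0, 1, 3)], threshold=1))             # prints False
```

Each claimed triangle costs only three adjacency-list queries to check, which is what makes the witness efficiently verifiable.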
Notice that this RNP^G algorithm does not solve the expander distinguishing problem in the worst case, since both distributions contain graphs which are triangle-free (with constant probability). Furthermore, it cannot distinguish the distributions considered in Theorem 1, because the proof relies on finding triangles, which is a property of the graph not deducible from only knowing the connected components.
But it does highlight a principal roadblock in extending Theorem 1 to oracles that are not distributions. It is entirely possible that there exists a property of graphs, revealed by looking at the edges, that distinguishes graphs with many connected components from graphs with a single expanding connected component. To the best of our knowledge, no such property is known, but proving that none exists is beyond the techniques shown here.
Lastly, if we consider the expander distinguishing problem where in the YES case we are promised that every connected component has size at most 0.99N, then this problem is in coAM^G. When the graph is a NO instance, the verifier can select two random vertices, and the prover can always exhibit a path of length O(log N) = O(n) between them. However, when the graph is disconnected and no component is too big, with probability ≥ 1/50, no such path exists.
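The verifier's side of this protocol is just a path check, and the honest prover can answer with breadth-first search; a minimal classical sketch on toy graphs of our own choosing:

```python
from collections import deque

def is_valid_path(adj, path, s, t, max_len):
    """Verifier's check: `path` is a genuine s-t walk of length <= max_len."""
    if len(path) - 1 > max_len or path[0] != s or path[-1] != t:
        return False
    return all(b in adj[a] for a, b in zip(path, path[1:]))

def prover_find_path(adj, s, t):
    """Honest prover: BFS for a shortest s-t path (None if disconnected)."""
    parent, queue = {s: None}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = [t]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

# Connected 6-cycle: the prover always succeeds with a short path.
cyc = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
p = prover_find_path(cyc, 0, 3)
# Two small components: no prover can connect vertices 0 and 4.
disc = {0: {1}, 1: {0}, 2: {3}, 3: {2}, 4: {5}, 5: {4}}
q = prover_find_path(disc, 0, 4)
```

In an expander the BFS path has length O(log N), matching the verifier's bound, while across components no valid path exists at all.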
Therefore, our constructed oracle very finely separates the classes QMA and QCMA, in the sense that small perturbations of the problem might be very easy.

A Omitted proofs

We will soon prove eq. (83). Assuming eq. (83), we can bound the probability using two bounds, one for small i and one for large i. For small i, i.e. whenever (1 + β)i ≤ 6γz, we have g(i, βi) ≤ (8γ)^{di/2}. Assuming β < 4, then d/20 > (9/10)(1 + β), and therefore, for sufficiently large N, r ≪ 1/2, so the geometric series is bounded. For large i, i.e. whenever (1 + β)i > 6γz and i ≤ ζ/2, we use in the last line that i < ζ/2 and ζ = (1 + γ)z, so for sufficiently large N, i/z ≤ 51/100. For the choice β = 1/100 and d = 100, adding eq. (85) and eq. (89) and noting that eq. (85) clearly dominates for sufficiently large N, the claimed bound follows. Consider the graph induced on B_k together with the introduced self-loops. This graph is d-regular (counting self-loops); by Cheeger's inequality, if it is a β-edge expander for β = 1/100 and d = 100, then it is an α-spectral expander with bounded second normalized eigenvalue. We can now perform one more union bound, over the probability that some V_k is not of near-optimal size (eq. (81)) and, over all k ∈ [ℓ], that B_k is not an α-expander. Since ℓ = N^{1/10}, this overall probability is O(N^{-3}), as stated. It remains to prove eq. (83). Consider disjoint sets U_1, U_2 ⊆ B_k with |U_1| = i and |U_2| = βi. The event E_{U_1,U_2} is equivalent to the event that every edge emanating from U_1 is contained in U_1 ∪ U_2. Since the graph G from P_{M,ℓ}(F) is built by considering a graph G′ on M vertices composed of d perfect matchings and then taking the induced graph on N vertices under the injective map ι : V → V′, this implies that every edge emanating from ι(U_1) is contained in the corresponding set A. For each of the d matchings (each corresponding to a different color), the probability that all κ-colored edges emanating from ι(U_1) lie in A is bounded; since the d edge colorings are sampled independently, the net probability is bounded by eq. (83).
Lemma 11. Let m ≤ N^{1/100}. Let D_1 be the distribution on pairs (G, F) obtained by sampling G ∼ P_{M,ℓ}, choosing a uniformly random vertex v ∈ G, and then choosing F to be a uniformly random size-m subset of the connected component of G containing v. Let D_2 be the distribution on pairs (G, F) obtained by first choosing F to be a uniformly random size-m subset of V, and then sampling G ∼ P_{M,ℓ}(F). Then these distributions are close in statistical distance.

Proof. To prove this lemma, it is illustrative to decompose the procedure for sampling the distributions P_{M,ℓ} and P_{M,ℓ}(F). Note that while the procedure is stated sequentially, it consists of three independent components. First, the sampling of the graph G′ on M vertices is independent of the construction of the injective map ι. Furthermore, ι is constructed by independently sampling k : V → [ℓ] and then defining the injective map ι : V → V′ by assigning each vertex j to a vertex in V_{k(j)} without replacement. An equivalent sampling algorithm is to sample uniformly random independent injective maps π_1, …, π_ℓ, where each π_i : [M/ℓ] → V_i is an enumeration of the vertices of V_i. Then define ι by using the random enumerations {π_i} to sequentially assign each vertex j a vertex in V_{k(j)}; formally, if j is the s-th vertex such that k(j) = i, then ι(j) = π_i(s). Therefore, the sampling procedure can be rewritten as a deterministic process based on three samples: the graph G′, the map k, and the enumerations {π_i}. Notice that the only difference between P_{M,ℓ} and P_{M,ℓ}(F) is the map k. We can simplify further by thinking only of the map k′ : V → {0, 1}, where k′(j) = 0 if k(j) ≠ 1; this is because, given k′, we can sample a uniformly random function k consistent with it. Let us say that k′ : V → {0, 1} is a uniformly random map if each k′(j) for j ∈ V is an iid random variable with Pr[k′(j) = 1] = 1/ℓ. Now, let D′_1 be the distribution on (k′, F) defined by sampling a uniformly random map k′ and then a uniformly random size-m set F from (k′)^{-1}(1). Let D′_2 be the distribution on (k′, F) formed by sampling a uniformly random set F of m vertices and then a uniformly random map k′ such that k′(F) = 1. By the decomposition of the sampling procedures for D_1 and D_2 stated above, if we show that D′_1 and D′_2 are statistically indistinguishable, then D_1 and D_2 are at least as indistinguishable.
We first remark that these distributions are not identical: the expected Hamming weight of a vector k′ from D′_1 is N/ℓ, whereas from D′_2 it is m + (N − m)/ℓ. To show that they are nonetheless statistically close, we use simple Chernoff bounds and Pinsker's inequality. Let E be the event that the sampled vector k′ has Hamming weight either < (1 − γ)z or > (1 + γ)z. Let a_1 and a_2 be the probabilities of the event E under D′_1 and D′_2, respectively. Since m ≪ z, a simple Chernoff bound shows that both a_1 and a_2 are small. Let D″_1 and D″_2 be the respective distributions conditioned on the event ¬E; the statistical distances to the unconditioned distributions are correspondingly small. We now bound the statistical distance between D″_1 and D″_2 using Pinsker's inequality and a bound on the KL divergence, by calculating the probability of outputting (k′, F) under both distributions. For D″_1, the (1 − a_1)^{-1} factor is due to the conditioning on event ¬E; the next two factors give the probability of sampling k′, as each index is set to 1 with iid probability 1/ℓ; the last is the probability of selecting the specific set F given k′. For D″_2, likewise, the (1 − a_2)^{-1} factor is due to the conditioning on event ¬E; the next factor is the probability of sampling F; and the last two yield the probability of sampling the rest of k′. In the sub-calculation of the KL divergence, we use ln(1/(1 − x)) ≤ 2x twice; eq. (99d) follows from |k′| ≥ (1 − γ)z, since we condition on the event ¬E, and eq. (99f) follows from m ≪ γz. To pass from eq. (99g) to eq. (99h), we use the fact that 2a_1 ≪ mγ ≤ N^{-9/100}. Therefore, the total KL divergence between D″_1 and D″_2 is bounded as required.
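The two sampling orders D′_1 and D′_2 compared in this proof can be written out directly; this toy sketch (the parameter values and the retry loop in the first sampler are our own simplifications) makes the swap of sampling order explicit.

```python
import random

def sample_D1(N, ell, m):
    """D'_1: sample k' iid Bernoulli(1/ell) per vertex, then a uniform
    m-subset F of (k')^{-1}(1). Retries if the support is too small,
    a simplification of the conditioning in the lemma."""
    while True:
        k = [1 if random.random() < 1 / ell else 0 for _ in range(N)]
        support = [v for v in range(N) if k[v] == 1]
        if len(support) >= m:
            return k, set(random.sample(support, m))

def sample_D2(N, ell, m):
    """D'_2: first fix a uniform m-subset F, then sample k' iid
    Bernoulli(1/ell) on the remaining vertices, forcing k'(F) = 1."""
    F = set(random.sample(range(N), m))
    k = [1 if (v in F or random.random() < 1 / ell) else 0 for v in range(N)]
    return k, F

for sampler in (sample_D1, sample_D2):
    k, F = sampler(N=1000, ell=10, m=3)
    assert len(F) == 3 and all(k[v] == 1 for v in F)
```

Both samplers output a pair (k′, F) with F ⊆ (k′)^{-1}(1); the lemma's content is that, in the stated parameter regime, the two output distributions are statistically close despite the reversed sampling order.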

B Omitted proofs for the adversary method
Every oracle query algorithm using Hilbert space H_A can be adapted into a "fictitious" oracle query algorithm using Hilbert space H_A ⊗ H_I, where an oracle gate |i⟩ → (−1)^{a_i} |i⟩ is turned into a gate |i⟩ |a, b⟩ → (−1)^{a_i} |i⟩ |a, b⟩ with the second register being H_I. This allows us to analyze how the algorithm would behave on a superposition over oracles.
Assume an algorithm A makes T queries to the oracle. For 1 ≤ t ≤ T, let ρ_t = tr_{H_A}(|ψ_t⟩⟨ψ_t|) be the reduced density matrix (of the corresponding pure state |ψ_t⟩) of the computation on the register H_I immediately before the t-th oracle gate. We assume the algorithm evolves from an initial starting state |0⟩_{H_A} ⊗ |χ_0⟩_{H_I} to immediately before the application of the first oracle query. The convenient start, as suggested by Ambainis, is to consider the algorithm run with |χ_0⟩ the balanced superposition over X and Y. In this case, for x ∈ X and y ∈ Y, the relevant entries of ρ_t can be computed directly. Like Ambainis, define the progress measure S_t. Since x ∈ X_good and y ∈ Y_good, we have ϵ_0 ≤ ϵ and ϵ_1 ≤ ϵ. Then eq. (110c) follows from the Cauchy-Schwarz inequality, and combining eq. (108d) and eq. (110e) gives eq. (105).
We will next show that the difference S_t − S_{t+1} is small. Here i ∈ [N] is the index of the input variable being queried, r is the answer bit for the query, z denotes the bits not involved in the query or answer, and x is the oracle on N + M bits being queried. Immediately after querying the oracle, the state of the algorithm is obtained precisely by applying the phase (−1)^{x_i} to each branch.

Figure 1: List of known oracle separations.

Figure 2: Cartoon of the interaction between Prover and Verifier for a distribution over classical boolean functions.

3. For every x = (a, b) ∈ X and i ∈ [N], let ℓ_{x,i} be the number of y = (c, d) ∈ Y such that (x, y) ∈ R and a_i ≠ c_i. Likewise, for every y = (c, d) ∈ Y and i ∈ [N], let ℓ_{y,i} be the number of x = (a, b) ∈ X such that (x, y) ∈ R and a_i ≠ c_i. Let ℓ_max be the maximum of the product ℓ_{x,i} ℓ_{y,i} over (x, y) ∈ R and i ∈ [N] such that a_i ≠ c_i. Then any quantum algorithm A which only queries the first N bits of the oracle and computes f such that

Proof. By Lemma 10, with all but O(N^{-3}) probability, a graph G drawn from the distribution P_{M,ℓ} consists of ℓ connected components, each of size in [(1 − γ)z, (1 + γ)z]. Then every connected component of G is contained in some set S of size (1 + γ)z. By the symmetry of the distribution P_{M,ℓ} under permutations, it follows that the distribution formed by sampling a subset S of size (1 + γ)z and then sampling a graph from B_S is O(N^{-3})-close to P_{M,ℓ}. Therefore, with all but O(N^{-3}) probability, for any S, the distribution B_S will consist of ℓ connected components, each of size in [(1 − γ)z, (1 + γ)z]. By the previous corollary, the algorithm with witness |S⟩ accepts with probability ≥ 1 − 3√γ. A union bound completes the proof.

We define the distribution G^(k)_❀ by the following procedure: 1. Sample a graph G from the restriction of the distribution P_{M,ℓ} to graphs with a connected component supported on exactly [k].

Similarly define G^(k)_F. Let us again abuse notation and use G^(k)_❀ and G^(k)_F to denote the supports of the corresponding distributions. For each k, the lower bound from Corollary 23 is shown via an adversary bound with a relation R_k and parameters m, m′, ℓ_max; moreover, these parameters are the same for all k. Thus, we may construct a relation R between ∪_k G^(k)_❀ and ∪_k G^(k)_F by simply taking the union R = ∪_k R_k. This relation maintains the same parameters m, m′, ℓ_max due to the disjointness of the supports for different k. Lastly, notice that ∪_k G^(k)

We denote probabilities over the algorithm A_2 by Pr_{A_2}[•], and probabilities over the distribution of (G, F) obtained by first sampling F ⊆ V and then sampling G ← P_{M,ℓ}(F) by Pr_{F then G}[•]. The notation Pr_unif[•] denotes the distribution over F obtained by first picking a uniformly random vertex v in G, and then picking F to be a uniformly random subset, of size |F_0|, of the connected component of G containing v.
The probability Pr_unif[•] that this sequence was obtained by iid random sampling and the probability Pr_walk[•] that it was obtained by expander walk sampling differ by only a bounded factor.
For any algorithm A achieving eq. (33), let X_good and Y_good be the sets of x and y, respectively, such that Pr_A[A_x = 0] ≥ 1 − ϵ and Pr_A[A_y = 1] ≥ 1 − ϵ. By Markov's inequality, |X_good| ≥ (1 − ϵ)|X| and |Y_good| ≥ (1 − ϵ)|Y|. For any (x, y) ∈ X_good × Y_good, we will show that

(ρ_T)_{xy} ≤ √(ϵ(1 − ϵ)) / √(|X||Y|).    (105)

To show eq. (105), let |ψ_x⟩ and |ψ_y⟩ be the final states of A when run with inputs |x⟩ and |y⟩ in register H_I, respectively. Then the final state of the algorithm after T queries will be

|ψ_final⟩ = (1/√(2|X|)) Σ_{x∈X} |ψ_x⟩ |x⟩ + (1/√(2|Y|)) Σ_{y∈Y} |ψ_y⟩ |y⟩.    (106)

Take a basis of H_A of the form |z⟩ |v⟩, where |z⟩ corresponds to the answer bit and |v⟩ is a basis for the remainder of the work register. In this basis, let

|ψ_x⟩ = Σ_{z,v} α_{z,v} |z⟩ |v⟩,  |ψ_y⟩ = Σ_{z,v} β_{z,v} |z⟩ |v⟩.    (107)

Since the algorithm A cannot affect the amplitudes of the oracle string in H_A, we have

(ρ_T)_{xy} = ⟨y| ρ_T |x⟩ = Σ_{z,v} ⟨z, v, y|ψ_final⟩ ⟨ψ_final|z, v, x⟩.    (108a)
with core F and an f-query deterministic quantum algorithm A that accepts each restricted distribution B_S for S ∈ ❀ with probability ≥ 0.99 and accepts the restricted P_{M,1} with probability ≤ 0.01. Since the restrictions are O(N^{-3})-close to the unrestricted distributions, A also accepts each distribution B_S for S ∈ ❀ with probability ≥ 0.99 − O(N^{-3}) and accepts P_{M,1} with probability ≤ 0.01 + O(N^{-3}). Then, with the assumed number of queries, we can apply Lemma 19 with δ = 1/10 to argue that A must accept the distribution H_{Ω_F} with probability ≥ 0.09 − O(N^{-3}). Next, by Lemma 24, A must accept the distribution P_{M,ℓ}(F) with probability ≥ 0.09 − 2 · O(N^{-3}) ≥ 0.08. We conclude by applying Corollary 27: A must accept the distribution P_{M,1} with probability > 0.02, a contradiction.