Concentration bounds for quantum states and limitations on the QAOA from polynomial approximations

We prove concentration bounds for the following classes of quantum states: (i) output states of shallow quantum circuits, answering an open question from [DPMRF22]; (ii) injective matrix product states; (iii) output states of dense Hamiltonian evolution, i.e. states of the form $e^{\iota H^{(p)}} \cdots e^{\iota H^{(1)}} |\psi_0\rangle$ for any $n$-qubit product state $|\psi_0\rangle$, where each $H^{(i)}$ can be any local commuting Hamiltonian satisfying a norm constraint, including dense Hamiltonians with interactions between any qubits. Our proofs use polynomial approximations to show that these states are close to local operators. This implies that the distribution of the Hamming weight of a computational basis measurement (and of other related observables) concentrates. An example of (iii) are the states produced by the quantum approximate optimisation algorithm (QAOA). Using our concentration results for these states, we show that for a random spin model, the QAOA can only succeed with negligible probability even at super-constant level $p = o(\log \log n)$, assuming a strengthened version of the so-called overlap gap property. This gives the first limitations on the QAOA on dense instances at super-constant level, improving upon the recent result [BGMZ22].


Main results
We extend the polynomial-based method in two ways: we show that in cases where the moment method produces Gaussian concentration, the polynomial-based method can do so, too; and we show that the polynomial-based method can be applied to a much wider class of states, including the output states of shallow quantum circuits, injective matrix product states, and the output states of dense Hamiltonian evolutions (explained below). An example of a dense Hamiltonian evolution is the QAOA for solving classical constraint optimisation problems (COPs). We therefore obtain concentration bounds for the QAOA, and combining this with the so-called overlap gap property first introduced in the classical literature [GL18,GJ21], we prove strong limitations on the performance of the QAOA even at (admittedly only slightly) super-constant level p = o(log log n). Crucially, our method works for dense COPs, which may have constraints between any variables, and our proofs are fairly straightforward. This improves upon the recent work [BGMZ22], which proved similar results for constant-level QAOA on dense instances using a highly technical proof.
We now describe our main results in more detail. In all cases, concentration bounds are obtained by approximating the quantum state of interest by a local operator, so in sketching the proof ideas, we only focus on the construction of such a local approximation. Below, we only give concentration bounds for the Hamming weight distribution $W_\rho$ of an $n$-qubit quantum state $\rho$, i.e. the probability distribution over $\{0, \dots, n\}$ describing the Hamming weight of a computational basis measurement of $\rho$. More formally, $\Pr[W_\rho = i] = \sum_{x \in \{0,1\}^n : |x| = i} \langle x|\rho|x\rangle$. However, it is straightforward to extend these bounds to other observables that only change slowly as the Hamming weight changes; for an example, see Corollary 4.8.
Table 1: Main results and comparison with prior work. For each case, we consider a state on $n$ qubits and bound the probability that the Hamming weight of a computational basis measurement deviates by more than $k \in [0, n]$ from its median (or mean). Note that [BGMZ22] were able to show concentration over both the choice of instance and the randomness of the QAOA output, whereas our bounds, while stronger, only deal with concentration over the latter.
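As a concrete illustration of the Hamming weight distribution $W_\rho$ defined above, the following minimal sketch computes it from a density matrix by summing the diagonal entries $\langle x|\rho|x\rangle$ over strings of equal weight. The function name `hamming_weight_distribution` and the choice of the test state $|+\rangle^{\otimes n}$ are illustrative assumptions, not part of the paper.

```python
import numpy as np
from math import comb

def hamming_weight_distribution(rho):
    """Return [Pr[W_rho = i] for i = 0..n] for an n-qubit density matrix rho."""
    dim = rho.shape[0]
    n = dim.bit_length() - 1            # dim = 2**n
    probs = np.real(np.diag(rho))       # <x|rho|x> for every bitstring x
    dist = np.zeros(n + 1)
    for x, p in enumerate(probs):
        dist[bin(x).count("1")] += p    # |x| = number of ones in x
    return dist

n = 6
plus = np.full(2**n, 2 ** (-n / 2))     # amplitudes of |+>^{(x) n}
rho = np.outer(plus, plus)
dist = hamming_weight_distribution(rho)

# For a product of uniform bits, W_rho is Binomial(n, 1/2).
binomial = np.array([comb(n, i) / 2**n for i in range(n + 1)])
```

For this product state the distribution is binomial, which is the classical case in which the Chernoff-Hoeffding bound applies.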

Shallow quantum circuits.
Consider a depth-$t$ quantum circuit, i.e. a circuit comprised of $t$ layers of arbitrary 2-qubit gates applied to the initial state $|0\rangle^{\otimes n}$. We denote the unitary implemented by this circuit by $U$. The output state of this circuit is the unique maximum-energy eigenstate of a $2^t$-local Hamiltonian that is a sum of commuting projectors. To see this, observe that $U|0\rangle^{\otimes n}$ is the unique joint $(+1)$-eigenstate of the operators $H_i = U|0\rangle\langle 0|_i U^\dagger$, where $|0\rangle\langle 0|_i$ acts as identity on all qubits except $i$. By a standard lightcone argument, $H_i$ only acts non-trivially on at most $2^t$ qubits, so $H = \frac{1}{n}\sum_i H_i$ is a $2^t$-local Hamiltonian with $U|0\rangle^{\otimes n}$ as its unique $(+1)$-eigenstate. Therefore, the output of the circuit can be written as $U|0\rangle\langle 0|^{\otimes n}U^\dagger = \delta_1(H)$, where $\delta_1(1) = 1$ and $\delta_1(x) = 0$ for $x \neq 1$. We can now approximate $\delta_1(x)$ using a degree-$d$ polynomial $P_d$ constructed in [KLS96,BCDZ99,AAG22]. Polynomials spread the locality of operators in a controllable way. We can therefore show that $P_d(H)$ is a $(d \cdot 2^t)$-local operator that approximates $U|0\rangle\langle 0|^{\otimes n}U^\dagger$. Hence, we have constructed a local operator approximation to $U|0\rangle\langle 0|^{\otimes n}U^\dagger$ and can use this approximation to show that for any depth-$t$ circuit, the Hamming weight distribution of the output state $|\psi\rangle\langle\psi| = U|0\rangle\langle 0|^{\otimes n}U^\dagger$ obeys a Gaussian concentration bound around its median for deviations $k \in (2^t\sqrt{n}, 2^t n)$ (see Corollary 3.2 for the formal statement). This generalises the Chernoff-Hoeffding bound for product distributions (which corresponds to the case of a single layer ($t = 1$), since then the measurement distribution of the output state is a product of distributions on one or two bits) and shows Gaussian concentration for any constant-depth quantum circuit, answering an open question from [DPMRF23]. Similar statements have also appeared in [AN22,AB22,ABN22]. Note that the same argument applies to any ground state of a Hamiltonian that is a sum of commuting projectors, not just output states of shallow circuits.
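The construction of the commuting projectors $H_i = U|0\rangle\langle 0|_i U^\dagger$ can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: it uses $n = 4$ qubits, a single layer ($t = 1$) of Haar-like random gates on the pairs $(0,1)$ and $(2,3)$, and helper names (`random_unitary`, `projector_on_zero`) that are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(dim):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def projector_on_zero(i, n):
    """|0><0| on qubit i, identity elsewhere (qubit 0 is the leftmost factor)."""
    p0 = np.array([[1, 0], [0, 0]], dtype=complex)
    op = np.array([[1]], dtype=complex)
    for j in range(n):
        op = np.kron(op, p0 if j == i else np.eye(2))
    return op

n = 4
# One layer (t = 1): independent 2-qubit gates on the pairs (0,1) and (2,3).
U = np.kron(random_unitary(4), random_unitary(4))
terms = [U @ projector_on_zero(i, n) @ U.conj().T for i in range(n)]
H = sum(terms) / n

zero = np.zeros(2**n, dtype=complex)
zero[0] = 1                      # |0000>
psi = U @ zero                   # circuit output state
eigvals = np.linalg.eigvalsh(H)  # ascending eigenvalues of H
```

In this instance `H @ psi` equals `psi`, the terms commute pairwise, and the largest eigenvalue 1 is separated from the rest of the spectrum by $1/n$.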
Injective matrix product states. Matrix product states (MPSs) are a widely used tensor network representation of quantum states. Injective MPSs have an additional property that ensures that they are the unique ground state of a local "parent Hamiltonian" with a constant spectral gap. We can therefore approximate an injective MPS as a polynomial of its parent Hamiltonian. Using near-optimal polynomial approximations constructed in [AAG22], we obtain Gaussian concentration bounds for injective MPSs (Lemma 3.3). Our bounds are stronger than previous ones [Ans16,KAAV17], which only showed exponential concentration. We also note that conditionally independent probability distributions can be encoded into injective MPSs, and that in that case our concentration bounds reproduce a (version of) Azuma's inequality.
Dense Hamiltonian evolution. Concentration bounds are natural for quantum states that have weak long-range correlations such as the output states of shallow quantum circuits. A priori, one would not expect similar bounds to hold for quantum states with long-range correlations. Recently, [BGMZ22] considered the output distribution of the QAOA (explained below) on random dense COPs, i.e. local COPs that can have constraints between any variables. For dense COPs, the operations implemented by the QAOA can include interactions between any qubits, and as a result the output distribution can have long-range correlations. Remarkably, [BGMZ22] showed that the variance of the average energy density (averaged over the randomness of the QAOA as well as the choice of random instance) vanishes asymptotically. This means that the energy density of the output, which corresponds to the quality of the COP solution produced by the QAOA, concentrates about the average. However, [BGMZ22] were only able to prove an asymptotic statement without explicit tail bounds and the proof was highly non-trivial. We consider a more general class of states that includes the output states of the QAOA as a special case. Specifically, we define the output of a dense Hamiltonian evolution as a state of the form $e^{\iota H^{(p)}} \cdots e^{\iota H^{(1)}} |\psi_0\rangle$ for any $n$-qubit product state $|\psi_0\rangle$. Here, each $H^{(i)}$ can be any commuting local Hamiltonian (though the different $H^{(i)}$ themselves are of course not required to commute). Importantly, the $H^{(i)}$ are allowed to be dense Hamiltonians, i.e. Hamiltonians with interactions between any qubits. As explained below, the QAOA applied to a dense COP is a special case of dense Hamiltonian evolution.
For our concentration bounds, we further require each $H^{(i)}$ to satisfy a norm constraint, which limits the norm of the Hamiltonian restricted to a subset of the qubits (see Equation (4.1) for the formal statement). In particular, this condition is satisfied with overwhelming probability for the random dense model from [BGMZ22]. Under this condition, we can prove (Theorem 4.1) that the output state of a dense Hamiltonian evolution is $\epsilon$-close in operator norm to a $k_p$-local operator, with explicit expressions for $k_p$ and $\epsilon$ given in Theorem 4.1. These expressions depend on constants $c_1$ and $0 \leq \alpha < 1$ and on the level $p$ of the dense evolution, i.e. the number of unitaries $e^{\iota H^{(i)}}$ that have been applied. In particular, if we choose $p = o(\log \log n)$, then $k_p = o(n)$. This implies an exponential concentration result (Corollary 4.7) for the Hamming weight distribution of $\rho_p$, the output of a dense Hamiltonian evolution with level $p = o(\log \log n)$ satisfying the above conditions. Here, we have only described the asymptotic regime, but in Corollary 4.7 we give explicit bounds for any choice of $p$. This concentration result can also be extended beyond just the Hamming weight of a computational basis measurement: for example, it also holds for the energy density of $\rho_p$ with respect to any classical Hamiltonian satisfying a similar norm constraint to the one mentioned above (Corollary 4.8).
To prove that the output of a dense Hamiltonian evolution can be approximated by a local operator, we again make use of polynomial approximations. Recall that we are interested in states of the form $\rho_p = e^{\iota H^{(p)}} \cdots e^{\iota H^{(1)}} |\psi_0\rangle\langle\psi_0| e^{-\iota H^{(1)}} \cdots e^{-\iota H^{(p)}}$ for a pure product state $|\psi_0\rangle$. As a first step, we approximate $|\psi_0\rangle\langle\psi_0|$ by a local operator (Lemma 4.5). For this, we observe that since $|\psi_0\rangle = \bigotimes_i |\psi_0\rangle_i$ is a product state, it is the unique ground state of the 1-local Hamiltonian $H = \frac{1}{n}\sum_i |\psi_0\rangle\langle\psi_0|_i$. If we apply a linear combination of Chebyshev polynomials to this Hamiltonian, we obtain a good local approximation to $|\psi_0\rangle\langle\psi_0|$. Then, for each unitary $e^{\iota H^{(i)}}$ in the dense Hamiltonian evolution, we approximate the exponential function by its truncated Taylor series. The fact that the Hamiltonian evolution is applied to an approximately local operator allows us to use the norm constraint mentioned above to obtain an improved error bound for the truncated Taylor series (see Lemma A.2 for details). Therefore, the truncated Taylor series spreads the locality of the state in a controllable way without degrading the quality of the approximation too much. Applying this argument recursively for each layer of the dense Hamiltonian evolution, we obtain a local approximation to the output state $\rho_p$.
Limitations on the QAOA from concentration bounds. The QAOA [FGG14] is an algorithm for solving local COPs (i.e. COPs consisting of any number of clauses, each with at most $q = O(1)$ variables) on a quantum computer. We can associate a $q$-local Hamiltonian $H$ with every $q$-local COP $C$ by replacing the variables in $C$ with Pauli-$Z$ matrices acting on different qubits. The resulting Hamiltonian is diagonal in the computational basis and has the property that for any string $x \in \{0, 1\}^n$, $C(x) = \langle x|H|x\rangle$. The QAOA attempts to find a "good" solution $x$ (i.e. one for which $C(x)$ is as large as possible) by starting from the state $|+\rangle^{\otimes n}$ and then applying $p$ layers of unitaries of the form $e^{\iota \beta_i \sigma_X^{\otimes n}} e^{\iota \gamma_i H}$. Here, $\beta_i$ and $\gamma_i$ are real parameters that can be tuned to the problem instance. It is clear that this is a special case of the dense Hamiltonian evolution we have described earlier. Our results will apply for any choice of $\beta_i$ and $\gamma_i$ and we will always implicitly consider a family of COPs, one for each number $n$ of input bits, in order to make asymptotic statements.
[BGMZ22] considered the performance of the QAOA on a random spin model on $n$ qubits, described by a $q$-local Hamiltonian (Equation (1.1)) whose couplings $J_{i_1, \dots, i_q} \sim \mathcal{N}(0, 1)$ are sampled as i.i.d. standard Gaussians. For this model, [BGMZ22] were able to show that for constant even $q \geq 4$ and level $p = O(1)$, the value achieved by the QAOA (for fixed $\beta_i, \gamma_i$) in expectation over $J$ and the internal randomness of the QAOA is bounded away from the optimal value by a constant as $n \to \infty$. They were also able to show the asymptotic concentration property described above.
Here, we use our concentration results to show limitations on the QAOA for a class of COPs that includes the random spin model above. For this, we consider local COPs that have the so-called overlap gap property (OGP) [Gam21], which roughly says that "good" solutions to the COP are clustered in the sense that two good solutions are either close or far in Hamming distance. Combining this with our concentration results for dense Hamiltonian evolution, we can show that for any COP with a sufficiently strong OGP whose associated Hamiltonian satisfies Equation (4.1), if the QAOA produced a good solution with noticeable probability, then the probability distribution over good solutions produced by the QAOA would have to be concentrated on one such cluster. This allows us to show that the QAOA cannot succeed with noticeable probability on symmetric COPs (i.e. COPs that are invariant under flipping all the input bits) that have a strong OGP. This is because the symmetry of the COP is in contradiction with the existence of a single cluster on which most of the probability distribution is concentrated: if such a cluster existed, we could take the strings in that cluster and flip all their bits to produce another cluster which, by symmetry, must have the same probability weight, a contradiction. This argument is similar to [BKKT]. As a result, we obtain the following limitation on the QAOA (see Lemma 5.5).
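The symmetry step of this argument can be illustrated with a small statevector simulation. The sketch below is not the paper's construction: it assumes a toy $q = 2$ spin model with Gaussian couplings on $n = 4$ qubits (chosen only because even $q$ makes the cost invariant under flipping all bits), it reads the mixing unitary as $e^{\iota \beta \sigma_X}$ applied to every qubit, and the angles are arbitrary. It checks that the output distribution assigns equal probability to each string and its bitwise complement.

```python
import numpy as np

n, p = 4, 2
rng = np.random.default_rng(1)
dim = 2**n

# z[x, j] = eigenvalue (+1 or -1) of Pauli-Z on qubit j for basis string x.
bits = (np.arange(dim)[:, None] >> np.arange(n)[None, :]) & 1
z = 1 - 2 * bits

# Diagonal of a 2-local Hamiltonian with i.i.d. Gaussian couplings.
# Each term is a product of two Z's, so cost is unchanged when all bits flip.
cost = np.zeros(dim)
for i in range(n):
    for j in range(i + 1, n):
        cost += rng.normal() * z[:, i] * z[:, j]

def apply_mixer(state, beta):
    """Apply exp(i*beta*X) = cos(beta) I + i sin(beta) X to every qubit."""
    c, s = np.cos(beta), 1j * np.sin(beta)
    psi = state.reshape([2] * n)
    for q in range(n):
        psi = c * psi + s * np.flip(psi, axis=q)   # X on a qubit flips its axis
    return psi.reshape(dim)

state = np.full(dim, dim ** -0.5, dtype=complex)   # |+>^{(x) n}
for gamma, beta in zip(rng.uniform(0, 1, p), rng.uniform(0, 1, p)):
    state = np.exp(1j * gamma * cost) * state      # phase layer e^{i gamma H}
    state = apply_mixer(state, beta)               # mixing layer

probs = np.abs(state) ** 2
flipped = probs[::-1]   # index dim-1-x is the bitwise complement of x
```

Because `probs` and `flipped` agree, any set of strings carrying most of the probability weight has a bit-flipped partner carrying the same weight, which is the tension with a single dominant cluster that the argument exploits.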
Theorem (informal). Consider a local symmetric COP $C(x)$ with a sufficiently strong OGP and suppose that the associated Hamiltonian $H$ satisfies the norm constraint in Equation (4.1). Then, the value of the solution to $C(x)$ produced by the QAOA with level $p = o(\log \log n)$ is bounded away from the optimal value by at least a constant except with probability $e^{-\Omega(n^{1/8})}$.
For even q, the random spin model from Equation (1.1) is symmetric and satisfies the norm constraint in Equation (4.1) with overwhelming probability. Furthermore, it was shown in [CGPR19] that it satisfies the OGP with overwhelming probability. However, we note that here we need a stronger version of the OGP than was shown in [CGPR19]. This stronger version appears to be implicit in their proof, too, although we leave its formal proof for future work and assume it here as a conjecture. Assuming this stronger OGP, we can show the following (see Section 5.3). This places strong limitations on the performance of the QAOA because it does not just bound the expectation value away from the optimal value as in [BGMZ22], but instead asserts that the QAOA output is bounded away from the optimal value with overwhelming probability, even at super-constant level p = o(log log n).

Discussion and open questions
We have shown that polynomial approximations can be used to derive Gaussian concentration bounds for the output states of constant-depth quantum circuits and injective matrix product states, and exponential concentration bounds for the output states of dense Hamiltonian evolution. The latter can be used to derive strong limitations on the performance of the QAOA at super-constant level $p = o(\log \log n)$ even on dense instances such as random spin models.
At first sight, it is surprising that the (provably optimal) polynomial approximations [KLS96,BCDZ99] we use for shallow quantum circuits are able to reproduce the (likewise provably optimal) Chernoff-Hoeffding bound in the classical case. It would be interesting to explore whether there is a deeper conceptual connection between optimal polynomial approximations and optimal concentration bounds.
On a more technical level, there are a number of interesting improvements one could hope to make to our bounds. Firstly, our bounds for MPSs (Lemma 3.3) can only deal with sub-linear deviations $k = O(n^{1-\delta})$ for any $\delta > 0$. It would be desirable to extend this result to arbitrary values of $k$. Additionally, one could hope to prove similar concentration bounds for PEPSs, the two-dimensional analogue of MPSs.
Secondly, we only achieve exponential, not Gaussian, concentration bounds for dense Hamiltonian evolutions with level $p = o(\log \log n)$. Can one improve these results to Gaussian concentration and also extend them to higher levels, e.g. $p = O(\log n)$ or even $p = O(n^{\delta})$ for a small $\delta > 0$? Furthermore, we show concentration for the output states of dense Hamiltonian evolution for a fixed instance, but we cannot show that for random COPs, the output states also have concentration properties over the choice of random instance, e.g. over the choice of $J_{i_1 \dots i_q} \sim \mathcal{N}(0, 1)$ in the case of the random spin model introduced earlier.
[BGMZ22] do show such a concentration property, albeit only in the asymptotic regime without explicit bounds. Can our polynomial approximation techniques also be used to prove explicit concentration bounds over the choice of random instance? If so, it might be possible to extend the limitations on the performance of dense evolutions for COPs we prove in Section 5.2 beyond symmetric COPs and optimisers.
Finally, our techniques may also be useful for problems in condensed matter physics. As an example, consider the Lieb-Schultz-Mattis theorem [LSM61] and its higher-dimensional generalisation [Has04], seminal results in condensed matter physics. Their main idea is that sufficient symmetry and non-degeneracy of the ground space prevents a Hamiltonian from being gapped. Inspired by our application to symmetric QAOA (Section 5.3), we can ask whether an alternative proof of this result can be obtained using concentration bounds and polynomial approximations, e.g. by showing that the concentration properties of unique gapped ground states are in conflict with the symmetry requirements.

Local operators
Definition 2.2 (Local operators). Let $k \in \mathbb{N}$ and $\epsilon > 0$. An operator $R \in L(\mathcal{H})$ is called $k$-local if it can be written as $R = \sum_i R_i$ for operators $R_i$ that only act non-trivially on at most $k$ subsystems. Whenever we write $R = \sum_i R_i$ for a $k$-local operator $R$, this is understood to be such a local decomposition. An operator $Q \in L(\mathcal{H})$ is called $(k, \epsilon)$-local if there exists a $k$-local operator $R$ such that $\|Q - R\| \leq \epsilon$.
We will frequently consider local operators with additional properties and extend the above definition in the obvious way: for example, a $(k, \epsilon)$-local state is a quantum state that is $\epsilon$-close in operator norm to a $k$-local operator $R$. Note that the operator $R$ does not need to be a quantum state itself.

Definition 2.3 (Total local norm). A k-local operator
Note that in the definition of $\mathrm{tln}(R)$ and $\mathrm{tln}_\epsilon(Q)$, it always has to be clear from the context which locality $k$ we are considering, since for different choices of $k$, different values of the total local norm can be achieved. Therefore we will always make statements of the form: $R$ is a $k$-local operator with $\mathrm{tln}(R) = \dots$ (and likewise for the approximate case). Hence, strictly speaking the subscript $\epsilon$ in $\mathrm{tln}_\epsilon(Q)$ is unnecessary as it must anyway be specified that $Q$ is a $(k, \epsilon)$-local operator for some values of $k$ and $\epsilon$, but we find it useful to include the subscript as a reminder of this nonetheless.
Example 2.4. The locality of an operator increases in a controllable way when we apply a polynomial. Specifically, consider a $k$-local operator $R = \sum_i R_i$ and a degree-$d$ polynomial $P$. Then, $P(R)$ is a $(d \cdot k)$-local operator because we can expand it into a sum of terms, each of which contains a product of at most $d$ different local terms $R_i$.
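The claim in Example 2.4 can be verified on a small instance by expanding operators in the Pauli basis and reading off the largest number of non-identity factors. The following sketch is illustrative only: the helper names `pauli_string` and `locality`, the random 1-local operator on $n = 4$ qubits and the polynomial $P(x) = x^2 - 3x$ are assumptions made for the example.

```python
import numpy as np
from itertools import product

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of single-qubit Paulis, e.g. label = 'XIZI'."""
    op = np.array([[1]], dtype=complex)
    for ch in label:
        op = np.kron(op, PAULIS[ch])
    return op

def locality(op, n, tol=1e-10):
    """Largest number of non-identity factors among Pauli strings appearing in op."""
    k = 0
    for label in product("IXYZ", repeat=n):
        coeff = np.trace(pauli_string(label).conj().T @ op) / 2**n
        if abs(coeff) > tol:
            k = max(k, n - label.count("I"))
    return k

n = 4
rng = np.random.default_rng(0)
# A 1-local operator R = sum_i R_i, with R_i a random Pauli combination on qubit i.
R = np.zeros((2**n, 2**n), dtype=complex)
for i in range(n):
    for ch in "XYZ":
        label = ["I"] * n
        label[i] = ch
        R += rng.normal() * pauli_string(label)

# P(x) = x^2 - 3x has degree d = 2, so P(R) should be at most (2 * 1)-local.
P_of_R = R @ R - 3 * R
k_R, k_P = locality(R, n), locality(P_of_R, n)
```

Here `k_R` is 1 and `k_P` does not exceed $d \cdot k = 2$, in line with the example.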
Example 2.5. The state $|+\rangle\langle+|^{\otimes n}$ is a non-local operator. However, it can be approximated by a local operator. For this, consider the Hamiltonian $H_0 = \frac{1}{n}\sum_{i=1}^n |+\rangle\langle+|_i$, where $|+\rangle\langle+|_i$ acts as identity everywhere except on the $i$-th qubit. $H_0$ is a 1-local operator, with ground state $|+\rangle\langle+|^{\otimes n}$ and spectral gap $1/n$. It is easy to see that $\|H_0^k - |+\rangle\langle+|^{\otimes n}\| = (1 - 1/n)^k \leq e^{-k/n}$, so the $k$-local operator $H_0^k$ approximates $|+\rangle\langle+|^{\otimes n}$. Expanding $H_0^k = \frac{1}{n^k}\sum_{i_1, \dots, i_k} |+\rangle\langle+|_{i_1} \cdots |+\rangle\langle+|_{i_k}$ into $n^k$ terms each with operator norm at most 1, we also see that $\mathrm{tln}(H_0^k) \leq 1$ for any $k$, and consequently $\mathrm{tln}_\epsilon(|+\rangle\langle+|^{\otimes n}) \leq 1$ for any $\epsilon$.
Definition 2.6 (Subset operator). For any $k$-local operator $R = \sum_i R_i$ and a subset $S \subseteq [n]$, we define the subset operator $R_S$ as
The decomposition of a $k$-local operator $R = \sum_i R_i$ we have considered so far is not unique. It will occasionally be useful to define a canonical such decomposition. This can easily be done as follows: noting that the $n$-qubit Pauli matrices (with identity) form a basis of $L(\mathcal{H})$, for any operator $R \in L(\mathcal{H})$ there is a unique decomposition in terms of the $n$-qubit Pauli matrices. Furthermore, if $R$ is $k$-local, only basis elements that act non-trivially on at most $k$ qubits will appear in this decomposition. Starting from this unique decomposition, we can group basis elements that act non-trivially on the same set of qubits. This way, we obtain a unique decomposition of a $k$-local operator $R$ as $R = \sum_{T \subseteq [n] : |T| \leq k} O^{(T)}$, where $O^{(T)}$ is a (weighted) sum of all Pauli operators that act non-trivially exactly on qubits in the set $T$. The following lemma bounds the operator norm of an individual term in this decomposition.
is the set of all multi-qubit Pauli operators that act as identity on S. By the inclusion-exclusion principle (see e.g. [GGL95, Thm. 12.1]), where the last equality holds because the Pauli twirl is a unital channel.
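Returning to Example 2.5, the quality of the powering approximation can be checked directly. Since $H_0$ has eigenvalues $j/n$ for $j = 0, \dots, n$, with $|+\rangle^{\otimes n}$ the unique eigenvector of eigenvalue 1, the error $\|H_0^k - |+\rangle\langle+|^{\otimes n}\|$ equals $(1 - 1/n)^k$. The sketch below confirms this numerically for $n = 5$; the helper `embed` and the chosen powers are illustrative assumptions. For such a small $n$ the locality of $H_0^k$ is not meaningful once $k \geq n$, so only the error is examined.

```python
import numpy as np

n = 5
dim = 2**n
plus_1q = np.full((2, 2), 0.5)                  # |+><+| on a single qubit

def embed(op, i):
    """Place a single-qubit operator on qubit i, identity elsewhere."""
    out = np.array([[1.0]])
    for j in range(n):
        out = np.kron(out, op if j == i else np.eye(2))
    return out

H0 = sum(embed(plus_1q, i) for i in range(n)) / n
plus_state = np.full((dim, dim), 1 / dim)       # |+><+|^{(x) n}

errors = {}
for k in (1, 5, 20, 50):
    Hk = np.linalg.matrix_power(H0, k)
    errors[k] = np.linalg.norm(Hk - plus_state, ord=2)   # operator norm
```

The error decays only like $e^{-k/n}$ in the degree $k$, which is the suboptimal tradeoff between error and locality that motivates the Chebyshev-based polynomials of the next section.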

Polynomial approximations
Definition 2.8 (Chebyshev polynomials). The Chebyshev polynomials $T_k$ are defined by $T_k(\cos\theta) = \cos(k\theta)$ for integers $k \geq 0$.
We have already seen the utility of the powering function $x \mapsto x^s$ in creating local operator approximations in Example 2.5. However, the powering function does not result in approximations with an optimal tradeoff between approximation error and locality. For this purpose, we require an approximation to the function $x \mapsto x^s$ as a linear combination of Chebyshev polynomials.
Lemma 2.9. Let $T_k$ be the degree-$k$ Chebyshev polynomial and $p_{s,k}$ the probability that an $s$-step symmetric random walk on integers (starting from 0) is at $k$ or $-k$. For any $a \leq s$, we define the degree-$a$ polynomial $P_{s,a}(x) = \sum_{k=0}^{a} p_{s,k} T_k(x)$. Then, $P_{s,a}$ provides a good approximation to $x^s$ in the sense that for any operator $O$ with $\|O\| \leq 1$, $\|O^s - P_{s,a}(O)\| \leq 2e^{-\frac{a^2}{2s}}$.
Proof. As shown in [SV14, Theorem 3.1], the monomial $x^s$ can be viewed as a random walk over the Chebyshev polynomials. More precisely, for any $s > 0$ we have $x^s = \sum_{k=0}^{s} p_{s,k} T_k(x)$ (2.2). By the Chernoff bound, $\sum_{k \geq a} p_{s,k} \leq 2e^{-\frac{a^2}{2s}}$. This means that the contributions from the high-degree Chebyshev polynomials in Equation (2.2) are suppressed, and we can obtain a good approximation to $x^s$ by truncating Equation (2.2) at a degree $a < s$. Specifically, for any operator $O$ with $\|O\| \leq 1$ we can bound $\|O^s - P_{s,a}(O)\| = \left\|\sum_{k=a+1}^{s} p_{s,k} T_k(O)\right\| \leq \sum_{k=a+1}^{s} p_{s,k} \leq 2e^{-\frac{a^2}{2s}}$. Here, the first inequality holds by the triangle inequality and because $|T_k(x)| \leq 1$ for $|x| \leq 1$.
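The random-walk expansion behind Lemma 2.9 is easy to check numerically for scalar arguments $x \in [-1, 1]$. The sketch below is a minimal illustration with hypothetical helper names (`walk_prob`, `truncated_power`) and the arbitrary choice $s = 100$, $a = 30$: it builds the Chebyshev coefficients $p_{s,k}$, confirms that the untruncated sum reproduces $x^s$, and compares the truncation error with $2e^{-a^2/(2s)}$.

```python
import numpy as np
from math import comb, exp
from numpy.polynomial.chebyshev import chebval

def walk_prob(s, k):
    """Probability that an s-step symmetric +-1 walk from 0 ends at k or -k."""
    if (s + k) % 2 or k > s:
        return 0.0
    p = comb(s, (s + k) // 2) / 2**s
    return p if k == 0 else 2 * p

def truncated_power(s, a):
    """Chebyshev coefficients of P_{s,a} = sum_{k <= a} p_{s,k} T_k."""
    return np.array([walk_prob(s, k) for k in range(a + 1)])

s, a = 100, 30
x = np.linspace(-1, 1, 2001)
exact = x**s
full = chebval(x, truncated_power(s, s))     # no truncation: should equal x^s
approx = chebval(x, truncated_power(s, a))   # degree a, much smaller than s
max_err = np.max(np.abs(exact - approx))
bound = 2 * exp(-a**2 / (2 * s))
```

In this instance a degree-30 polynomial already approximates $x^{100}$ within the stated bound on the whole interval, illustrating why a degree of order $\sqrt{s}$ suffices for constant error.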

Concentration bounds from local operator approximations
Our general strategy for proving concentration bounds on quantum states will be to show that these states can be approximated by local operators. From this we can obtain concentration bounds from the following lemma, adapted from [KAAV17]. We state this lemma for the Hamming weight distribution $W_\rho$ of a state $\rho$, but it can easily be generalised to the distribution of any function of bitstrings that varies slowly as the Hamming weight is changed (see Corollary 4.8 for an example).
Proof. We only prove the first bound, the second is analogous. We define $\Pi_{>m+k} = \sum_{x \in \{0,1\}^n : |x| > m+k} |x\rangle\langle x|$ as the projector onto all computational basis states with Hamming weight greater than $m + k$. We define $\Pi_{\leq m}$ analogously. Because strings in the support of $\Pi_{>m+k}$ and $\Pi_{\leq m}$ differ on more than $k$ positions, for any operator $O$ that only acts non-trivially on $k$ qubits we have that $\Pi_{>m+k} O \Pi_{\leq m} = 0$. By linearity we also get $\Pi_{>m+k} R \Pi_{\leq m} = 0$ for any $k$-local operator $R$.
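Because $\psi$ is $(k, \epsilon)$-local, there exists a $k$-local operator $R$ that is $\epsilon$-close to $\psi$. We therefore get that $\|\Pi_{>m+k} \psi \Pi_{\leq m}\| = \|\Pi_{>m+k} (\psi - R) \Pi_{\leq m}\| \leq \|\psi - R\| \leq \epsilon$. By the variational definition of the operator norm and inserting $\psi = |\psi\rangle\langle\psi|$, the left-hand side equals $\|\Pi_{>m+k}|\psi\rangle\| \cdot \|\Pi_{\leq m}|\psi\rangle\|$. Since $m$ is the median of $W_\psi$, the second factor is at least $1/\sqrt{2}$, which yields the first bound.
Proof. Letting $R$ be the $k$-local operator $\epsilon$-close to $\psi$, we can bound The last equality follows by the same reasoning as in the proof of Lemma 2.10. This implies the lemma.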
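The key step in the proof of Lemma 2.10, namely that an operator acting on $k$ qubits cannot connect strings of Hamming weight at most $m$ to strings of weight greater than $m + k$, can be confirmed on a small example. The sketch below assumes $n = 6$, $k = 2$, $m = 1$ and a random operator on the first two qubits; these values and the variable names are illustrative choices.

```python
import numpy as np

n, k, m = 6, 2, 1
dim = 2**n
rng = np.random.default_rng(0)
weights = np.array([bin(x).count("1") for x in range(dim)])

proj_high = np.diag((weights > m + k).astype(float))   # Pi_{> m+k}
proj_low = np.diag((weights <= m).astype(float))       # Pi_{<= m}

# Random operator acting non-trivially only on the first k qubits.
A = rng.normal(size=(2**k, 2**k)) + 1j * rng.normal(size=(2**k, 2**k))
O = np.kron(A, np.eye(2 ** (n - k)))
sandwich = proj_high @ O @ proj_low

# For contrast: X on every qubit flips all n bits and is not k-local.
X_all = np.fliplr(np.eye(dim))
sandwich_nonlocal = proj_high @ X_all @ proj_low
```

The first product vanishes identically, while the global bit flip maps the weight-0 string to the weight-$n$ string and therefore gives a non-zero result.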
Concentration bounds for shallow circuits and matrix product states
[AAG22] showed that the ground states of various local Hamiltonians can be approximated by local operators using polynomials. Combining this with Lemma 2.10, we can easily obtain concentration bounds for such states.
As concrete examples, we prove concentration bounds for the output states of shallow quantum circuits and injective matrix product states (MPSs). We begin by considering the special case of ground states (or, for convenience, maximum-energy eigenstates) of commuting local Hamiltonians $H = \sum_i H_i$, where each $H_i$ is a projector.
[KLS96,BCDZ99] construct multi-variate degree-$d$ polynomials $P_d$ (with $d \in (\sqrt{r}, r)$) such that for any $x_1, \dots, x_r \in \{0, 1\}$: $|P_d(x_1, \dots, x_r) - x_1 \cdots x_r| \leq 2^{-\frac{d^2}{2^8 r}}$. If we insert the $\ell$-local projectors $x_i = H_i$, whose spectrum is $\{0, 1\}$, we get that $P_d(H_1, \dots, H_r)$ is a $(d \cdot \ell)$-local operator that approximates the joint $(+1)$-eigenspace projector $H_1 \cdots H_r$ up to the same error in operator norm.
As an application of this, we consider quantum circuits with arbitrary 2-local gates between any two qubits, arranged into $t$ layers. The number of layers $t$ is called the circuit depth. The key property of shallow circuits is that they spread locality in a controllable way: by a standard lightcone argument, it is easy to see that if $O$ is a $k$-local operator and $U$ is a unitary implemented by a depth-$t$ circuit, then $U O U^\dagger$ is a $(2^t \cdot k)$-local operator. Therefore, the output state $U|0\rangle^{\otimes n}$ of the circuit is the maximum-energy eigenstate of the $2^t$-local Hamiltonian $H = \sum_i U(|0\rangle\langle 0|_i \otimes \mathbb{1}_{\setminus i})U^\dagger$, and the local terms of the Hamiltonian are clearly commuting projectors. Hence, we obtain the following corollary, answering an open question from [DPMRF23].

Corollary 3.2. Let $U$ be a unitary implemented by a depth-$t$ circuit. Then, the output state $|\psi\rangle\langle\psi| = U|0\rangle\langle 0|^{\otimes n}U^\dagger$ is a $\left(k, 2^{-\frac{k^2}{2^{2t+8} \cdot n}}\right)$-local operator for $k \in (2^t\sqrt{n}, 2^t n)$. Furthermore, denoting the median of $W_\psi$ by $m$, the following Gaussian concentration bounds hold:
A more general case considered in [AAG22] is that of a 1D local Hamiltonian $H = \sum_i H_i$, where $0 \leq H_i \leq 1$ and $H$ needs to satisfy the local gap property (see [AAG22] for a definition). For such Hamiltonians, [AAG22, Theorem 3] gives local approximations to the ground state, and we can use this and our Lemma 2.10 to obtain concentration bounds. We do not spell out the full statement and instead consider a useful example, injective matrix product states (MPSs) with constant bond dimension. We refer to [CPGSV21] for an introduction to MPSs. For our purposes, the main property of injective MPSs is that they are the unique ground state of a "parent Hamiltonian" with constant locality and constant spectral gap (see e.g. [CPGSV21, Section IV.C] for details). In fact, the proof of the spectral gap lower bound also implies a constant local gap lower bound [PGVWC07]. From this, the following result is immediate.

Lemma 3.3. Let $|\psi\rangle$ be an injective MPS on a chain with $n$ qubits with a constant bond dimension. Then, for every $\delta \in (0, \frac{1}{4})$, $|\psi\rangle\langle\psi|$ is a $\left(k, e^{-c_1(\delta)\frac{k^2}{n}}\right)$-local operator for $k \leq c_2(\delta) n^{1-\delta}$ for some $c_{1,2}(\delta)$ independent of $n$. Furthermore, denoting by $m$ the median of $W_{|\psi\rangle\langle\psi|}$,
An injective MPS can be used to encode conditionally independent distributions on a line. For this, consider a distribution $P(x_1, \dots, x_n)$ such that $P(x_i | x_1, \dots, x_{i-1}) = P(x_i | x_{i-1})$ and assume that all these conditional probabilities are positive. Note that $P(x_1, \dots, x_n) = P(x_1) P(x_2 | x_1) \cdots P(x_n | x_{n-1})$. Then, it is easy to verify that the state $\sum_{x_1, \dots, x_n} \sqrt{P(x_1, \dots, x_n)}\, |x_1, x_2, \dots, x_n\rangle$ can be written as an injective MPS with constant bond dimension. The output distribution after a computational basis measurement is precisely $P$, which allows us to show Gaussian concentration using Lemma 3.3, reproducing a (version of) Azuma's inequality. In this sense, Lemma 3.3 can be understood as a quantum version of Azuma's inequality.

Concentration bounds for dense Hamiltonian evolution
In the previous section we considered the output states of shallow quantum circuits and showed that they can be approximated by local operators. A depth-$t$ circuit can be written as a product of $t$ unitaries $U_1 U_2 \cdots U_t$, where each $U_i$ is a tensor product of one- or two-qubit gates acting on disjoint sets of qubits. Each $U_i$ can also be written as a Hamiltonian evolution $e^{\iota H^{(i)}}$, where $H^{(i)}$ is a 2-local Hamiltonian and each qubit participates in at most one local term. Therefore, we can rephrase the result from the previous section as saying that for any sequence of $t$ 2-local Hamiltonians $H^{(i)}$ where each qubit participates in at most one local term, the state $e^{\iota H^{(t)}} \cdots e^{\iota H^{(1)}} |0\rangle^{\otimes n}$ can be approximated by a local operator. Since each local term in $H^{(i)}$ acts on different qubits, $H^{(i)}$ is obviously a commuting Hamiltonian.
In this section, we generalise this result to more general families of commuting Hamiltonians $H^{(i)}$. In particular, we drop the requirement that each qubit can only participate in at most one local term of $H^{(i)}$ and instead allow local commuting Hamiltonians with dense interaction graphs, i.e. any qubit is allowed to participate in an arbitrary number of terms. Evolution under such dense Hamiltonians cannot be implemented by a shallow circuit. Nonetheless, we show that as long as such Hamiltonians satisfy a norm constraint explained in Equation (4.1), the output state $e^{\iota H^{(t)}} \cdots e^{\iota H^{(1)}} |0\rangle^{\otimes n}$ can still be approximated by a local operator just like for shallow circuits, and as a consequence also obeys concentration bounds. This is of particular relevance as quantum optimisation algorithms such as the QAOA apply an evolution of this form when applied to dense constrained optimisation problems (COPs). Therefore, our concentration bounds apply to the output of the QAOA for dense COPs, which previously required a highly technical analysis that only yielded an asymptotic statement without explicit bounds [BGMZ22]. We can also use these concentration bounds to prove limitations on the performance of the QAOA (and dense evolutions more generally) at solving COPs; see Section 5 for details.
More formally, let $H^{(1)}, H^{(2)}, \dots, H^{(p)}$ be a collection of $\ell$-local commuting Hamiltonians (i.e. the local terms in each Hamiltonian commute, but the different $H^{(i)}$ need not commute), where each qubit is allowed to participate in arbitrarily many local terms of each Hamiltonian $H^{(i)}$. Define $U_i = e^{-\iota H^{(i)}}$. Because each Hamiltonian $H^{(i)}$ may have a dense interaction graph, we call $U_1 \cdots U_p$ a dense Hamiltonian evolution with level $p$. To prove concentration bounds, we will require that there exist constants $\alpha \in [0, 1)$ and $C > 1$ such that for every $i$ and every subset $S \subseteq [n]$ of qubits, the subset Hamiltonian $H^{(i)}_S$ satisfies the norm constraint (4.1). In the special case of sparse Hamiltonians, we get that Equation (4.1) is satisfied with $\alpha = 0$. However, crucially, for random dense classical COPs, Equation (4.1) can still be satisfied with high probability. For example, in Section 4.3 we show that a class of random spin models satisfies Equation (4.1) with high probability even though it allows for constraints between any variables, i.e. its constraint (hyper-)graph is the complete (hyper-)graph. The well-known Sherrington-Kirkpatrick model is an example of such a spin model. We consider any pure product state $\rho_0$ and denote by $\rho_p$ the output of the dense Hamiltonian evolution. The purpose of this section is to show that the output state $\rho_p$ satisfies certain concentration properties even if $p$ grows (slowly) with $n$. Our strategy for proving such concentration bounds will be to approximate the state $\rho_p$ by a local operator (Theorem 4.1). Once we have established such a local approximation, Lemma 2.10 immediately implies a concentration bound for the Hamming weight distribution $W_{\rho_p}$ (Corollary 4.7). In Corollary 4.8, we extend this to a concentration bound for the energy density of $\rho_p$ with respect to any classical Hamiltonian satisfying Equation (A.2).

Local approximations to output states of dense Hamiltonian evolution
We start by giving the main result of this section, a local approximation to the output state $\rho_p$. The proof of this result proceeds inductively: we first show how to approximate the starting state $\rho_0$ by a local operator; then, we analyse how the locality and approximation error evolve under application of the unitaries $U_i$. We emphasise that the bounds in Theorem 4.1 are optimised for ease of use, not tightness; one can easily obtain a tighter final result by keeping track of more parameters.
Remark 4.2. In general, the above bound is useful when $k_p = o(n)$. For sparse COPs, where $\alpha = 0$, the bound on $k_p$ simplifies to $k_p \le c_1^p\, n^{3/4}$. Therefore, we get $k_p = o(n)$ for $p = o(\log n)$. (Note that for sparse Hamiltonians, the circuit is in fact low depth, so one can alternatively use Corollary 3.2.) For dense COPs, where $\alpha > 0$, we need $p = o(\log \log n)$ for $k_p = o(n)$. For the remainder of this section, we will focus on dense Hamiltonians and always impose the requirement that $p = o(\log \log n)$.

Remark 4.3. As an example of why such a local operator approximation is useful, we observe that Theorem 4.1 combined with Lemma 2.11 gives a clustering property for the output of dense evolutions. Such a clustering property is used in the proof of the NLTS theorem [ABN22], which shows that the low-energy states of recently discovered [PK22, LZ22] linear-rank and linear-distance quantum LDPC code Hamiltonians require $\Omega(\log n)$ quantum circuit depth to prepare. Replacing [ABN22, Fact 4] with the clustering property for the output states of dense evolutions, we can show that if one were to try to prepare the low-energy states of LDPC codes using dense Hamiltonian evolution instead of shallow circuits, one would need at least $\Omega(\log \log n)$ levels of dense evolution.
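The sparse-case arithmetic in Remark 4.2 can be sketched numerically. The snippet below computes the largest level $p$ for which the bound $c_1^p n^{3/4}$ is still smaller than $n$, which is a crude finite-$n$ proxy for $k_p = o(n)$. The value $c_1 = 40$ is purely illustrative (the text only fixes $c_1$ up to the constant in Equation (4.1)), and the function names are hypothetical.

```python
import math


def locality_bound_sparse(n, c1, p):
    """The sparse-case bound c1**p * n**(3/4) on k_p from Remark 4.2."""
    return c1**p * n**0.75


def max_level_sparse(n, c1):
    """Largest integer p >= 0 with c1**p * n**(3/4) < n.

    Taking logarithms, the condition reads p < log(n) / (4 * log(c1)).
    """
    if n <= 1 or c1 <= 1:
        raise ValueError("need n > 1 and c1 > 1")
    ratio = math.log(n) / (4 * math.log(c1))
    return max(math.ceil(ratio) - 1, 0)


# Illustrative constant only; not a value taken from the analysis.
c1 = 40.0
levels = {n: max_level_sparse(n, c1) for n in (10**6, 10**30)}
```

With these illustrative numbers the admissible level grows only logarithmically in $n$, and is zero for moderate $n$, which reflects that the bounds are asymptotic statements optimised for ease of use rather than tightness.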
Proof of Theorem 4.1. We invoke Lemma 4.4 shown below, which gives general (albeit complicated) expressions for $k_p$ and $\varepsilon_p$. Setting $c_1 = 40 \cdot \tilde{C}$, which is a constant by assumption, we immediately obtain the bound on $k_p$. To bound $\varepsilon_p$, we use the expression from Lemma 4.4 and first bound the exponent of its second term for $j \in \{1, \dots, p\}$. For $p = o(\log n)$ and sufficiently large $n$ and $p$, we see that the first term $-\tilde{C}\, n^{1-(1-\alpha)^{j}/4}$ is dominant for any $j$. Therefore, there exist constants $c_3, c_4$ such that the claimed bound on $\varepsilon_p$ holds for sufficiently large $n$ and $p = o(\log n)$.
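As mentioned above, Theorem 4.1 is a simplification of the following technical lemma.

Lemma 4.4. Suppose that the Hamiltonians $H^{(1)}, \dots, H^{(p)}$ satisfy Equation (4.1) for some $\alpha \in [0, 1)$ and $\tilde{C}$ such that $n \ge k\, \tilde{C}^{1/\alpha}$. Then, $\rho_p$ is $(k_p, \varepsilon_p)$-local for explicit (albeit complicated) $k_p$ and $\varepsilon_p$.

Proof. For $i \le p$ we define the intermediate states $\rho_i$ in the obvious way, i.e. $\rho_i = U_i\, \rho_{i-1}\, U_i^{\dagger}$.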
We now claim that each $\rho_i$ is $(k_i, \varepsilon_i)$-local with $k_i$ and $\varepsilon_i$ as in Equation (4.3). We prove this by induction. The base case $i = 0$, i.e. the approximation of the starting state $\rho_0$, follows from Lemma 4.5. For the inductive step, we will make use of a simplification of Lemma A.2, stated as Lemma 4.6 below. Concretely, suppose that Equation (4.3) holds for $\rho_i$. We can then apply Lemma 4.6 to $\rho_i$ and the unitary $U_{i+1}$, which implies the bounds in Equation (4.3) for $\rho_{i+1}$ after minor simplifications.
We now show the two missing statements in the proof of Lemma 4.4: Lemma 4.5 for the base case of the induction and Lemma 4.6 for the inductive step. We will show both of these statements in slightly more generality than is required for Lemma 4.4, with additional parameters that could be optimised to obtain tighter bounds in Lemma 4.4 for specific applications.
For the base case of the induction, we need to approximate the starting state $\rho_0$. By assumption, this is a pure product state. Since all pure product states are related to each other by 1-local unitaries, it suffices to show the lemma for any one particular product state. For simplicity, we show it for $\rho_0 = |+\rangle\langle+|^{\otimes n}$, which we have already considered in Example 2.5. There, we gave the following simple approximation: we defined the local Hamiltonian $H_0 = \frac{1}{n} \sum_{i=1}^{n} |+\rangle\langle+|_i$ and noted that $|+\rangle\langle+|^{\otimes n}$ can be approximated by powers of this Hamiltonian. Here, we will require a better tradeoff between locality and approximation error. This will be achieved using the approximation to the powering function established in Lemma 2.9. Because we additionally need to bound the total local norm of the approximation, which becomes large when using the function from Lemma 2.9, we will combine the simple powering from Example 2.5 with the function from Lemma 2.9 to achieve an approximation that has both low locality and low total local norm. The requirement that our approximation should have small total local norm is also the reason why we cannot use Lemma 3.1 to approximate $\rho_0$ in this setting.
For the inductive step in Lemma 4.4, we need to analyse how the locality and approximation error change under application of one unitary $U_i$. This is done in the following lemma, which is a simplification of Lemma A.2. If one wishes to obtain the tightest possible bounds in Lemma 4.4, one can use Lemma A.2 directly instead of this simplified statement.

Proof. Let $R$ be a $k$-local operator within $\varepsilon$-distance from $\rho$. We apply Lemma A.2 to $R$ for the Hamiltonian $H$, which satisfies $\|H_S\| \le \tilde{C}\, n^{\alpha} |S|^{1-\alpha}$. Setting $\mu = 1 + 4/e$, we get that $e^{-\iota H} R\, e^{\iota H}$ is a $(k', \tilde{\varepsilon})$-local operator, with bounds on $k'$ and $\tilde{\varepsilon}$ that depend on a parameter $\kappa \ge k$ to be chosen later. Since $e^{-\iota H} R\, e^{\iota H}$ is $(k', \tilde{\varepsilon})$-local, it immediately follows from the triangle inequality that $e^{-\iota H} \rho\, e^{\iota H}$ is $(k', \varepsilon')$-local for $\varepsilon' = \tilde{\varepsilon} + \varepsilon$. We can now simplify the resulting bounds as follows, choosing $\kappa = \tilde{C}^{\frac{1}{1-\alpha}}\, n^{1-\delta} \ge k$:

(i) Because $\ell \ge 1$, $n \ge k\, \tilde{C}^{1/\alpha}$, and $\kappa \ge k$, it is clear that $2 \cdot 7 \tilde{C}\, n^{\alpha} \kappa^{1-\alpha} + k \le 20\, \tilde{C}\, n^{\alpha} \kappa^{1-\alpha}$. Inserting $\kappa = \tilde{C}^{\frac{1}{1-\alpha}}\, n^{1-\delta}$, we get the claimed bound on $k'$.
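(iii) To bound $\mathrm{tln}_{\varepsilon'}(e^{-\iota H} \rho\, e^{\iota H})$, observe that since $\rho$ is a quantum state, we are only interested in approximations where $\tilde{\varepsilon} \le 1 \le (2n)^{k}\, \mathrm{tln}(R)$, as otherwise the claim becomes trivial. Combining this with the bound from Lemma A.2 and inserting $\mathrm{tln}(R) = \mathrm{tln}_{\varepsilon}(\rho)$ yields the claimed bound.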

Concentration bounds for output states of dense Hamiltonian evolution
Combining Lemma 2.10 and Theorem 4.1, we immediately obtain the following concentration bound. One can of course also derive an analogous but tighter and more explicit statement from Lemma 4.4 instead of Theorem 4.1.

While the above statement is about concentration with respect to Hamming weight, we can also prove concentration with respect to other observables. Let $G = \sum_i G_i$ be a classical local Hamiltonian (i.e. a local Hamiltonian that is diagonal in the computational basis) that satisfies a condition analogous to Equation (4.1) for all $S \subset [n]$. In this case, we can also prove a concentration bound on the expectation of $\rho_p$ with respect to $G$. More specifically, we can define a random variable $E_{\rho_p}$ indicating the "energy" of $\rho_p$ according to $G$, i.e. if we take the spectral decomposition $G = \sum_i g_i \Pi_i$ for orthogonal projectors $\Pi_i$, then $E_{\rho_p}$ takes value $g_i$ with probability $\mathrm{Tr}[\Pi_i \rho_p]$.
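The two random variables discussed above, the Hamming weight of a computational basis measurement and the energy with respect to a diagonal $G$, can be computed directly for small pure states. The following is a minimal sketch under our own conventions: a state is a list of $2^n$ amplitudes indexed by the integer encoding of the bit string, and the diagonal of $G$ is supplied as a hypothetical callable `energy(x)`. As a sanity check it uses the product state $|+\rangle^{\otimes n}$, whose Hamming weight distribution is binomial.

```python
import math
from collections import defaultdict


def hamming_weight_distribution(amplitudes):
    """Pr[W = w] for a computational-basis measurement of a pure n-qubit state."""
    n = len(amplitudes).bit_length() - 1  # len(amplitudes) == 2**n
    dist = [0.0] * (n + 1)
    for x, amp in enumerate(amplitudes):
        dist[bin(x).count("1")] += abs(amp) ** 2
    return dist


def energy_distribution(amplitudes, energy):
    """Pr[E = g], where energy(x) is the diagonal entry of G on basis string x.

    Basis strings with equal energy are grouped, which mirrors summing
    Tr[Pi_i rho] over the eigenprojector Pi_i of the eigenvalue g_i.
    """
    dist = defaultdict(float)
    for x, amp in enumerate(amplitudes):
        dist[energy(x)] += abs(amp) ** 2
    return dict(dist)


n = 4
plus_state = [2 ** (-n / 2)] * 2**n  # |+>^{(x) n} in the computational basis
weights = hamming_weight_distribution(plus_state)
# Toy diagonal observable G = sum_i Z_i, with eigenvalue n - 2 * (Hamming weight).
energies = energy_distribution(plus_state, lambda x: n - 2 * bin(x).count("1"))
```

For $|+\rangle^{\otimes n}$ the weights follow $\binom{n}{w} / 2^n$, so the concentration around $n/2$ is just the familiar binomial tail; the results of this section say that a comparable concentration persists after a dense evolution of sufficiently low level.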

Example: random spin models
As an example of a family of dense Hamiltonians that satisfies Equation (4.1), we consider the pure $q$-spin model, which was also considered in [BGMZ22]. The pure $q$-spin model is a random COP with cost function $C^q_n(z; J) = \frac{1}{n^{(q+1)/2}} \sum_{i_1, \dots, i_q = 1}^{n} J_{i_1, \dots, i_q}\, z_{i_1} \cdots z_{i_q}$, (4.4) where the coefficients $J = (J_{i_1, \dots, i_q})_{i_1, \dots, i_q \in [n]}$ are i.i.d. standard Gaussian random variables $J_{i_1, \dots, i_q} \sim \mathcal{N}(0, 1)$. Here, $z_i \in \{\pm 1\}$ and the objective is to maximise $C^q_n(z_1, \dots, z_n; J)$. This can be identified with a $q$-local Hamiltonian $H^q_n(J) = \frac{1}{n^{(q-1)/2}} \sum_{i_1, \dots, i_q = 1}^{n} J_{i_1, \dots, i_q}\, Z_{i_1} \cdots Z_{i_q}$. (4.5)

We note the different normalisation factors: $C^q_n(z; J)$ is normalised such that on average over $J$, $\max_z C^q_n(z; J) = \Theta(1)$. In contrast, $H^q_n$ has an additional factor of $n$, so that on average over $J$, $\|H^q_n(J)\| = \Theta(n)$. We use these different normalisations because the former is common in the classical literature (see e.g. [GJW20]), whereas the latter is common in the quantum literature.
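As a concrete illustration of the pure $q$-spin cost function, the sketch below samples i.i.d. standard Gaussian couplings and evaluates the cost on a spin configuration. It assumes the form $C^q_n(z; J) = n^{-(q+1)/2} \sum_{i_1, \dots, i_q} J_{i_1, \dots, i_q} z_{i_1} \cdots z_{i_q}$ with the sum over all index tuples in $[n]^q$, which is our reading of the normalisation described in the text; the function names are hypothetical.

```python
import itertools
import math
import random


def sample_couplings(n, q, rng):
    """i.i.d. standard Gaussian couplings J[i_1, ..., i_q], one per tuple in [n]^q."""
    return {idx: rng.gauss(0.0, 1.0) for idx in itertools.product(range(n), repeat=q)}


def cost(z, couplings, q):
    """Assumed pure q-spin cost: n^{-(q+1)/2} * sum_idx J[idx] * prod_{i in idx} z[i]."""
    n = len(z)
    total = 0.0
    for idx, j in couplings.items():
        total += j * math.prod(z[i] for i in idx)
    return total / n ** ((q + 1) / 2)


rng = random.Random(0)  # fixed seed only so that the example is reproducible
n, q = 5, 2
J = sample_couplings(n, q, rng)
z = [rng.choice((-1, 1)) for _ in range(n)]
flipped = [-s for s in z]

# For even q every monomial is invariant under the global flip z -> -z.
even_q_symmetric = math.isclose(cost(z, J, q), cost(flipped, J, q))
```

The global flip symmetry for even $q$ checked at the end is the symmetry of the COP that the later limitation result for the QAOA relies on. The dictionary holds $n^q$ couplings, so this representation is only practical for small $n$ and $q$.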
The following lemma shows that this model satisfies Equation (4.1) with overwhelming probability, and as a result we can apply Corollary 4.7 and Corollary 4.8 to obtain concentration bounds.

Because all terms of $H^q_{n,S}$ are proportional to tensor products of Pauli-$Z$ operators, it is easy to see that $\|H^q_{n,S}\| = n \cdot \max_{z_1, \dots, z_n \in \{\pm 1\}} C^q_{n,S}(z; J)$. For any fixed choice of $z_1, \dots, z_n \in \{\pm 1\}$, the random variable $C^q_{n,S}(z; J)$ is a sum of $\le |S| \binom{n}{q-1} \le |S|\, n^{q-1}$ standard Gaussians with a normalisation factor $\frac{1}{n^{(q+1)/2}}$, and is therefore distributed as $\mathcal{N}(0, \sigma^2)$ with $\sigma^2 \le |S|\, n^{q-1} / n^{q+1} = |S| / n^2$. By the standard upper deviation inequality for Gaussians, we have that $\Pr\big[C^q_{n,S}(z; J) \ge \sqrt{6|S|/n}\big] \le e^{-3n}$. Since we are interested in upper-bounding the probability that $C^q_{n,S}(z; J) \ge \sqrt{6|S|/n}$ for any choice of $z_1, \dots, z_n$ and $S$, we can apply the union bound over the possible $2^n \cdot 2^n \le e^{2n}$ choices of $z_1, \dots, z_n$ and $S$. We therefore see that the probability that Equation (4.1) is violated (for any $S \subseteq [n]$) is at most $e^{-3n} \cdot e^{2n} = e^{-n}$ as claimed.
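The arithmetic behind this union bound can be checked numerically. The sketch below assumes, as our reading of the argument, that each $C^q_{n,S}(z; J)$ is a centred Gaussian with variance at most $|S|/n^2$ and that the relevant threshold is $t = \sqrt{6|S|/n}$; under these assumptions the standard tail bound $\exp(-t^2 / 2\sigma^2)$ evaluates to exactly $e^{-3n}$. All names and the toy values of $n$ and $|S|$ are hypothetical.

```python
import math


def gaussian_upper_tail(t, sigma):
    """Exact Pr[X >= t] for X ~ N(0, sigma^2)."""
    return 0.5 * math.erfc(t / (sigma * math.sqrt(2)))


def tail_bound(t, sigma):
    """Standard upper deviation bound exp(-t^2 / (2 sigma^2)), valid for t >= 0."""
    return math.exp(-t * t / (2 * sigma * sigma))


def violation_probability_bound(n):
    """Union bound over at most 2^n strings z and 2^n subsets S, each failing w.p. <= e^{-3n}."""
    return 4.0**n * math.exp(-3 * n)


n, size = 20, 7                   # toy values; `size` plays the role of |S|
sigma = math.sqrt(size) / n       # assumed worst-case variance |S| / n^2
t = math.sqrt(6 * size / n)       # assumed threshold sqrt(6 |S| / n)

single_event_bound = tail_bound(t, sigma)      # equals e^{-3n} for these choices
exact_tail = gaussian_upper_tail(t, sigma)     # the true tail is smaller still
total_bound = violation_probability_bound(n)   # at most e^{-n}
```

Since $4^n = e^{n \ln 4} \le e^{2n}$, the union bound leaves a slack of $e^{-(3 - \ln 4) n}$, which is why the final failure probability can be stated as $e^{-n}$.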
[BGMZ22] also consider a mixed $q$-spin model, which is a sum over pure $j$-spin models for $j \le q$. Specifically, the cost function can be written as $C^{q,\mathrm{mixed}}_n(z; J) = \sum_{j \le q} c_j\, C^j_n(z; J)$, where $c_j$ are arbitrary real coefficients and $C^j_n$ is as defined in Equation (4.4). We can again associate a Hamiltonian $H^{q,\mathrm{mixed}}_n$ with this cost function. The following corollary follows immediately from Lemma 4.9 by the triangle inequality.

We can use this property of the (mixed) random spin model to obtain concentration bounds for the output states of the QAOA applied to the COPs $C^q_n(z; J)$ and $C^{q,\mathrm{mixed}}_n(z; J)$. This in turn can also be used to prove limitations on the success probability of the QAOA on these COPs. We spell this out in detail in Corollary 5.6.

Limitations on dense evolutions for constraint optimisation problems
Using our local operator approximations and concentration bounds for the output states of dense evolutions, we can show that such states have limitations as optimisers for COPs. We begin by introducing a structural property of random COPs, called the overlap gap property (OGP), that roughly says that good solutions to a COP must cluster, i.e. different good solutions must either be close to each other or far from each other. We then combine the OGP with our concentration results and the symmetry of the QAOA output to show that for most instances of random spin models, the QAOA can only produce a good solution with negligible probability.

Overlap gap property and existence of high-weight sets for local quantum optimisers
We consider an objective function $C_n(z)$ for $z = (z_1, \dots, z_n) \in \{\pm 1\}^n$ that we want to maximise. We begin by recalling a few general definitions from [GJW20], adapted to the case where algorithms for COPs output probability distributions or quantum states rather than a single element of $\{\pm 1\}^n$.
We can now show that for (approximately) local quantum states that optimise $C_n(z)$, the measurement distribution must be concentrated on one of these sets, which we will call the high-weight set.
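Proof. Because $k = o(n)$, for sufficiently large $n$, we have $k < \tilde{\nu}_2 n$. Therefore, by Lemma 2.11, it suffices to show that there exists an $S^i_n$ for which $\mathrm{Tr}[\Pi_{S^i_n} \psi_n] > \sqrt{\varepsilon}$. This is because if such an $S^i_n$ exists, we can consider the set $S = G_n \setminus S^i_n$, which is at least $\tilde{\nu}_2 n$-far from $S^i_n$. By Lemma 2.11 and the assumption $\mathrm{Tr}[\Pi_{S^i_n} \psi_n] > \sqrt{\varepsilon}$, this means that $\mathrm{Tr}[\Pi_S \psi_n] \le \sqrt{\varepsilon}$. Therefore, $\mathrm{Tr}[\Pi_{S^i_n} \psi_n] = \mathrm{Tr}[\Pi_{G_n} \psi_n] - \mathrm{Tr}[\Pi_S \psi_n] \ge \mathrm{Tr}[\Pi_{G_n} \psi_n] - \sqrt{\varepsilon}$.

Now suppose for the sake of contradiction that for all $i$, $\mathrm{Tr}[\Pi_{S^i_n} \psi_n] \le \sqrt{\varepsilon}$. Since $\psi_n$ is assumed to be a $(\mu, 1 - 4\sqrt{\varepsilon})$-optimiser, $\sum_i \mathrm{Tr}[\Pi_{S^i_n} \psi_n] = \mathrm{Tr}[\Pi_{G_n} \psi_n] \ge 4\sqrt{\varepsilon}$. Therefore, we can find two disjoint sets of indices $I$ and $I'$ such that $\sum_{i \in I} \mathrm{Tr}[\Pi_{S^i_n} \psi_n] > \sqrt{\varepsilon}$ and $\sum_{i \in I'} \mathrm{Tr}[\Pi_{S^i_n} \psi_n] > \sqrt{\varepsilon}$. However, this contradicts Lemma 2.11 since $\cup_{i \in I} S^i_n$ and $\cup_{i \in I'} S^i_n$ are separated by Hamming distance at least $\nu_2 n > k$.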
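The claim that $i$ is unique holds because if there were at least two such sets $S^i_n$ and $S^{i'}_n$, then $\mathrm{Tr}[\Pi_{G_n} \psi_n] \ge \mathrm{Tr}[\Pi_{S^i_n} \psi_n] + \mathrm{Tr}[\Pi_{S^{i'}_n} \psi_n] \ge 2\, \mathrm{Tr}[\Pi_{G_n} \psi_n] - 2\sqrt{\varepsilon}$, which is a contradiction since $\mathrm{Tr}[\Pi_{G_n} \psi_n] \ge 4\sqrt{\varepsilon}$.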
If $q$ is even, both the COP $C^q_n(z; J)$ and the QAOA output state $\rho_{p,n}$ are symmetric in the sense of Corollary 5.4. Therefore, if $C^q_n(z; J)$ satisfies the $(\mu, \nu_1, \nu_2)$-OGP (for some fixed value of $J$) and the associated Hamiltonian $H^q_n(J)$ satisfies the norm constraint from Equation (4.1), we can combine Theorem 4.1 and Corollary 5.4 to obtain the following result, which shows that at level $p = o(\log \log n)$, the probability that the QAOA will produce a "$\mu$-good" string $x$ decays as $e^{-n^{1/8}}$.

Lemma 5.5. Fix $q$ even. Suppose that the COP $C^q_n(z; J)$ (for some choice of $J$) has the $(\mu, \nu_1, \nu_2)$-OGP with $2\nu_1 < \tilde{\nu}_2$ and the Hamiltonian $H^q_n$ satisfies Equation (4.1). Then, no QAOA output state $\rho_{p,n}(J)$ (for any choice of $\gamma_i$ and $\beta_i$ within an arbitrary bounded range independent of $n$) can $(\mu, 1 - 8e^{-n^{1/8}/\sqrt{8}})$-optimise $C^q_n(z; J)$. In other words, if one measures the QAOA output state $\rho_{p,n}(J)$ with level $p = o(\log \log n)$ in the computational basis, the probability of receiving a string $x$ that satisfies $C^q_n((-1)^x; J) \ge \mu$ is at most $e^{-\Omega(n^{1/8})}$.
We define $\tilde{R}^i$ as the local operator approximations to $e^{-\iota H_{S_i}} R_i\, e^{\iota H_{S_i}}$ (as given by Lemma A.1) and $\tilde{R} = \sum_i \tilde{R}^i$. (We use superscripts for $\tilde{R}^i$ because each $\tilde{R}^i$ is itself a $k'$-local operator, not an operator that acts non-trivially on only $k'$ qubits, i.e. $\tilde{R} = \sum_i \tilde{R}^i$ is not our usual local decomposition.) Since the sum of $k'$-local operators is still $k'$-local, we see that $\tilde{R}$ is a $k'$-local operator and, by the triangle inequality, approximates $e^{-\iota H} R\, e^{\iota H} = \sum_i e^{-\iota H_{S_i}} R_i\, e^{\iota H_{S_i}}$ to within the claimed error, where the last inequality in this estimate follows from Lemma 2.7. The total local norm is bounded using the unitary invariance of the norm. Finally, we combine the above bounds to obtain Equation (A.2): $\mathrm{tln}_{\tilde{\varepsilon}}(e^{-\iota H} R\, e^{\iota H}) \le \sum_i \mathrm{tln}(\tilde{R}^i) \le (2n)^{k}\, \mathrm{tln}(R) + \tilde{\varepsilon}$.