Instance Independence of Single Layer Quantum Approximate Optimization Algorithm on Mixed-Spin Models at Infinite Size

This paper studies the application of the Quantum Approximate Optimization Algorithm (QAOA) to spin-glass models with random multi-body couplings in the limit of a large number of spins. We show that for such mixed-spin models the performance of depth $1$ QAOA is independent of the specific instance in the limit of infinite sized systems and we give an explicit formula for the expected performance. We also give explicit expressions for the higher moments of the expected energy, thereby proving that the expected performance of QAOA concentrates.


Summary
The Quantum Approximate Optimization Algorithm (QAOA) is a variational quantum algorithm designed to give approximate solutions to unconstrained binary optimization problems [1]. While QAOA can be proven to give the optimal answer in the limit where the number of QAOA layers $p$ goes to infinity, rigorous results on the performance of QAOA with finite $p$ are difficult to obtain. In a recent paper, Farhi et al. [2] studied the application of QAOA to the Sherrington-Kirkpatrick (SK) model, a spin-glass model with random all-to-all two-body couplings, in the limit of a large number of spins. Their paper demonstrated that for fixed $p$, the performance of QAOA is independent of the specific instance of the SK model and can be predicted by explicit formulas. The paper also showed that the approximation ratio of QAOA at $p = 11$ outperforms a large class of classical optimization algorithms (although not the best classical algorithm [3]). In the current paper, we generalize the result of Farhi et al. to mixed-spin SK models, which extend the two-body couplings of the standard SK model to random all-to-all $q$-body couplings. We demonstrate that for $p = 1$, the performance of QAOA is again independent of the specific instance, and we provide an explicit formula for the expected performance. Our work provides a potential avenue to demonstrating an advantage of QAOA over classical algorithms, as the best known classical algorithms for mixed-spin SK models have an approximation ratio that is bounded away from 1 [4,5].

Preliminaries and Notation
The Quantum Approximate Optimization Algorithm (QAOA) [1] is a heuristic quantum algorithm for binary optimization. Given a cost function $H(z_1, \ldots, z_n)$ of $n$ binary variables (spins), QAOA seeks to produce a string $z := (z_1, \ldots, z_n)$ close to the minimum of $H$. Commonly we view $H$ as a Hamiltonian operator that is diagonal in the $Z$-basis. A depth-$p$ QAOA circuit then consists of $p$ rounds of alternately applying the Hamiltonian $H$ and the mixing Hamiltonian $B = X_1 + \cdots + X_n$ to the uniform superposition, which is the product of the $+1$ eigenstates of the single-particle $X$ operators. Explicitly, the depth-$p$ QAOA state is given by
$$|\beta_1, \ldots, \beta_p; \gamma_1, \ldots, \gamma_p\rangle := e^{-i\beta_p B} e^{-i\gamma_p H} \cdots e^{-i\beta_1 B} e^{-i\gamma_1 H} \cdot \frac{1}{\sqrt{2^n}} \sum_{z \in \{\pm 1\}^n} |z\rangle. \qquad (1)$$
A depth-$p$ QAOA circuit is parameterized by the $2p$ angles $\{\gamma_i\}$, $\{\beta_i\}$. For a given problem, these angles should be optimized so that measuring in the $Z$-basis gives strings that make $H$ as small as possible. In practice, this is typically done by minimizing the expectation value of the energy,
$$\langle H \rangle := \langle \beta_1, \ldots, \beta_p; \gamma_1, \ldots, \gamma_p | H | \beta_1, \ldots, \beta_p; \gamma_1, \ldots, \gamma_p \rangle.$$
For some problems, this minimization may be done analytically on a classical computer [1,6,7,8]. Otherwise, the minimization can be performed by running QAOA on a quantum computer repeatedly for a fixed set of angles, estimating the expectation value, and updating the angles according to classical minimization algorithms [1,9,10]. We note that minimizing the energy expectation value is only one possible definition of "best" angles; in general, minimizing the expectation value and maximizing the probability of finding the optimal $z$ (or maximizing the probability of $H(z_1, \ldots, z_n)$ falling below a certain threshold) do not coincide. It was recently demonstrated in [11] that in local optimization problems with cost functions drawn from realistic random distributions (e.g. MaxCut on random 3-regular graphs), the expectation value per spin is instance independent as $n \to \infty$.
That is, for fixed angles $\{\gamma_i\}$ and $\{\beta_i\}$, $\langle H/n \rangle$ is the same for all problem instances. This implies that the angles $\{\gamma_i\}$ and $\{\beta_i\}$ do not need to be optimized for every problem instance, but can be optimized once and reused for every problem drawn from the same distribution. The methods of [11] can also be used to derive concentration of measure results for local optimization problems. (While [11] did not explicitly address concentration of measure, it can be easily derived from their methodology.) That is, in the limit as $n \to \infty$, the variance in the energy per spin goes to zero:
$$\langle (H/n)^2 \rangle - \langle H/n \rangle^2 \to 0.$$
Concentration of measure means that for large $n$, every measurement of a QAOA state in the $Z$-basis gives strings with the same energy per spin. In total, instance independence implies that the QAOA angles do not need to be optimized from instance to instance in order to minimize the expectation value, and concentration of measure implies that the expectation value is the correct measure of the "best" angles. While instance independence (and concentration of measure) were initially derived for local cost functions, similar results have also been derived for the Sherrington-Kirkpatrick (SK) model [12], a physics-inspired optimization problem with cost function
$$H = \frac{1}{\sqrt{n}} \sum_{j<k} J_{jk} Z_j Z_k,$$
where the $J_{jk}$ are independently drawn from a Gaussian distribution with mean 0. The SK model is the "most nonlocal" two-body cost function, and serves as a model for realistic nonlocal two-body optimization problems. In the limit $n \to \infty$, the ground state energy per spin is known to be independent of the instance and can be computed explicitly [13,14].
Recently, Montanari derived a classical algorithm that produces strings $z$ with energy within $(1 - \epsilon)$ times the optimum; this method has a complexity of $C(\epsilon) n^2$, where $C(\epsilon)$ is a polynomial in $1/\epsilon$ [3].
Ref. [2] proved both instance independence and concentration of measure for depth-$p$ QAOA applied to the SK model. In addition, [2] provided an explicit formula for $\langle H/n \rangle$ in the limit $n \to \infty$ for $p = 1$, and provided a computer algorithm to generate $\langle H/n \rangle$ for any fixed depth $p > 1$. Therefore, the QAOA angles for the SK model can be chosen on a classical computer, and there are fixed performance guarantees in the limit $n \to \infty$. Ref. [2] demonstrated that at $p = 11$, QAOA outperforms semidefinite programming for the SK model, but could not show that QAOA matches the performance of the Montanari algorithm.
In this work, we study a generalization of the SK model, the mixed-spin SK model, that allows for polynomials of degree d in the binary variables instead of only degree-2 terms [15,16]. This can serve as a model for nonlocal optimization problems with higher-order terms. The mixed-spin SK model is also known to have a ground state energy per spin that is independent of the instance and can be computed explicitly [15][16][17][18]. For a mixed-spin SK model with degree d = 3, the generalization of the Montanari algorithm [4] approaches a fixed approximation ratio of ∼ (0.9843 ± 0.0003) times the optimum value, rather than the optimal value [5]. Thus, the mixed-spin SK model is a potential avenue for establishing the advantage of the qaoa over classical algorithms.
In this paper, we generalize the work of [2] to prove instance independence and concentration of measure for $p = 1$ QAOA applied to mixed-spin SK models. As part of our work, we derive an explicit formula for $\langle H/n \rangle$ in the limit $n \to \infty$, implying that the QAOA angles for the mixed-spin SK models can be chosen on a classical computer. Our work can likely be generalized to depth $p > 1$ using the same methods as [2].
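For small $n$, the depth-$p$ state of Eq. 1 can be simulated directly. The following is a minimal NumPy sketch under stated assumptions, not a method taken from the text: the function names (`apply_mixer`, `qaoa_state`, `expected_energy`) are hypothetical, and the diagonal of $H$ is assumed to be supplied as a vector of $2^n$ energies, one per string $z$.

```python
import numpy as np

def apply_mixer(psi, beta, n):
    """Apply exp(-i*beta*B) with B = X_1 + ... + X_n, one spin at a time."""
    c, s = np.cos(beta), -1j * np.sin(beta)   # exp(-i*beta*X) = c*I + s*X
    psi = psi.reshape([2] * n)
    for k in range(n):
        psi = np.moveaxis(psi, k, 0)          # bring spin k to the front
        psi = np.stack([c * psi[0] + s * psi[1], s * psi[0] + c * psi[1]])
        psi = np.moveaxis(psi, 0, k)          # restore the axis order
    return psi.reshape(-1)

def qaoa_state(energies, betas, gammas):
    """Depth-p state of Eq. 1 for a diagonal H given by its 2**n energies."""
    energies = np.asarray(energies, dtype=float)
    n = energies.size.bit_length() - 1
    psi = np.full(2**n, 2.0 ** (-n / 2), dtype=complex)  # uniform superposition
    for beta, gamma in zip(betas, gammas):
        psi = np.exp(-1j * gamma * energies) * psi       # exp(-i*gamma*H)
        psi = apply_mixer(psi, beta, n)                  # exp(-i*beta*B)
    return psi

def expected_energy(energies, betas, gammas):
    """Expectation value <H> in the QAOA state."""
    psi = qaoa_state(energies, betas, gammas)
    return float(np.sum(np.abs(psi) ** 2 * np.asarray(energies, dtype=float)))

# Usage: a single spin with H = Z, whose energies are (+1, -1).
print(expected_energy(np.array([1.0, -1.0]), betas=[0.3], gammas=[0.4]))
```

For the single-spin example the result can be checked by hand: the state is a Bloch vector rotated first about $z$ and then about $x$, giving $\langle Z \rangle = \sin 2\beta \, \sin 2\gamma$. The memory cost is $2^n$ amplitudes, so this is only practical for roughly $n \lesssim 25$.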

Our Results on Mixed-Spin Sherrington-Kirkpatrick Model
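The mixed-spin SK model (often called the mixed $p$-spin model, although we will not use this terminology to avoid confusion with the $p$ of QAOA) is given by [15,16]
$$H = \sum_{q=1}^{d} \frac{1}{n^{(q-1)/2}} \sum_{j_1 < \cdots < j_q} J_{j_1 \cdots j_q} Z_{j_1} \cdots Z_{j_q} = \sum_{q=1}^{d} \frac{1}{n^{(q-1)/2}} \sum_{S \subseteq \{1, \ldots, n\},\, |S| = q} J_S Z_S, \qquad (5)$$
where in the last expression we denote the product $\prod_{i \in S} Z_i$ by $Z_S$. We assume each $J_S$ is sampled independently from a Gaussian distribution $N(0, \sigma_{|S|}^2)$ that only depends on $|S|$, and we will let $\sigma_q$ be the standard deviation of the coupling constants $J_S$ with $|S| = q$, for $q = 1, \ldots, d$.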
The ground state of the mixed-spin SK Hamiltonian is known to have a fixed energy per spin in the limit $n \to \infty$ [18]. In fact we can also allow arbitrarily high orders in the mixed-spin SK model ($d = \infty$), provided the variances decrease quickly enough to make $\sum_q 2^q \sigma_q^2$ finite, and the model will still have a fixed ground state energy per spin as $n \to \infty$ [18]. However, for simplicity we consider some finite bound $d$ on the degree.
Our main result is as follows. Define the $n$-spin model by Eq. 5 with $J_S \sim N(0, \sigma_{|S|}^2)$. Then, using depth-1 QAOA with angles $\beta, \gamma$, the expectation of the energy per spin in the large-$n$ limit, $\lim_{n \to \infty} E_J[\langle H/n \rangle]$, is given by Eq. 6. If we define the variables $c_q := \sigma_q / \sqrt{q!}$ and the polynomial $\xi(x) := \sum_{q=1}^{d} c_q^2 x^q$, as is common in discussions of the mixed-spin model (see Section 3.2), we can also write this in terms of $\xi$ (Eq. 7). Furthermore, in the same limit the second moment of the energy per spin equals the square of the first moment:
$$\lim_{n \to \infty} E_J[\langle (H/n)^2 \rangle] = \Big( \lim_{n \to \infty} E_J[\langle H/n \rangle] \Big)^2. \qquad (8)$$
Our second result, Eq. 8, allows us to prove that $p = 1$ QAOA applied to the mixed-spin SK model has both concentration of measure and instance independence. To see this, we note that we may write
$$E_J[\langle (H/n)^2 \rangle] - E_J[\langle H/n \rangle]^2 = E_J\big[\langle (H/n)^2 \rangle - \langle H/n \rangle^2\big] + \Big( E_J[\langle H/n \rangle^2] - E_J[\langle H/n \rangle]^2 \Big). \qquad (9)$$
When applying QAOA to mixed-spin SK models, the measured value of $H/n$ varies for two reasons. First, it varies because the bonds $J_S$ vary from instance to instance (the $E_J[\cdot]$ expectation). Second, it varies because the QAOA state $|\gamma, \beta\rangle$ is not an eigenstate of the $Z$-operators, so that the measurement outcomes have randomness even for fixed $J_S$ (the $\langle \cdot \rangle$ expectation). The left-hand side of Eq. 9 represents the total variance in $H/n$ due to both sources of randomness. The right-hand side demonstrates that the total variance can be decomposed into two terms. The first term is the average over $J_S$ of the variance due only to the measurement randomness. The second term is the variance in the expected value $\langle H/n \rangle$ due to the randomness in the bonds $J_S$. By Eq. 8, the left-hand side tends to zero as $n \to \infty$. Since both terms on the right-hand side are non-negative, they must both tend to zero as $n \to \infty$ as well.
The fact that the first variance approaches zero gives us concentration of measure: it says that for typical couplings J S the variance in the measurement outcomes vanishes, and thus we always measure a string with energy per spin equal to the expectation value (note that the term inside the E [·] is always positive, so that for the average over J S to go to zero the magnitude of the typical value must also go to zero). The second term approaching zero clearly gives instance independence of the expectation value, since it shows that the variance in the expectation value due to different couplings vanishes.
Finally, we note that the methods we use also suffice to derive a formula for all higher moments of the energy per spin, $E_J[\langle (H/n)^m \rangle]$, although unlike the $m = 2$ result, the $m > 2$ results do not have any obvious implication for the performance of QAOA.

Numerical Results and Relation to Prior Results
As an example of using our Eqs. 6 and 7, we can compute the optimal angles for pure $d$-spin models, given by $\sigma_q = \delta_{d,q} \sqrt{d!/2}$. In this case, only the $q = d$ term of our central equation survives. Examples of this energy landscape for small $d$ are plotted in Fig. 1, and numerically optimized angles and corresponding energy per spin are plotted in Fig. 2.
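The angle optimization for pure $d$-spin models can be sketched numerically. The closed form used below, $\gamma \, \mathrm{Im}\big[(\cos 2\beta + i e^{-d\gamma^2} \sin 2\beta)^d\big]$, is an assumption of this sketch rather than a formula copied from the text; it is used here because it reduces to $\gamma e^{-2\gamma^2} \sin 4\beta$ for $d = 2$ and reproduces the $d = 3$ minimum of about $-0.2706$ reported in this paper. The function names and the grid ranges are likewise illustrative choices.

```python
import numpy as np

def pure_spin_energy(beta, gamma, d):
    """ASSUMED n -> infinity limit of E_J[<H/n>] for the pure d-spin model
    with sigma_d = sqrt(d!/2); see the hedges in the accompanying text."""
    x = np.cos(2 * beta) + 1j * np.exp(-d * gamma**2) * np.sin(2 * beta)
    return gamma * np.imag(x**d)

def grid_minimum(d, num_beta=401, num_gamma=801, gamma_max=2.0):
    """Brute-force search for the minimizing angles on a regular grid."""
    betas = np.linspace(-np.pi / 4, np.pi / 4, num_beta)
    gammas = np.linspace(-gamma_max, gamma_max, num_gamma)
    B, G = np.meshgrid(betas, gammas, indexing="ij")
    E = pure_spin_energy(B, G, d)
    i, j = np.unravel_index(np.argmin(E), E.shape)
    return E[i, j], betas[i], gammas[j]

for d in (2, 3, 4, 5):
    e_min, beta, gamma = grid_minimum(d)
    print(f"d={d}: energy per spin {e_min:.4f} at beta={beta:+.3f}, gamma={gamma:+.3f}")
```

A grid search is chosen over a gradient method because the landscape has several symmetry-related minima; the grid values can be refined afterwards with any local optimizer.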
As another example application of our formula, we explore the instance independence at finite $n$ to demonstrate convergence as $n \to \infty$. In Fig. 3a-e, we plot the expected energy per spin $\langle H/n \rangle$ for five randomly generated mixed-spin optimization problems generated from a distribution with $(\sigma_1, \sigma_2, \sigma_3) = (1/3, 1/2, 1)$ (and all higher-order terms zero). We see that as we increase $n$, the energy landscape of any given problem instance quickly approaches the $n \to \infty$ landscape (Fig. 3f), as predicted.
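A finite-$n$ comparison of this kind can be sketched by brute force. The code below rests on two assumptions that are labeled in the comments: that $q$-body terms carry the prefactor $n^{-(q-1)/2}$, and that the $n \to \infty$ limit has the closed form $2\gamma \, \mathrm{Im} \, \xi\big(\cos 2\beta + i e^{-2\gamma^2 \xi'(1)} \sin 2\beta\big)$ with $\xi(x) = \sum_q (\sigma_q^2/q!) x^q$. Neither is copied from the text; the closed form is adopted because it reproduces the numbers quoted in this paper. All function names are hypothetical.

```python
import itertools
import math
import numpy as np

def sample_energies(n, sigmas, rng):
    """Energies H(z) of one random instance, for all z in {+1,-1}^n.
    sigmas[q-1] is the standard deviation of the q-body couplings.
    ASSUMPTION: q-body terms carry the prefactor n**(-(q-1)/2)."""
    idx = np.arange(2**n)
    z = 1 - 2 * ((idx[:, None] >> np.arange(n)) & 1)   # rows are spin strings
    energies = np.zeros(2**n)
    for q, sigma in enumerate(sigmas, start=1):
        if sigma == 0:
            continue
        for subset in itertools.combinations(range(n), q):
            coupling = rng.normal(0.0, sigma)
            energies += coupling * np.prod(z[:, list(subset)], axis=1) / n ** ((q - 1) / 2)
    return energies

def depth1_energy_per_spin(energies, beta, gamma):
    """<H/n> in the depth-1 QAOA state, by exact statevector simulation."""
    n = energies.size.bit_length() - 1
    psi = (np.exp(-1j * gamma * energies) / 2 ** (n / 2)).reshape([2] * n)
    c, s = math.cos(beta), -1j * math.sin(beta)        # exp(-i*beta*X) = c*I + s*X
    for k in range(n):
        psi = np.moveaxis(psi, k, 0)
        psi = np.stack([c * psi[0] + s * psi[1], s * psi[0] + c * psi[1]])
        psi = np.moveaxis(psi, 0, k)
    probs = np.abs(psi.reshape(-1)) ** 2
    return float(probs @ energies) / n

def limit_energy(beta, gamma, sigmas):
    """ASSUMED closed form of the n -> infinity limit (see lead-in)."""
    q = np.arange(1, len(sigmas) + 1)
    c2 = np.array([s**2 / math.factorial(k) for k, s in enumerate(sigmas, start=1)])
    xi_prime_at_1 = np.sum(q * c2)
    x = np.cos(2 * beta) + 1j * np.exp(-2 * gamma**2 * xi_prime_at_1) * np.sin(2 * beta)
    return float(2 * gamma * np.imag(np.sum(c2 * x**q)))

sigmas = (1 / 3, 1 / 2, 1.0)
beta, gamma = 0.3, -0.4
print("assumed limit:", round(limit_energy(beta, gamma, sigmas), 4))
rng = np.random.default_rng(seed=1)
for n in (6, 9, 12):
    values = [depth1_energy_per_spin(sample_energies(n, sigmas, rng), beta, gamma)
              for _ in range(3)]
    print(n, np.round(values, 4))
```

The spread of the three values at each $n$ gives a rough picture of how quickly individual instances approach a common landscape; the sizes reachable this way are far too small to test the limit quantitatively.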
To put Eq. 6 in perspective we explicitly calculate what it implies for two basic cases q = 2 and q = 3 and compare it with the literature on mixed-spin models.
Figure 3. (f) The predicted energy landscape in the limit $n \to \infty$. We see the energy landscape for random instances rapidly converges to the $n \to \infty$ landscape, demonstrating instance independence in this limit.

For the more general case $q \geq 2$ we will briefly review the notation used by the articles [4,5,17,18]. In the current paper the mixed-spin model is described by the Hamiltonian
$$H = \sum_{q=1}^{d} \frac{1}{n^{(q-1)/2}} \sum_{1 \leq j_1 < \cdots < j_q \leq n} J_{j_1 \cdots j_q} Z_{j_1} \cdots Z_{j_q}$$
with $J_{j_1 \cdots j_q} \sim N(0, \sigma_q^2)$, and for a fixed $q$ the summation has $\binom{n}{q}$ terms. In [5] and elsewhere a different notation is used, where the Hamiltonian is defined by
$$H' = \sum_{q} \frac{c_q}{n^{(q-1)/2}} \sum_{j_1, \ldots, j_q = 1}^{n} J_{j_1 \cdots j_q} Z_{j_1} \cdots Z_{j_q}$$
with $J_{j_1 \cdots j_q} \sim N(0, 1)$. Crucially, now the indices $j_1, \ldots, j_q$ can have repeated values and permutations are treated distinctly, hence for each $q$ the summation involves $n^q$ terms.
In the limit of large $n$ one can show that the $q$-tuples $(j_1, \ldots, j_q)$ with repeated values can be ignored, as their relative contribution decreases as a function of $n$. For distinct indices there are $q!$ permutations to consider in $H'$, hence we have a sum of $q!$ independent standard normal variables, $J_{j_1 \cdots j_q} + \cdots + J_{j_q \cdots j_1}$, which is distributed as a single normal variable with variance $q!$. As a result we re-express $H'$ as
$$H' = \sum_{q} \frac{c_q \sqrt{q!}}{n^{(q-1)/2}} \sum_{1 \leq j_1 < \cdots < j_q \leq n} J_{j_1 \cdots j_q} Z_{j_1} \cdots Z_{j_q}$$
with $J_{j_1 \cdots j_q} \sim N(0, 1)$. Thus when $\sigma_q = c_q \sqrt{q!}$, $H$ and $H'$ describe the same model. It is common to capture the different $c_q$ coefficients by the mixture function $\xi(x) = \sum_q c_q^2 x^q$, such that the standard SK model has $\xi(x) = x^2/2$ (i.e. $c_2 = 1/\sqrt{2}$ and hence $\sigma_2 = 1$).
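The translation between the two notations, and the claim that index tuples with repeated values become negligible, can both be checked with a few lines of Python. This is an illustrative sketch; the helper names `sigma_to_c`, `xi`, and `distinct_fraction` are hypothetical.

```python
import math

def sigma_to_c(sigmas):
    """Convert (sigma_1, sigma_2, ...) to (c_1, c_2, ...) via c_q = sigma_q / sqrt(q!)."""
    return [s / math.sqrt(math.factorial(q)) for q, s in enumerate(sigmas, start=1)]

def xi(x, sigmas):
    """Mixture function xi(x) = sum_q c_q^2 x^q."""
    return sum(c**2 * x**q for q, c in enumerate(sigma_to_c(sigmas), start=1))

def distinct_fraction(n, q):
    """Fraction of the n**q index tuples whose q indices are all distinct."""
    return math.perm(n, q) / n**q

# Standard SK model: sigma_2 = 1 corresponds to c_2 = 1/sqrt(2), xi(x) = x^2/2.
print(sigma_to_c([0.0, 1.0]), xi(1.0, [0.0, 1.0]))

# Tuples with repeated indices become a vanishing fraction as n grows.
for n in (10, 100, 1000):
    print(n, distinct_fraction(n, 3))
```

The fraction of distinct tuples is $n(n-1)\cdots(n-q+1)/n^q = 1 - O(1/n)$ for fixed $q$, which is the counting behind the statement that repeated indices can be ignored.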
It was shown that the expected ground state energy per spin of this model equals $-0.8132 \pm 0.0001$. In the notation of the current article this model is equivalent to $\sigma_3 = \sqrt{3}$, and with this our Eq. 6 gives the expectation $E_J[\langle H_3/n \rangle]$ as an explicit function of $(\beta, \gamma)$. This expectation is minimized to $E_J[\langle H_3/n \rangle] \approx -0.270638$ by the angles $(\beta^*, \gamma^*) \approx (\pm 0.290003, \mp 0.430091)$, which are solutions of the stationarity conditions $\partial_\beta E_J[\langle H_3/n \rangle] = \partial_\gamma E_J[\langle H_3/n \rangle] = 0$. This is the same result we found numerically for $d = 3$ in Fig. 2. We thus see that depth $p = 1$ QAOA approximates the ground state energy to within a factor of $0.270638/0.8132 \approx 0.332806$. This 0.33 approximation factor had been reported earlier by Zhou et al. [19].

Derivation of Main Result
The proof in this section follows to a large extent the framework of the earlier result by Farhi et al. [2], which relied on manipulating the moment generating function $E_J[\langle e^{i\lambda H/n} \rangle]$ to extract expressions for the first and second moments of $H/n$. We use their method of simplifying the moment generating function, and their reorganization of the sum over $z$-strings into a sum over sketches (see Section 4.2). We extend their proof technique by generalizing their form of the moment generating function to higher-spin models and demonstrating that it can still be written as a sum over sketches, developing careful power-counting methods to allow us to extract the relevant terms in the $n \to \infty$ limit, and deriving identities that allow us to explicitly evaluate the relevant sums. In our proof, we will use the following conventions:
• A $Z$-basis state $|z\rangle$ is specified by a string $z = (z_1, z_2, \ldots, z_n)$ of $\pm 1$s. We will use the shorthand $z \in \{\pm 1\}^n$ for this.
• For a set $S \subseteq \{1, \ldots, n\}$, we denote the product of the bits in $S$ as $\prod_{i \in S} z_i =: z_S$; with this convention we thus have $Z_S |z\rangle = z_S |z\rangle$.
• The uniform superposition over all strings $z$ is $2^{-n/2} \sum_{z \in \{\pm 1\}^n} |z\rangle$, as in Eq. 1.
This is in contrast to the usual convention in quantum information, in which a $Z$-basis state is specified by a string $z$ of 0s and 1s and the XOR of two strings is given by componentwise addition modulo 2. We choose our notation to be consistent with [2] and to simplify certain expressions in our derivation.
We can simplify the expectation inside the moment-generating function by inserting three complete sets of states:

Moment Generating Function
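where in the last step we used that $\langle z | e^{i\beta B} | z' \rangle = \langle zz' | e^{i\beta B} | (+1)^n \rangle$, with $zz'$ the componentwise product of the two strings, and made the replacements $z \to zz'$ and $z'' \to z'z''$.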

Expectation when Couplings are Normal Distributed Variables
We will now treat the $J_S$ couplings as random variables and consider the expectation $E_J$ of the energy. We assume that the distribution is symmetric with respect to $J \leftrightarrow -J$, such that we can replace $z_S J_S$ by $J_S$ inside the expectation. When we further assume that the $J_S$ variables are independent between different $S$, the expectation factorizes into a product over the sets $S$. Next we assume that the $J_S$ are normally distributed with a standard deviation that is the same for all sets $S$ of the same size, that is $J_S \sim N(0, \sigma_{|S|}^2)$. We note that taking the expectation value of a Gaussian random variable $J$ with mean 0 and standard deviation $\sigma$ gives
$$E_J[e^{iaJ}] = e^{-a^2 \sigma^2 / 2}$$
for any real $a$, which we apply to every factor of the product.
To do the sum over $z, z'$, we claim that the summand does not depend on all $2n$ spin values of $z$ and $z'$. Instead, it is only a function of the four integer values $(n_{++}, n_{+-}, n_{-+}, n_{--})$, where $n_{ss'}$ is defined to be the number of positions $k \in \{1, \ldots, n\}$ with $(z_k, z'_k) = (s, s')$. Note that only three of these variables are actually independent, as we always have $n_{++} + n_{+-} + n_{-+} + n_{--} = n$. As these numbers summarize the crucial information of the strings, we will refer to $(n_{++}, n_{+-}, n_{-+}, n_{--})$ as the sketch of $(z, z')$. Writing the summand in terms of the sketch rather than $(z, z')$ was introduced in [2]; here we establish that we can still write the summand in terms of the sketch for the mixed-spin SK model. To start, it is straightforward to verify that the factors outside the exponential depend only on the sketch. We can also write explicit combinatorial formulas for the sums in the exponential, which define the functions $f_q$ and $g_q$ (Eqs. 28 and 29). Therefore, the summand indeed depends only on $(n_{++}, n_{+-}, n_{-+}, n_{--})$, and the number of ways to assign the $n$ positions into four groups of these sizes is the multinomial coefficient $n!/(n_{++}! \, n_{+-}! \, n_{-+}! \, n_{--}!)$. To condense our notation we will use $\{n_*\}$ to denote the set of sketches $n_* = (n_{++}, n_{+-}, n_{-+}, n_{--})$, allowing us to write $\sum_{\{n_*\}}$ for the sum over all sketches. Note that this summation has $\binom{n+3}{3}$ terms. The resulting expression, Eq. 31, is the form of the moment-generating function we will use to evaluate $E_J[\langle H/n \rangle]$ and $E_J[\langle (H/n)^2 \rangle]$.
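The counting behind the sketch representation can be verified directly for small $n$. The following sketch (with hypothetical helper names) enumerates all sketches, confirms that there are $\binom{n+3}{3}$ of them, and confirms that the multinomial multiplicities add up to the $4^n$ pairs $(z, z')$.

```python
from math import comb, factorial

def sketches(n):
    """Yield every (n_pp, n_pm, n_mp, n_mm) of non-negative integers summing to n."""
    for n_pp in range(n + 1):
        for n_pm in range(n + 1 - n_pp):
            for n_mp in range(n + 1 - n_pp - n_pm):
                yield n_pp, n_pm, n_mp, n - n_pp - n_pm - n_mp

def multiplicity(sketch):
    """Number of pairs (z, z') sharing this sketch: n! / (n_pp! n_pm! n_mp! n_mm!)."""
    out = factorial(sum(sketch))
    for part in sketch:
        out //= factorial(part)
    return out

def sketch_of(z, z_prime):
    """Sketch of a pair of +/-1 strings of equal length."""
    counts = {(1, 1): 0, (1, -1): 0, (-1, 1): 0, (-1, -1): 0}
    for pair in zip(z, z_prime):
        counts[pair] += 1
    return counts[(1, 1)], counts[(1, -1)], counts[(-1, 1)], counts[(-1, -1)]

n = 6
all_sketches = list(sketches(n))
print(len(all_sketches) == comb(n + 3, 3))               # number of sketches
print(sum(multiplicity(s) for s in all_sketches) == 4**n)  # all pairs accounted for
```

This is what makes the sum tractable: a sum over $4^n$ pairs of strings collapses to a sum over polynomially many sketches, each weighted by its multiplicity.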

General Form of Moments
Using Eq. 21 combined with the form of the moment-generating function given in Eq. 31, we can write the first and second moments as sums over sketches. The explicit expression for $f_q$ (Eq. 28) shows that $f_q$ is a degree-$q$ polynomial in the variables $(n_{+-} - n_{-+})$, $(n_{++} - n_{--})$, and $n$, so we can expand $f_q$ in monomials of these three variables (Eq. 34), where the coefficients $f_q^{abc}$ are constants independent of $n$. In terms of this expansion, the first and second moments take the forms of Eqs. 35 and 36. Ref. [2] could explicitly evaluate these terms for the small values of $a$ and $b$ relevant for the two-body SK model, using concise expressions for $f_2$ and $g_2$. However, to get explicit formulas beyond $q = 2$ requires carefully counting powers of $n$ to establish which terms survive in the $n \to \infty$ limit, and using the general expressions for $f_q$ and $g_q$ (Eqs. 28 and 29) to derive explicit forms of the leading-order terms. To tame this sum, our derivation thus uses techniques that go beyond a simple generalization of [2].
For $B_t^b$, we use the representation of Eq. 43. The highest-order terms in $n$ come from the terms $\sum_{k+\ell=b} \binom{b}{k} x^k (-y)^\ell \partial_x^k \partial_y^\ell$ in $(x\partial_x - y\partial_y)^b$, which leads to Eq. 44. Plugging our Eqs. 42 and 44 for $A_t^a$ and $B_t^b$ into Eq. 38 for $T_\xi^{ab}$ hence gives Eq. 45. As $\binom{n}{a} = n^a/a! + O(n^{a-1})$ and similarly $\binom{n}{t} = O(n^t)$, this simplifies to Eq. 46. After reminding ourselves that $a + b \leq \xi$ we see that, in the $n \to \infty$ limit, $T_\xi^{ab}$ further condenses to Eq. 47. As $\bar{g}_q(a)$ has an implicit $n$ dependency, we have to determine its relevant terms as $n \to \infty$. Eq. 47 shows that the only relevant terms in $\bar{g}_q$ are those of degree at least $(q - 1)$ in $n$. From the definition of $g_q$ (Eq. 29), elementary algebra gives the leading behavior of $\bar{g}_q$ (Eq. 48). Therefore, we obtain the large-$n$ limit of Eq. 49.

Large $n$ properties of $f_q$
To complete the evaluation of the moments, we need to determine the large-$n$ dependency of $f_q$, its coefficients $f_q^{abc}$ (Eq. 34), and their role in the first and second moment expressions of Eqs. 35 and 36. We see from Eq. 49 that the only coefficients that matter in the $n \to \infty$ limit are those with $a + b = \xi$, i.e. those with $a + b + c = q$. We can evaluate these $f_q^{abc}$ from the explicit formula for $f_q$ (Eq. 28) by keeping only terms of total degree $q$ in the variables $(n_{+-} - n_{-+})$, $(n_{++} - n_{--})$, and $n$. This gives the coefficients of Eq. 51, with a separate case for $a + b + c = q$ with $a$ even or $c > 0$.
Having established these properties of $T_{q-c}^{ab}$ and $f_q$, we now have sufficient information to evaluate our moments.

Evaluating the First Moment
To evaluate the first moment, we simplify Eq. 35 using the definition of $T_\xi^{ab}$ (Eq. 37) and the results of Eqs. 49 and 51, which yields the first-moment formula of Eq. 6. Thus, we find a closed-form expression for the $m$-th moment for all $m$.

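Alternative expression for $E_J[\langle H/n \rangle]$
Here, we present an alternative expression for $E_J[\langle H/n \rangle]$ that makes contact with the notation used in [5] and elsewhere (see Section 3.2). The current paper's central result (Eq. 6) is expressed in terms of the standard deviations $\sigma_q$. If we use the substitution from Section 3.2, $\sigma_q^2 = c_q^2 \, q!$, it becomes an expression in the coefficients $c_q$, which an algebraic identity then lets us rewrite in terms of the mixture function $\xi$, as in Eq. 7.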

Discussion and Conclusion
In this work, we have derived explicit formulas to quantify the performance of p = 1 qaoa on mixed-spin models in the n → ∞ limit. We demonstrated both concentration of measure and instance independence for arbitrary mixed-spin models, which imply that the expectation value of the energy per spin is independent of the specific model specification and that measurements of the qaoa state are guaranteed to give energies close to the expectation value. Our explicit formula for the expectation value of the energy for arbitrary mixed-spin models allows us to find the optimal angles on a classical computer. There are two obvious open questions raised by this work. First, the approach of this paper can probably be combined with the methods of [2] to generalize our work to depth p > 1 qaoa. Such a result would allow one to prove instance independence and concentration of measure, and derive a computer algorithm to generate formulas for the expectation value per spin, at arbitrary depth p. This is a particularly interesting route of research, since it is known that in the cubic case with σ q ∝ δ q,3 , Montanari's classical algorithm does not approach the optimal solution [5], so that at sufficient depth p the qaoa has a chance of outperforming the best known classical algorithm. Higher-spin models with q > 3 may even provide a more direct route towards finding a setting where qaoa at depth p = 1 outperforms classical optimization such as Montanari's algorithm [4,5]. While the generalization to higher p is likely possible, it is a nontrivial extension of this paper, and we leave it for future work.
Second, it remains an open question to what extent results on the random models can be used to find optimal angles for realistic binary optimization problems. One hypothetical approach to finding QAOA angles for a single instance of an $n$-spin optimization problem would be the procedure: