Hybrid quantum-classical algorithms for approximate graph coloring

We show how to apply the recursive quantum approximate optimization algorithm (RQAOA) to MAX-$k$-CUT, the problem of finding an approximate $k$-vertex coloring of a graph. We compare this proposal to the best known classical and hybrid classical-quantum algorithms. First, we show that the standard (non-recursive) QAOA fails to solve this optimization problem for most regular bipartite graphs at any constant level $p$: the approximation ratio achieved by QAOA is hardly better than assigning colors to vertices at random. Second, we construct an efficient classical simulation algorithm which simulates level-$1$ QAOA and level-$1$ RQAOA for arbitrary graphs. In particular, these hybrid algorithms give rise to efficient classical algorithms, and no benefit arising from the use of quantum mechanics is to be expected. Nevertheless, they provide a suitable testbed for assessing the potential benefit of hybrid algorithm: We use the simulation algorithm to perform large-scale simulation of level-$1$ QAOA and RQAOA with up to $300$ qutrits applied to ensembles of randomly generated $3$-colorable constant-degree graphs. We find that level-$1$ RQAOA is surprisingly competitive: for the ensembles considered, its approximation ratios are often higher than those achieved by the best known generic classical algorithm based on rounding an SDP relaxation. This suggests the intriguing possibility that higher-level RQAOA may be a potentially useful algorithm for NISQ devices.


Introduction
Combinatorial optimization is currently considered one of the most promising areas of application of near-term quantum devices. One of the primary restrictions of such devices is the fact that they may only execute short-depth circuits with reasonable fidelity. This is a consequence of the lack of sophisticated fault-tolerance mechanisms. Variational quantum algorithms such as the quantum approximate optimization algorithm (QAOA) [7] can deal with this restriction because parameters such as the circuit depth/architecture can be chosen according to existing experimental restrictions. This is in contrast to more involved quantum algorithms e.g., for algebraic problems: these may involve circuit depths scaling with the problem size and have additional requirements on the connectivity of available inter-qubit operations.
QAOA and similar proposals are hybrid algorithms: Here we envision operations on the quantum device to be supplemented by efficient, that is, polynomialtime classical processing. As a result, the algorithmic capabilities of such a hybrid setup should be compared to the class of efficient (polynomial-time) classical probabilistic algorithms. This means that corresponding proposals face stiff competition against decades of algorithms research in classical computer science. While there are complexity-theoretic arguments underscoring the power of e.g., constant-depth quantum circuits, the jury is still out on whether hybrid algorithms for near-term devices can indeed provide a provable computational advantage over comparable classical algorithms.
Recent no-go results show limitations of variational quantum algorithms for the well-studied MAX-CUT problem: in [3], we showed that the Goemans-Williamsonalgorithm -the best known classical algorithm for this problem -outperforms QAOA for any constant level p (which amounts to constant-depth for boundeddegree graphs) in terms of the achieved approximation ratio. This result extends to more general, possibly non-uniform Z 2 -symmetric hybrid local quantum algorithms, and shows that corresponding circuits will need a circuit depth growing at least logarithmically with the problem size to yield a better approximatio ratio. More recently, Farhi, Gamarnik and Gutmann [6] exploited the spatial uniformity of the QAOA algorithm to give an extremely elegant argument demonstrating a similar logarithmic-depth lower bound for QAOA for random d-regular graphs. The title of reference [6] aptly summarizes this conclusion using the words "the QAOA needs to see the whole graph".
To go beyond these negative results, one could attempt to simply use logarithmicdepth circuits with the reasoning that for small to intermediate problem sizes, the corresponding circuit depths may still be amenable to realization by a near-term device. In addition to the problem of fault-tolerance, the potential merits of this idea are unfortunately difficult to assess by means of classical simulation. Indeed, the very idea that this approach yields computational benefits over classical algorithms is in tension with efficient classical simulability of corresponding quantum processes.
An alternative to this necessarily limited approach is to try to find new ways of leveraging the information-processing capabilities of short-depth circuits by introduction of non-local (classical) pre-and post-processing steps, thereby sidestepping the key assumption of locality in the aforementioned no-go results. As long as one restricts to procedures where these additional processing steps can be executed efficiently (e.g., in polynomial time) and the quantum device is used only a polynomial number of times, the resulting hybrid algorithm still has the feature of being efficiently executable with near-term hardware. At the same time, it may provide additional descriptive power, ultimately (hopefully) resulting in improved approxi-mation ratios.
The recursive quantum approximate optimization algorithm (RQAOA) is a proposal of this kind which makes use of QAOA (with a constant level p) in a recursive fashion with the goal of improving approximation ratios. Numerical evidence obtained for MAX-CUT on random graphs indicates that even level-1 RQAOA significantly improves upon QAOA (at the same level) for the MAX-CUT problem [3]. While the problem of establishing a rigorous lower bound on the approximation ratio achieved by this hybrid algorithm remains open (except for very special families of instances), this indicates that RQAOA -especially at higher levels p -has the potential to yield results competitive with the best known classical algorithms.
Here we consider the MAX-k-CUT problem, an optimization version of the graph k-coloring problem. Suppose G = (V, E) is a graph with n = |V | vertices and e = |E| edges. Given an integer k ≥ 2, the goal is to find an approximate k-coloring of vertices of G which maximizes the number of edges whose endpoints have different colors. For each vertex j ∈ V , introduce a variable x j ∈ Z k which represents a color assigned to j. The k-coloring cost function to be maximized is defined as The performance of a k-coloring algorithm on a given graph G is usually quantified by its approximation ratio α, that is, the ratio between the expected value of the cost function C(x) on a coloring x produced by the algorithm and the maximum value max x C(x). The MAX-k-CUT problem can also be viewed as an anti-ferromagnetic k-state Potts model. The standard MAX-CUT problem corresponds to k = 2. The problem is well-studied. Consider first the special case where G is a k-colorable graph. Clearly, a uniformly random assignment of colors x achieves an approximation ratio of 1 − 1/k on average. For the case where k is a power of two, Cho, Raje and Sarrafzadeh [11] constructed an O((e + n) log k)-time algorithm which achieves an approximation ratio of 1 − 1/k(1 − 1/n) log k , improving upon a deterministic O(enk)-time algorithm [21] achieving the same ratio 1 − 1/k as random coloring (and obtained by derandomizing the latter). In the general case, when G may not be k-colorable, Frieze and Jerrum [8] gave an algorithm achieving an approximation ratio 1 − 1/k + 2 log k/k 2 for arbitrary sufficiently large k. This is known to be close to optimal since no polynomial-time algorithm can achieve an approximation ratio better than 1 − 1/(34k) unless P = N P [12], and is indeed optimal if one assumes the Unique Games Conjecture [13]. Frieze and Jerrum's algorithm is based on an SDP relaxation and a randomized rounding scheme inspired by Goemans and Williamson's algorithm for MAX-CUT [10] and comes with detailed estimates of the approximation ratio α k achieved by the algorithm, namely Here the bound on α 2 matches the Goemans-Williamson algorithm for MAX-CUT. This was further improved by a new algorithm by Klerk et al. [14] with a guaranteed approximation ratio of Their algorithm is based on the properties of the Lovasz ϑ-function and achieves the best currently known approximation ratio (for polynomial-time algorithms) for k = 3. Goemans and Williamson themselves [9] independently introduced an algorithm based on so-called "complex semidefinite programming" that matches this ratio for k = 3. Unfortunately, the analysis involved in establishing the bound (2) is significantly more complicated than the MAX-CUT case and is not known to be generalizable to arbitrary k. Fortunately, Newman [17] describes a simple rounding procedure that leads to an algorithm provably matching the bound (2) while only being slightly worse than Frieze/Jerrum for larger values of k. Finally, it should also be noted that dense 3-colorable graphs can be 3-colored in (randomized) polynomial time [1].
Here we study approximation ratios achieved by QAOA and RQAOA, respectively for MAX-k-CUT with k > 2, and compare these to those achieved by the best known classical approximation algorithms for this problem discussed above.
Outline In Section 2, we discuss how to formulate QAOA and RQAOA for the MAX-k-CUT-problem. In Section 3, we establish limitations on constant-level QAOA. In Section 4, we describe an efficient classical simulation algorithm for simulating level-1 QAOA and level-1 RQAOA. In Section 5 , we discuss our numerical findings comparing level-1 QAOA and level-1 RQAOA with the best known classical algorithm for MAX-k-CUT.

Hybrid algorithms for MAX-k-CUT
Here we give a brief description of how QAOA and RQAOA can be adapted to apply to MAX-k-CUT for any k ≥ 2.

QAOA to MAX-k-CUT
Solving MAX-k-CUT by level-p QAOA (in the following denoted by QAOA p ) proceeds in an established fashion as in the case of MAX-CUT (i.e., k = 2). One notable feature is that in the MAX-k-CUT problem, it is natural to work with n qudits of dimension k each instead of qubits (all quantum circuits considered below can be simulated on a standard qubit-based quantum computer with a constant-factor overhead). We use the n-qudit cost function Hamiltonian where J i,j (b) are real coefficients and is a diagonal projector acting on C k ⊗ C k . Here and below the addition of color indices is performed modulo k. The subscripts i, j in Π i,j (b) indicate which pair of qudits is acted upon by Π(b). By definition, Equation (3) defines a general class of cost function Hamiltonians. The MAXk-CUT problem on a graph G = (V, E) associated with the cost function (1) corresponds to the choice of coefficients The Hamiltonian C defined Eq. (3) commutes with the symmetry operator X ⊗n , where is the generalized Pauli-X operator. This is analogous to the Z 2 -symmetry of the Ising model exploited in [3]. Motivated by the corresponding ansatz for MAX-CUT, we generalize the level-p QAOA ansatz to qudits as where the level-p QAOA unitary is given by with β = (β (1) , . . . , β (p) ) ∈ (R k ) p , γ = (γ (1) , . . . , γ (p) ) ∈ R p and where for β ∈ R k , the unitary B(β) : C k → C k is diagonal in the eigenbasis of X and given by In this expression, is the generalized Pauli-Z operator, whereas |+ ∈ C k is the +1 eigenvector of X, that is, (ii) then measuring an energy-maximizing state |ψ * = |ψ(β * , γ * ) in the computational basis.
The output of this process is a coloring x ∈ Z n k achieving an expected approximation ratio This concludes the description of QAOA p .

Adapting RQAOA to MAX-k-CUT
The RQAOA proceeds by successively reducing the size of the problem, eliminating a single variable in each step by a procedure called correlation rounding. To formulate RQAOA for MAX-k-CUT, we begin by noting that the family of Hamiltonians defined by (3) is closed under variable eliminations (up to irrelevant additive constants).
Level-p RQAOA proceeds by iterative applications of several single variable elimination steps as described by (i) and (ii) described below. We note that in the very first variable elimination step, the scalar M i,j (b) (for a fixed pair of vertices (i, j)) computed in (i) depends only on whether or not b is non-zero. This is a consequence of the specific form (4) of the coefficients J i,j (b) in the (initial) MAX-k-CUT cost function Hamiltonian C (Eq. (3)). However, this is no longer necessarily the case after one or more variable elimination steps have been completed since the cost function Hamiltonian is updated according to (ii).
(ii) Next, find a pair of vertices (i, j) and a color b ∈ Z k with the largest magnitude of M i,j (b) (breaking ties arbitrarily). Then impose the constraint restricting the search space to the span of computational basis vectors |x associated with colorings x ∈ (Z k ) n satisfying (7). Observe that |ψ has support on such basis states if and only if M i,j (b) = 1. To eliminate the variable x i , the constraint (7) is inserted into the cost function Hamiltonian as follows: use the identity which holds for all h ∈ {i, j} and all a ∈ Z k . Thus in the cost function Hamiltonian for all h ∈ {i, j} one gets a new Hamiltonian C of the form (3) (up to an additive constant) acting on n − 1 variables, i.e., C acts trivially on the i-th qudit. The cost function C is defined on a graph G with n − 1 vertices obtained from G by identifying the vertices i and j.
By construction, the maximum energy of C coincides with the maximum energy of C over the subset of assignments satisfying the constraints (7). Since the new Hamiltonian C acts trivially on the qudit i, this qudit can be removed from the simulation. This completes the variable elimination step. RQAOA p executes several variable elimination steps in succession, eliminating one variable in each recursion until the number of variables reaches a predefined cutoff value n c . The remaining Hamiltonian then only depends on n c variables is minimized by a purely classical algorithm, for example brute-force search. An (approximate) solution x ∈ Z n k of the original problem can then be obtained recursively by reconstructing eliminated variables using the constraints (7).
Since the introduction of RQAOA in [3], other modifications of QAOA involving iterated rounding procedures have been proposed, see e.g., [16, Section V.A]. In contrast to the variant discussed there, RQAOA's rounding procedure is deterministic and relies on rounding correlations between qudits rather than individual spin polarizations.
Further variations of RQAOA have been proposed and studied in [5,20]. There, the authors consider "warm-starting" the algorithm by beginning with a solution returned by an efficient classical algorithm (for the case of MAX-CUT, the Goemans-Williamson algorithm) instead of the standard product state. They provided numerical evidence that both QAOA and RQAOA can achieve better performance when supplemented with "warm-starting", making this a promising potential avenue for future investigations.

Limitations of QAOA p applied to MAX-k-CUT
Limitations for level-p QAOA with p = O(log n) applied to MAX-CUT were obtained for ensembles of random d-regular graphs in [6]. Here we show analogous results for QAOA applied to MAX-k-CUT for k > 2. Our main result is the following theorem. It shows that unless the level p of QAOA grows at least logarithmically with n, QAOA cannot be more than marginally better (for large n and d) than randomly guessing a coloring, for most d-regular bipartite graphs. We denote by G bi n,d the ensemble of uniformly random d-regular bipartite graphs on n vertices. For a graph G, we denote by an optimal set of angles, and where C(G) is the cost function Hamiltonian for G defined by Eq. (3).
and all levels p < 1 2 log d n. Our approach generally follows the basic idea of [6] and also exploits the fact that QAOA is a local, and furthermore uniform algorithm. In contrast to the main result of [6], which is expressed as an upper bound on the expected approximation ratio (over the choice of graph) achieved by QAOA, Theorem 3.1 shows that the approximation ratio is upper bounded for almost all graphs in the considered ensemble. Analogous to the setting of [6], where it is shown that the approximation ratio converges to 1/2 in the limit of large d, we show that the limiting value is the approximation ratio achieved by random guessing. In the analysis of [6], the explicitly known expected maximal cut size for random d-regular graphs is used. The analogous value for MAX-k-CUT is not known for random d-regular graphs. Instead, we rely on an upper bound on the size of the maximum k-cut for typical d-regular random graphs obtained in the analysis of an SDP-relaxation from [4]. In our analysis, we pay special attention to the graph-dependence of the optimal QAOA angles when computing QAOA approximation ratios: here we make use of the fact that the figure of merit (the average energy) for different typical graphs differs only by a negligible amount for any fixed angles.
Given a graph G = (V, E), let (i, j) = e ∈ E be an edge in G. We write N p (e) to denote the p-local neighborhood of e, i.e., the subgraph induced by the set of all vertices h such that d(h, i) ≤ p or d(h, j) ≤ p. Let T p,d denote the tree such that all vertices except the leaves are d-regular and such that there exists an edge (i, j) such that all leaves are distance p from either i or j. Let n T p,d (G) denote the number of edges e in a graph G such that N p (e) ∼ = T p,d . The key insight of [6] is the fact that for p sufficiently small compared to n, it holds with high probability that almost all p-local neighborhoods of a random d-regular graph look like T p,d . This is expressed by the following lemma, where we denote by G n,d the uniform distribution over d-regular graphs on n vertices. Lemma 3.2. [6] Let n, d and A ∈ (0, 1) be given. Suppose that Then there exists some constant A ∈ (A, 1) such that In particular, by Markov's inequality, Moreover, the same result also holds for the ensemble G bi n,d of random d-regular bipartite graphs.
The proof of Lemma 3.2 relies on the fact that the number of cycles of length in a random d-regular graph is Poisson distributed, allowing an upper bound to be established on the expected number of cycles below a certain length [23]. This latter bound applies to both random d-regular graphs as well as random bipartite d-regular graphs. Both ensembles will be of relevance in the proof of Theorem 3.1.
Let us now fix some B ∈ (A , 1). We will say that a graph G on n vertices is T -typical if n T p,d (G) < n B . Lemma 3.2 says that a random (possibly bipartite) d-regular graph G is T -typical with high probability. Since QAOA is a local, and moreover uniform, algorithm, the performance of QAOA p on any given graph can be related to the performance of QAOA p on the induced p-neighborhood of each edge. Lemma 3.2 therefore suggests that to study QAOA p for generic d-regular graphs, it suffices to consider the behavior of QAOA p on T p,d .
Let now C(G) be the cost function Hamiltonian for the MAX-k-CUT problem associated with G (see Eq. (3)), and let ψ(θ) denote the level-p QAOA state defined by Eq. (5), where we use the shorthand θ = (β, γ) for the collection of all angles. Then the expectation value for MAX-k-CUT QAOA p on G = (V, E) can be written as where C i,j is the term in the Hamiltonian (3) acting non-trivially on both qudits i and j. Note that due to the locality of the QAOA unitary, each of the individual terms ψ(θ)|C i,j |ψ(θ) in this expression a function of the subgraph N p (ij) and θ. Now, let G be a d-regular T -typical graph. Each of the local terms ψ(θ)|C i,j |ψ(θ) is bounded above by 1 since C i,j is a projection. Since for a T -typical graph G, all but n T p,d (G) < n B of the edges satisfy N p (e) ∼ = T p,d , it follows that In this expression, the function C T (θ) = ψ(θ)|C i ,j |ψ(θ) is the local term associated with any one edge (i , j ) such that N p (i j ) ∼ = T p,d . Importantly, the function C T (·) depends on (p, d) only, i.e., it is universal for all T -typical graphs G.
An important consequence is that the maximal expectation value achieved by level-p QAOA is roughly identical for all T -typical graphs G: if G 1 and G 2 are T -typical, then This follows from the fact that for any set of angles θ (12) as a consequence of (10), and the inequality f ∞ − g ∞ ≤ f −g ∞ . Specializing Eq. (12) to θ = arg max θ ψ(θ)|C(G 1 )|ψ(θ) also shows that a choice of optimal angles for any given T -typical graph G 1 will also be nearly optimal for all other T -typical graphs G 2 . We are going to establish an upper bound on the performance of QAOA on Ttypical graphs. To this end, we use the following upper bound on the typical size of the maximum k-cut of a graph G in the ensemble G n,d of d-regular graphs. Then the following holds: Theorem 19] There exists constants λ, ζ > 0 such that Proof. Let T denote the set of T -typical graphs, and let us abbreviate . We show that a randomly chosen d-regular graph G ∼ G n,d satisfies the desired properties (i.e., G ∈ T and Eq. (14)) with non-zero probability. Indeed, we have by the union bound We show that sum on the rhs is strictly less than 1 for a suitable choice of B ∈ (0, 1) in the limit n → ∞, implying the claim. Indeed, by Eq. (9) of Lemma 3.2, there is some A ∈ (A, 1) such that On the other hand, we have with Markov's inequality by Taylor series expansion. Note that the latter is well-defined, i.e., n B /E → 0 for n → ∞ for all B ∈ (0, 1), since we have E = Ω(n). This is because a max k-cut is at least as large as a max 2-cut, and the expected max 2-cut scales as Θ(n). Inserting (13), we conclude that Combining Eqs. (15), (16), (17) and comparing exponents of n, we conclude that The Lemma therefore holds for any choice of B ∈ (1 − B + A , 1).
The first inequality follows from the fact that the expected value of the MAX-k-CUT cost function for colorings returned by QAOA cannot exceed the actual maximum. From inequality (13), we then have the bound From the uniform bound Eq. (11) between the maxima of T -typical graphs, the same bound holds for any T -typical graph G with the same n and d as G * , that is To obtain a bound on the approximation ratio, we can focus our attention on d-regular bipartite graphs, which are guaranteed to have a maximum k-cut of size nd/2 for any k. Since Lemma 3.2 applies equally to random bipartite graphs, so that a random bipartite d-regular graph is also T -typical with probability 1 − o(1). Letting G be a T -typical bipartite graph with n and d satisfying the hypothesis of Theorem 3.3, the bound (18) holds for G. Dividing through by nd/2, the approximation ratio α MC k p (G) is therefore bounded above by This concludes the proof of Theorem 3.1.

Classical simulation of level-1 RQAOA
A polynomial-time classical algorithm for computing expectation values ψ(β, γ)|Z j Z k |ψ(β, γ) for level-1 QAOA states |ψ(β, γ) and the MAX-CUT cost function was given by Wang et al. [22]. Our work [3] describes a generalized version of this algorithm applicable to any Ising-type cost function.
In this section, we consider the k-coloring cost function Hamiltonian (3) and more generally Hamiltonians of the form (3) and give a classical algorithm for computing expectation values ψ|Z r u Z s v |ψ , where |ψ = |ψ(β, γ) is the level-1 QAOA state defined in Eq. (5) and Z u is the generalized Pauli-Z operator acting on a qudit u ∈ [n]. The algorithm has runtime O(k 5 (d u + d v )), where d j is the degree of a vertex j. For a constant number of colors k, this scales at most linearly with n. For constant-degree graphs, the computation requires a constant amount of time.
A natural application of our algorithm is finding optimal angles (β, γ) for level-1 QAOA states. Indeed, since the variational energy is a linear combination of the expected values ψ|Z r u Z s v |ψ , it can be efficiently computed classically using our algorithm. This eliminates the need to prepare the variational state |ψ on a quantum device for each intermediate choice of the angles (β, γ) in the energy optimization subroutine. In fact, the technique discussed here can easily be adapted to also efficiently compute partial derivatives ∂ ∂γ ψ|Z r u Z s v |ψ and ∂ ∂β j ψ|Z r u Z s v |ψ , thus sidestepping the need of using a quantum device even if the latter is used to estimate the gradient of the cost function. However, once the optimal angles (β, γ) are found, a quantum device would be needed to prepare the optimal variational state and perform a measurement to obtain a classical solution of the problem. Since the final measurement is not used in the recursive version of QAOA, our algorithm enables efficient classical simulation of level-1 RQAOA in its entirety.

General algorithm for level-1 QAOA-type expectation values
Our formulation extends beyond our QAOA Ansatz for MAX-k-CUT and could be used in other contexts. Specifically, let C be a 2-local Hamiltonian on n kdimensional qudits given by where we assume that the terms C u,v and C u ,v acting on different qudits u < v and u < v commute pairwise. We will also write C v,u ≡ SWAPC u,v SWAP † for the interaction term C u,v acting on qudits (v, u) (in this order) when u < v. Let |+ ∈ C k be a qudit state. We are interested in generalized QAOA states of the form |Ψ(β, γ) = B(β) ⊗n e iγC |+ ⊗n (19) where B(β) : C k → C k is a unitary on C k for any β ∈ R k . We assume that B(−β) = B(β) † is the adjoint of B(β). The states defined by Eq. (5) are a special case.
Let O be any two-qudit observable. Our goal is to compute the expectation value where the subscripts u, v indicate the pair of qudits acted upon by O. Define the density matrix One can compute ρ u,v in time O(n) by initializing the pair of qudits u, v in the state e iγHu,v |+ ⊗2 and sequentially coupling {u, v} to each of the remaining qudits w in the following way: qubit w is initialized in the state |+ and then coupled to {u, v} by the respective terms in the cost function C, namely e iγ (Cu,w+Cv,w) . Finally, w is traced out. The algorithm for computing ρ u,v can be summarized as follows: where E w is a two-qudit quantum channel defined as E w (η) = Tr 2 e iγ(Cu,w+Cv,w) (η ⊗ |+ +|)e −iγ (Cu,v+Cv,w) ) .
The final state ρ u,v involves n − 2 applications of E w in general. By restriction to w which interact non-trivially with {u, v}, this can be reduced to fewer than d u + d v applications, where d u is the degree of u in the interaction graph associated with C (i.e., (u, w) is an edge if and only if C u,w = 0). This improvement applies for example to the MAX-k-CUT cost function Hamiltonian (3) for a bounded-degree graph. Finally, the definition (19) of |ψ(β, γ) together with the assumed commutativity of the terms C u,v in the Hamiltonian give that the expectation value (20) can be computed according to This computation is illustrated in Fig. 1. To find the k-dependence of the overall complexity of this algorithm, we need to consider the evaluation of the superoperators E w defined by (21) and the expression (22) for the problem at hand.

Simulating QAOA 1 for MAX-k-CUT
Here we specialize the algorithm from Section 4.1 to the MAX-k-CUT problem. For this problem, B(β) is given by Eq. (6) whereas C (cf. Eq. (3)) is a diagonal 2-local Hamiltonian commuting with X ⊗n . The most general diagonal 2-local Hamiltonian C commuting with X ⊗n has the following form: Let {h(a)} a∈Z k be a family of n × n-matrices satisfying for all u, v ∈ [n] and r ∈ Z k . Then C can be written as  Note that where the Fourier transformĥ(b) of h is defined as a hermitian n × n matrix with entriesĥ A straightforward computation using (25) then gives that the superoperator E w (cf. Eq. (21)) is equal to

Comparison of QAOA, RQAOA and classical algorithms
With the algorithm proposed in Section 4, we have simulated level-1 QAOA and level-1-RQAOA on large problem instances of MAX-k-CUT. Our focus is on (approximate) 3-colorability, i.e., k = 3. To compare these hybrid algorithms to classical algorithms, we have also applied the best known efficient classical approximation algorithm (as measured by the approximation ratio) to each problem instance, see Section 1 for an overview of these algorithms. Concretely, we generate random d-regular, 3-colorable connected graphs on n vertices according to an ensemble G[d, n] defined as follows. Here we assume d < 2n/3 to be even, and n to be a multiple of 3. A graph G is drawn from G[d, n] as follows: 1. Define a partition [n] = V 1 ∪ V 2 ∪ V 3 into three pairwise disjoint subsets of each size |V j | = n/3 for j = 1, 2, 3.

2.
For each pair r < s with r, s ∈ {1, 2, 3}, generate a random bipartite d-regular graph with vertex set V r ∪ V s and bipartition V r : V s . This graph is generated by iteratively going through each v ∈ V r , and adding edges (v, w 1 ), . . . , (v, w d ) where {w 1 , . . . , w d } ⊂ V s are chosen uniformly at random among those vertices in V s that (currently) have degree less than d. If at some point during this process, no such vertices are available, the generation of the bipartite graph is restarted.
3. Check whether the obtained graph contains a complete graph on 3 vertices (i.e., a triangle) and is connected. If either of these properties is not satisfied, start over.
By definition, a graph G = (V, E) drawn from G[d, n] is 3-colorable and thus the cutsize of the maximum 3-cut is equal to C max = |E| = nd/2. In particular, expected approximation ratios for QAOA 1 can be immediately computed from the expectation ψ|C|ψ of the cost function Hamiltonian. Similarly, for any approximate coloring x ∈ Z n 3 produced by RQAOA 1 (or any algorithm, for that matter), the achieved approximation ratio is C(x)/C max . We use the efficient classical algorithm by Newman [17] for comparison, as it has the best approximation guarantee (worst-case bound) among all known efficient classical algorithms for k = 3, see Section 1. Note that the theoretical lower bound 0.836008 on the expected approximation ratio of the classical algorithm [17] used here matches the one established for the algorithm in [14] in the case k = 3 and the algorithm in [9]. Details for implementation. We empirically observed that the energy landscape contains local maxima and stationary points which often prevent gradient descent in QAOA and RQAOA from finding the optimal solution. The presence of many points with a negligible gradient is a well-studied phenomenon under the name of barren plateaus [15]. Several promising strategies for alleviating this problem have been previously explored [18,24,19].
For the specific case of level-1 RQAOA applied to MAX-3-CUT, the energy function to be optimized depends on four arguments, three of which can be efficiently computed if the fourth one is known. This makes grid search an attractive and cheap alternative to gradient descent as there is only one dimension to be searched. It allows us to sidestep potential convergence problems with gradient descent and to study large graphs. In more detail, the energy E(β, γ) = ψ(β, γ)|C|ψ(β, γ) of the cost function Hamiltonian is a function of β = (β 0 , β 1 , β 2 ) ∈ R 3 and γ ∈ R. However, as we show in Appendix A, for any fixed value γ ∈ R, parameters β ∈ R 3 maximizing the function β → E(β, γ) can be found efficiently by computing roots of a degree-4 polynomial and performing a binary search on the unit circle. This significantly reduces the dimensionality of the optimization problem: only an interval of values γ ∈ R needs to be searched. Because the function to be optimized further satisfies E((β 0 , β 1 , β 2 ), γ) = E((−β 0 , −β 2 , −β 1 )), −γ) as C is self-adjoint and real, and B((β 0 , β 1 , β 2 )) = B((−β 0 , −β 1 , −β 2 )), it further suffices to restrict the grid search to the interval γ ∈ [0, π).
We thus chose 50 equidistant grid points γ 1 , . . . , γ 50 in the interval [0, π] for γ. After finding the grid point γ s that minimizes the cost function (see Appendix A), we performed another refined grid search with 50 additional equidistant grid points on the interval [γ s−1 , γ s+1 ].
In our numerical experiments we find that optimal angles for QAOA 1 for the considered random ensembles of graphs concentrate around certain values. This is in line with the analytical findings of [2] for QAOA: there it is shown that for random ensembles of 3-regular graphs, frequencies of certain local subgraphs concentrate, yielding a simple dependence of the figure of merit on the angles. Similar observations are also used in the proof of our Theorem 3.1. Numerical results. In Figures 2, 3, 4 and 5, we illustrate obtained achieved approximation ratios of QAOA 1 , RQAOA 1 and the best approximation ratio of the classical algorithm by Newman [17] over 100 samples generated in the correlation rounding step (magenta, green and blue bars). For Newman's algorithm, we additionally provide the empirical average and standard deviation of the samples, which are indicated through error bars. Also, the theoretically expected approximation ratio of α Newman = 0.836008 for Newman's algorithm is shown.

Choice of parameters.
As expected, we find that RQAOA 1 significantly outperforms QAOA 1 for the considered family of graphs. Their performance deteriorates with increasing degree d (although RQAOA 1 is less susceptible to this). This is consistent with our nogo result (Theorem 3.1), which applies to random d-regular graphs of sufficiently large (but also constant) degree d. We note, however, that Theorem 3.1 does not immediately pertain to the numerical examples considered here because of the (nonexplicit) lower bound on d and its asymptotic nature (as a function of n).
More interesting is the comparison to Newman's classical algorithm. Being a randomized algorithm, it is natural to consider the empirical means and variance of the achieved approximation ratio (although in practice, only the best coloring among a constant number of runs would be used). An analytic understanding of the exact behavior of Newman's algorithm appears to be a difficult problem. As already pointed out by Goemans and Williamson in their seminal paper [10], even computing the variance analytically appears to be challenging.
In our numerical experiments we find that the variance of Newman's algorithm increases with d and decreases with n. In cases where the variance is large, among the 100 trials considered, we typically find an essentially optimal solution 1 . We note, however, that the empirical average approximation ratio appears to be close to the theoretical worst-case guarantee α Newman .
With the observations above we can distinguish two regimes when comparing the performance of RQAOA to Newman's algorithm: In the first regime (see e.g., Figs. 3c and 4d), where Newman's algorithm has high variance, we find that the optimal solution returned is superior to the one provided by RQAOA. Nevertheless and perhaps surprisingly, the coloring output by RQAOA (which is a deterministic algorithm up to estimation errors for expectation values when used in practice) outperforms the average approximation ratio of Newman's algorithm.
In the second regime (see e.g., Figs. 2a and 5a), Newman's algorithm is strongly concentrated about its average. In this case, RQAOA 1 outperforms virtually all instances returned by Newman's algorithm. This regime includes instances with a large number of vertices, indicating that RQAOA may be a useful heuristic algorithm for problems of practical interest.
Larger levels p > 1 may further increase achieved approximation ratios of RQAOA. Unfortunately, exploring this may require new ideas or actual quantum devices. This is suggestive of RQAOA being a promising candidate for NISQ applications.

Discussion and Outlook
We formulated a variation of QAOA using qudits which is applicable to non-binary combinatorial optimization. We study its applicability by considering the MAXk-CUT problem with k > 2. We show that this version of QAOA shares certain limitations with the original proposal for qubits: its expected performance on random d-regular k-colorable graphs (for d = o( √ n)) approaches random guessing in the limit of large d. We give an efficient classical simulation algorithm for its level-1 version. This method extends to problem Hamiltonians with pairwise commuting 2-local terms. The algorithm can also be used to simulate level-1 recursive QAOA (RQAOA), a hybrid classical-quantum algorithm designed to overcome the limitations of standard QAOA.
While efficient simulability precludes the possibility of a quantum advantage compared to the best efficient classical algorithm, it offers a crucial window into the performance of QAOA and RQAOA. We use our simulation algorithm to perform numerical experiments on graphs with up to 300 vertices, obtaining approximation ratios for MAX-3-CUT achieved by level-1 QAOA and RQAOA. We compare these results to the best known efficient classical algorithms for MAX-3-CUT, specifically the recent algorithm by Newman [17]. Our experiments show that RQAOA 1 (which significantly outperforms QAOA 1 ) is competitive with Newman's classical algorithm for generic k-colorable graphs. See Section 5 for a detailed discussion.
Our observations suggest that RQAOA may be a viable application for NISQ devices combined with efficient classical computation: it may be applicable to problem sizes of real-world relevance. Our results are indicative of the potential of RQAOA at larger levels p, which may no longer be simulable classically, and which hold promise. Future work may seek to establish performance guarantees for RQAOA p akin to the rigorous results available for existing classical algorithms. As an example, we proved in [3] that for the "ring of disagrees", RQAOA 1 achieves the optimal approximation ratio. A next step in the analysis of RQAOA would be to find more interesting classes of graphs for which achievability results can be established.

A MAX-3-CUT angle optimization without gradient descent
First let us write the expectation value (22) in the form that makes its dependence on β more explicit. Recall from (6) that for each a ∈ Z k , the X-eigenstate |φ a ≡ Z a |+ is an eigenvector of the unitary B(β) to eigenvalue e iβa . Since the vectors {|φ a } a∈Z k form an orthonormal basis of C k , one gets Consider the special case O = Z r ⊗ Z −r . The identity Z r |φ p = |φ p+r gives For (β, γ) ∈ R 3 × R, let E(β, γ) = ψ(β, γ)|C|ψ(β, γ) denote the energy in the level-1 QAOA state, where the cost function Hamiltonian C is given by Eq. (24) with k = 3. Let γ ∈ R be fixed in the following. Let us write E(β) = E(β, γ). Here we consider the problem of finding max β E(β).