Quantum Local Search with the Quantum Alternating Operator Ansatz

We present a new hybrid, local search algorithm for quantum approximate optimization of constrained combinatorial optimization problems. We focus on the Maximum Independent Set problem and demonstrate the ability of quantum local search to solve large problem instances on quantum devices with few qubits. This hybrid algorithm iteratively finds independent sets over carefully constructed neighborhoods and combines these solutions to obtain a global solution. We study the performance of this algorithm on 3-regular, Community, and Erd\H{o}s-R\'{e}nyi graphs with up to 100 nodes.


Introduction
The Quantum Approximate Optimization Algorithm (QAOA) [1] is a hybrid quantum-classical algorithm for finding the approximate solution to combinatorial optimization problems. This hybrid approach first encodes the problem's objective function as a Hamiltonian whose ground state corresponds to the optimal solution. Then the classical and quantum processors work together within a variational loop to find the ground state. The classical computer runs an optimization algorithm which traverses the optimization landscape searching for the extrema. During the course of the optimization the quantum processor is used to evaluate the expectation value of the objective function.
For unconstrained combinatorial optimization problems the optimization is performed over the entire Hilbert space generated by the variational Teague Tomesh: ttomesh@princeton.edu Zain H. Saleem: zsaleem@anl.gov Martin Suchara: msuchara@anl.gov ansatz (i.e., a parameterized quantum circuit). A new ansatz, the Quantum Alternating Operator Ansatz (QAO-Ansatz), was proposed in [2,3] for solving constrained combinatorial optimization problems. This variational ansatz is designed in such a way that the constraints are satisfied at all times and the optimization is performed only over the space of feasible solutions.
Quantum Local Search (QLS) utilizes the QAO-Ansatz to find independent sets within small subgraphs (neighborhoods) whose size matches the capabilities of the quantum hardware. One of the main building blocks of the QAO-Ansatz is the mixing unitary which is defined up to a permutation of its components. QLS exploits this permutation freedom to search for optimal solutions within a neighborhood. The QLS algorithm draws on methods from classical local search [4] which are useful for problems where computing the global solution is intractable, but are amenable to decomposition into tractable subproblems. QLS constructs a global solution by iterating through many local subproblems and involves a dynamical update of the variational ansatz such that a constant amount of quantum resources are utilized. Local search strategies relying on graph partitioning have been previously applied in the quantum context, using the unconstrained variant of the QAOA, to optimization problems such as network community detection [5]. However, the algorithmic components introduced in this work: combining local search techniques (neighborhood initialization and ansatz construction) with the constraint-preserving QAO-Ansatz have not been previously studied.
The finite size of quantum processors necessitates the development of algorithms, such as local search, which expand the applicability of hardware to larger problems sizes. As is the case with classical computers, even when we have a large-scale fault tolerant quantum computer in hand we will wish to solve problems beyond their current capacity. In this work, we use QLS to find approximate solutions over local neighborhoods within a larger graph and combine these into a final, global solution. While a quantum speedup obtained via quantum approximate optimization remains elusive, there is evidence that classically sampling from the output distribution of the QAOA is a difficult task [6]. As the development of quantum optimization algorithms and quantum hardware continues to progress, these advancements may be incorporated within local search algorithms to continually expand the applicability of quantum computers.
We study the performance of the QLS algorithm on the Maximum Independent Set (MIS) problem [7]. MIS is one of the most widely studied constrained combinatorial optimization problems, in part, due to its broad applicability in a variety of domains and the fact that it is equivalent to other important problems such as minimum vertex cover and maximum clique on its complement graph. Specifically, our contributions include: 1. We introduce a method for constructing quantum circuits within a local neighborhood of a larger graph that are tunable to the size of the quantum hardware that is available. This tunability allows QLS to target graphs containing many more nodes than the number of qubits available in a particular quantum computer.
2. We simulate and analyze the execution of the QLS algorithm on 3-regular, Community, and Erdős-Rényi graphs -finding larger independent sets than those obtained with other classical and quantum algorithms.
3. All of the code implementing the algorithms considered in this work has been made publicly available online [8] and is also available upon request.
The remainder of the paper is organized as follows. In Section 2 we review the quantum approximate optimization algorithm and cover prior strategies for quantum constrained optimization. We introduce the quantum local search algorithm, provide its pseudocode, and an open source implementation [8] in Section 3. Section 4 provides the simulation results which compare the performance of QLS with other classical and quantum approaches. We study the impact of the free parameters within the QLS algorithm on runtime, performance, and quantum resources. Section 5 concludes and suggests future directions of this work.

Quantum Approximate Optimization
Hybrid variational algorithms, like the QAOA [1], solve optimization problems by iteratively searching through the solution space with the combined efforts of a classical and quantum computer. The classical processor runs an optimization routine and calls the quantum processor to evaluate the computationally difficult objective function. For combinatorial optimization problems such as MIS, the problem is defined on a graph with n vertices, and the graph-dependent classical objective function C(b) which we are looking to optimize is defined on n-bit The classical objective function can be written as a quantum operator diagonal in the computational basis: The expectation value of this objective function is measured with respect to the variational state, where |s is the state on which we act with unitary operators to build our variational ansatz. The ansatz in Eq. 2 is composed of two repeating parts: the phase separator unitary e iγ i C and the mixing unitary e iβ i M . The phase separator is a diagonal operator in the computational basis and typically takes the same form as the objective operator. The mixers are used to increase or decrease the amplitudes of different stateseffectively "mixing" the state of the current wavefunction. The variational parameters γ and β define the optimization landscape and correspond to the rotation angles of quantum gates within the ansatz.
For any variational state, the expectation value of C obj is evaluated on a quantum computer and then passed to a classical optimizer which attempts to find the optimal parameters that maximize E p (γ, β). Since the eigenstates of C obj are computational basis states, this maximization is achieved for the states corresponding to the solutions of the original combinatorial optimization problem.

Constrained Optimization: Maximum Independent Set
When applying variational algorithms to unconstrained optimization problems, every basis state is a valid solution and therefore the optimization takes place over the entire Hilbert space generated by the variational ansatz. In contrast, constrained optimization is restricted to those basis states which satisfy the problem specific requirements. Hybrid variational algorithms have been adapted to constrained optimization problems in two main ways. Either the objective function is modified to heavily penalize invalid basis states, effectively turning the constrained problem into an unconstrained one [9,10], or the variational ansatz is structured in a way that keeps the optimization within the valid subspace [2,3,11].
Maximum Independent Set (MIS) is an NP-Hard constrained combinatorial optimization problem defined on the graph G = (V, E) with nodes V , edges E and number of nodes n = |V | [7]. An independent set is defined as a subset V ⊂ V of the graph's nodes such that no two vertices in V share an edge. The goal of MIS is to find the independent set containing the largest number of nodes.
The Quantum Alternating Operator Ansatz (QAO-Ansatz) [2,3] is an example of an ansatz which imposes constraints at the quantum circuit level. The ansatz is constructed in such a way that we never leave the set of feasible states during the variational optimization (e.g. the set of all valid independent sets). For the MIS problem the objective function is the Hamming weight operator, and Z i is the Pauli-Z operator acting on the i-th qubit. Each vertex in the graph is assigned a value b i ∈ {0, 1} indicating whether it is excluded (0) or included (1) in the independent set.
The initial state |s for the variational optimization can be any feasible state or superposition of feasible states. Similar to the unconstrained QAOA, the phase separator unitary for the QAO-Ansatz, U C (γ) := e iγH , is constructed using the objective function. However, the mixing unitary U M (β) is non-trivial and requires multi-qubit gates for its execution.
and we have defined, where v j are the neighbors and is the number of neighbors for the ith node. We can also write our mixer as where we have usedb 2 v j =b v j . The unitary mixer above is a product of n partial mixers V i , in general not all of which commute with each other [V i , V j ] = 0. The partial mixers in Eq. 7 constitute multi-controlled X-rotations which can be implemented using multi-controlled Toffoli gates and controlled-X rotations. Examples of the phase separator, mixing, and partial mixer unitaries are shown in Fig. 1. The effect of applying a partial mixer V i (β) can be stated in words as: if all of node i's neighbors are in the |0 state (i.e. are not included in the current independent set), then rotate qubit i's state around the X-axis by an angle β.

Quantum Local Search
The Quantum Local Search (QLS) algorithm finds approximate solutions to the MIS problem on a graph G = (V, E) with n vertices by iteratively optimizing a variational ansatz over small neighborhoods within G. We give an outline of the QLS pseudocode in Alg. 1 and an implementation is available via Github [8]. local . The nodes in G 0 local are labelled to match the circuit shown in Fig. 1. After the variational optimization is complete, a new root node, n 1 is selected and will induce a new subgraph (red dotted circle). The size of the neighborhood N s can be scaled to match the problem instance or the available quantum resources.

Neighborhood Initialization
Prior to the start of the algorithm, initialize a solution bitstring S = {0} n to the all-zero state with length n; this will store the current global approximate solution and be updated throughout the course of the algorithm. Select a root node n 0 and its corresponding local subgraph where all the nodes in this subgraph are a node distance N s away from n 0 (see Fig. 2 for an example). The distance N s is a while not converged do 17 counts ← execute(U qls (β, γ) |s ); free parameter used to set the size of the neighborhood. This parameter should be set according to the density of the target graph such that the number of nodes m in the neighborhood does not exceed the number of qubits available in the quantum hardware. In this work, the qubits in the initial state |s of the neighborhood are initialized to match their current state within S, explicitly |s = i∈V local |S i . However, more interesting states could also be used while respecting the MIS constraint [12,13].

Neighborhood Ansatz Construction
For simplicity, the example ansatz construction covered here will be restricted to the case with depth parameter p = 1, but this method can be readily extended to p > 1.
composed of a single layer of the phase separator and mixing unitaries and produces a state to that used in the QAO-Ansatz. The mixer unitary is parameterized by a set of m angles β = (β 1 , β 2 , ..., β m ) corresponding to the partial mixers of each node in the subgraph.
Prior work, investigating the application of the QAOA to MaxCut problems, has shown that allowing the partial mixers to be uniquely parameterized in this way can lead to increased approximation ratio and reduced circuit depth at the expense of an increased number of variational parameters [14]. However, the QLS ansatz differs from that considered in [14] because the partial mixers do not commute with one another. This allows us to randomize over different permutations of the partial mixers to escape local minima.
Applying all m partial mixers at once may require an intractable amount of quantum resources, especially for larger neighborhoods. Instead, we use a hyperparameter N pm which sets the number of nonzero β i within β. With N pm set, we start by applying the partial mixers first to the central node n 0 , then to the nodes that are distance one away from the central node and then distance two and so on until we have either reached the quota of N pm partial mixers or exhausted the nodes in the neighborhood. Fig. 1 shows an example QLS ansatz corresponding to the neighborhood constructed in Fig. 2.
In the process of constructing U qls it is possible that we could find a high degree node within the neighborhood which has more neighbors than there are qubits available in the quantum hardware (or equivalently, exceeds the amount of allocated resources). We can handle this corner case by simply skipping the application of this node's partial mixer. This node can still participate in its neighbor's partial mixers as a control qubit.

Neighborhood Solution Search
Once the circuit construction is finished, we run the quantum approximate optimization algorithm with the goal of finding the parameters β and γ in Eq. 8 that maximize This will output a bitstring with a certain Hamming weight. Since the mixer unitary is defined up to a permutation of the partial mixers: (11) different permutations can return bitstrings with different Hamming weights. We can rerun this step r number of times, each time randomly choosing a different permutation of the partial mixers. This is a very useful strategy for escaping local minima which is only possible due to the definition of U M (β) up to a permutation -something which is not available to the typical implementation of the QAOA. From these r different rounds we select the bitstring with the largest Hamming weight, and update the solution bitstring S with the newly found local solution. This step is entirely parallelizable and multiple QPUs may be employed to simultaneously explore different permutations of the partial mixers.

Neighborhood Update
Once we have obtained an independent set on the neighborhood G 0 local we traverse the graph G by selecting a new root node. The root node n 1 , red highlighted node in Fig. 2, of the next neighborhood G 1 local , red dashed circle in Fig. 2, is randomly selected from the set of vertices that are on the edge of the current neighborhood, a distance N s away from n 0 .

Obtaining an Approximate MIS on G
We then repeat steps 1 through 4 starting with the new root node, and continue this process until all nodes have participated in the local search. Throughout the entire execution the bitstring S is continually updated with the solutions found on G i local and a global approximate solution over the full graph G. Once the entire graph has been traversed the bitstring is returned representing the final independent set I.

Methodology
We benchmark the performance of QLS, as well as additional classical and quantum algorithms, on three different graph types containing 20, 60, and 100 vertices. We consider d-regular, (p in , p out )-Community, and (n,m)-Erdős-Rényi graphs. Examples of these are shown in Fig. 3. We include the different graph types to obtain a clearer picture of algorithmic performance in a variety of circumstances.
Prior work on the QAOA has often focused exclusively on 3-regular graphs because their regularity allows for easier analysis of asymptotic performance [1]. However, in practice many graphs drawn from real-world data do not exhibit the rigid structure of d-regular graphs. For example, many social media or protein-protein interaction networks contain a community-like structure where subsets of nodes are densely connected with fewer edges existing between communities [15]. To capture this class of networks we include the community graphs generated by the planted_partition_graph() function within the NetworkX Python package [16].  This function takes as input the number and size of the communities as well as the probability of edges existing within and between communities. We consider (0.1, 0.02)-Community graphs with community sizes of 20 nodes and set the probability of an edge existing within a community to p in = 0.1 and between communities as p out = 0.02. Finally, we also include a class of random graphs known as Erdős-Rényi graphs which have been used in prior work to study the asymptotic performance of the QAOA on MIS problems [10]. We use the same construction as [10] for average degree Erdős-Rényi graphs where the number of nodes n and the average degree d are fixed and exactly dn 2 edges are randomly chosen from a uniform distribution. We consider (n, dn 2 )-Erdős-Rényi benchmark graphs with d = 3.

Keeping Simulations Tractable
All simulations were performed using the Princeton Ionic cluster [17] with a memory budget of 10GB of RAM per job. For each graph type and size n = 20, 60, 100 we generated 40 random in-stances to serve as the benchmarks. We chose the parameters of the community and Erdős-Rényi graphs such that each node has approximately three neighbors. This was done to ensure that the simulations of QLS remained tractable when increasing the size of the neighborhood and number of partial mixers utilized. By setting the graph parameters in this way, the circuits we encountered in our simulations never contained more than 25 qubits which kept the runtime and memory requirements manageable. However, despite this limitation, the benchmark graphs still display varying structure with a mix of high-and low-degree nodes as shown in Fig. 4.

Performance Metric
The typical measure of performance for MIS is the approximation ratio between the independent set found by an algorithm and the optimal MIS. Since finding the optimal MIS for large graphs quickly becomes intractable, we instead measure performance using the independence ratio defined as the size of the independent set returned by the algorithm (H(b) is the Hamming weight of the bitstring assignment of nodes in the independent set) divided by the size of the graph n [10]. We compare the performance of a number of different algorithms for MIS including quantum local search (QLS), classical local search (CLS), Boppana-Halldórsson [18], QAOA+ [10,13], and greedy random search. The implementations of each of these algorithms is provided in the following section. To mitigate the effects that our choice of classical optimizer has on the performance of the QLS and QAOA+ algorithms we repeat the evaluation of all algorithms five times on each benchmark graph and select the best run. These results are then averaged over all benchmark graphs.

Implementation
All of the code used to implement the MIS algorithms in this work have been made open source and are available online [8]. The implementation of the Boppana-Halldórsson algorithm is available via the NetworkX package [16]. Note that this algorithm evaluates the entire problem graph at once and does not contain any iterative local search component. Additionally, as a baseline for performance comparisons we implemented a randomized algorithm for MIS which randomly visits every node in the graph and greedily adds nodes to the independent set whenever possible.

Quantum Local Search
The QLS specification described in Algorithm 1 was implemented in Python with Qiskit v0.34.1 [19]. The quantum circuit simulations were performed via statevector simulation using the Qiskit Aer submodule. For the simulation results shown in this work we set the number of permutation rounds r = 3 and varied the neighborhood size N s = 2, 3, 4 and the number of partial mixers N pm = 2, 3, . . . , 10 while keeping the maximum circuit size below 25 qubits. This bound stems from our use of classical statevector simulation and the maximum N s and N pm values vary depending on the graph type.

Classical Local Search
We implemented a classical analogue of QLS which utilizes the same neighborhood initialization step described in Sec. 3. However, instead of finding an approximate MIS solution using a hybrid quantum-classical algorithm we apply the Boppana-Halldórsson algorithm to the full neighborhood. This implementation of the Boppana-Halldórsson algorithm has been slightly modified from the one provided in NetworkX to allow for the input of an initial independent set. This is useful in the local search context as the current neighborhood may contain nodes that are already within the independent set and so this information should be reflected in the Boppana-Halldórsson execution.
Additionally, the CLS algorithm requires an additional step after obtaining an approximate solution over the neighborhood to ensure that no adjacent nodes along the edge of the neighborhood were both added to the independent set. This is not an issue within QLS because while a node on the edge of the neighborhood (meaning it has at least one neighbor outside of the local neighborhood) may participate as a control qubit it does not have it's own partial mixer applied. In the CLS case, after a local solution is obtained we iterate through each node added to the indepen-  dent set and check whether any of its neighbors are also in the set. If so, we remove the current node from the independent set.

QAOA+
The QAOA was originally introduced within the context of unconstrained optimization problems such as MaxCut [1]. A modified version, known as QAOA+, was later adapted to constrained problems such as MIS by incorporating the independence constraint within the objective function [10]. The additional term added to the objective function of QAOA+ penalizes bitstrings which are invalid independent sets; effectively turning the constrained optimization into an unconstrained one. The invalid bitstrings output by QAOA+ can be pruned in an additional post-processing step. The implementation of QAOA+ used in this work is based on that given in [13].

Results
The performance of the quantum and classical algorithms over the course of the local search iterations is shown in Fig. 5. The data points indicate the average independence ratio across all benchmark graphs at the current iteration, and the shaded areas denote the standard error from the average value. Averaging the results in this way can result in sharp jumps at the final iterations, this is due to the fact that some executions may have terminated at an earlier iteration and therefore the average is performed over a smaller set of data. In Fig. 6 we remove the dependence on iteration number and report the final average  independence ratio to account for this visual artifact. The QLS algorithm was evaluated for a range of (N s , N pm ) parameter pairs, and their effect on the runtime can be seen in Fig. 5 (Fig. 7 also shows this dependence for a subset of the data). The value of N pm controls the number of partial mixers allowed within the ansatz during each iteration of the algorithm. As more quantum resources are utilized more vertices can be added to independent set during a single iteration (indicated by the slope of the lines in Fig. 5). The rate at which the independent set grows remains fairly constant during the early iterations of the local search, but begins to decline as more of the nodes are visited until finally the program halts.
The neighborhood size N s can impact this execution in a couple of different ways. For lower values of N s the neighborhood may only support a limited number of partial mixers and increasing N pm beyond this threshold no longer improves runtime. This is seen clearly in the 20 node, 3-regular graph simulations in Fig. 5 where for N s = 2 the lines for N pm = 4, 6, 8, and 10 all lie on top of one another, indicating that the local neighborhood is saturated with just 4 partial mixers. This effect also appears in Fig. 7c where increases in N pm do not result in decreases in iteration count.
Given a specific neighborhood size the exact threshold of partial mixers it can support depends on the graph type since higher degree nodes can yield more nodes within a single neighborhood. This is evident in the 20 node, Community and Erdős-Rényi data of Fig. 5 where, unlike the 3-regular case, runtime improvements can still be found between the (N s = 2, N pm = 4) and (N s = 2, N pm = 6) evaluations. Furthermore, for graphs that contain higher degree nodes larger circuit sizes are encountered even when using fewer partial mixers since a single highdegree node will translate into a partial mixer with many control qubits. Fig. 8 demonstrates this effect where 20 qubit circuits only appear during the course of (N s = 4, N pm = 10) QLS on 3-regular graphs, but are already present in (N s = 4, N pm = 6) QLS on Community graphs.
The tradeoffs between N s and N pm can have significant impacts on real-world implementations which must consider the limited qubit count of the quantum computer executing the QLS ansatz. Similarly, the selection of the (N s , N pm ) parameters can impact the tradeoff between quantum and classical computational resources: increasing N s and N pm means more of the graph can be explored at once using wider quantum circuits but requires optimizing over more parameters, on the other hand, smaller N s , N pm results in quantum circuits with fewer qubits and variational parameters but requires more iterations to fully explore the graph.
Additionally, for equal values of N pm , we see in   5 that increasing the neighborhood size can also improve the local search runtime by providing more freedom for the application of partial mixers during the neighborhood ansatz construction. In Fig. 5 this appears in the separation between same colored plots (indicating equal N pm ) with different sized neighborhoods (indicated by marker style).
In Fig. 6 we compare the average overall performance of the different algorithms. Across all graph types and sizes, QLS is able to find the largest independent sets. However, this does come at the cost of extra local search iterations. In Fig. 5 the CLS algorithm at each neighborhood size terminates after just a few iterations, this is because the local neighborhood search step of CLS is performed over the entire neighborhood during each iteration. Fig. 8 shows that the size of the neighborhoods that CLS optimizes over are on average much larger than those considered by QLS. This difference between the two local search strategies may account for the difference in their performance. The QLS approach is more conservative in the sense that while many qubits may participate in the local neighborhood optimization, many of those act only as control qubits, and only a subset of the qubits will be acted on by the R x (β) rotations that compose the partial mixers and therefore may be added to the independent set. Overall, this strategy requires more iterations of local search but results in larger independent sets when compared to CLS.

Conclusion and Future Directions
While this work has focused on MIS, the QLS approach with the QAO-Ansatz can be applied to many more constrained combinatorial optimization problems. The non-commutativity of the components of the mixing unitary within the QAO-Ansatz can be exploited by QLS for many of these problems. Additionally, while we have restricted our simulations to 100 node graphs, this technique is easily scalable to much larger problem sizes. Additionally, more efficient circuit simulation techniques will allow us to increase the size of the local neighborhoods considered during each iteration of QLS.
In this work, we evaluated the performance of QLS against greedy random search, Boppana-Halldórsson, local search utilizing Boppana-Halldórsson, and QAOA+ which are all efficient heuristic algorithms, but do not promise the best solutions that are obtainable via other available classical techniques. We demonstrated the applicability of QLS to large problem sizes, but there are still many improvements that can be made to the neighborhood solution search and neighborhood update steps by drawing upon the extensive work that has been done in the classical community on local search [20,21,22,4]. One advantage that QLS maintains over classical approaches to local search is its ability to exploit quantum entanglement to search the solution landscape within a neighborhood all-at-once instead of one-by-one.
Finally, implementations of the QLS algorithm on real quantum hardware will be critical for studying the impacts of noise and compilation on the algorithm's performance. The multiqubit gates required to implement this algorithm can be quite expensive when decomposed into the single-and two-qubit gates required by current superconducting and trapped-ion architectures [23,24]. Using well-known decompositions, an n-controlled X-rotation requires O(28n) CNOTs [25]. However, emerging quantum computer architectures, such as neutral atoms [26], are especially promising because they support the ability to natively implement multi-qubit operations with n > 2 controls. Recent experimental efforts have also been working toward support for larger multi-qubit operations in superconducting [27] and trapped-ion [28] architectures. More efficient decompositions are also possible by moving beyond two-level qubits and representing information with higher-dimensional systems such as qutrits. For example, an n-controlled unitary may be implemented using qutrits and O(3n) two-body entangling gates [29].
Local search methods are an effective means of expanding the reach of current quantum systems towards larger and larger problem sizes. Here we have demonstrated the effectiveness of combining techniques from classical local search with quantum approximate optimization for the MIS problem. Progress in both quantum algorithms and hardware is proceeding rapidly, and synthesizing these advancements into a practically useful application remains a challenging open problem.