Finite-rate sparse quantum codes aplenty

We introduce a methodology for generating random multi-qubit stabilizer codes based on solving a constraint satisfaction problem (CSP) on random bipartite graphs. This framework allows us to enforce stabilizer commutation, $X/Z$ balancing, finite rate, sparsity, and maximum-degree constraints simultaneously in a CSP that we can then solve numerically. Using a state-of-the-art CSP solver, we obtain convincing evidence for the existence of a satisfiability threshold. Furthermore, the extent of the satisfiable phase increases with the number of qubits. In that phase, finding sparse codes becomes an easy problem. Moreover, we observe that the sparse codes found in the satisfiable phase practically achieve the channel capacity for erasure noise. Our results show that intermediate-size finite-rate sparse quantum codes are easy to find, while also demonstrating a flexible methodology for generating good codes with custom properties. We therefore establish a complete and customizable pipeline for random quantum code discovery.


Introduction
Quantum error-correcting codes are an essential prerequisite of reliable quantum computing. While codes like surface codes [10,18,30] and color codes [25,26] are sufficient to encode and protect a small constant number of qubits, their performance degrades rapidly when performing computations with more qubits [12,15]. While quantum error-correcting codes with asymptotically good performance have been known to exist for more than 20 years [7,14], these codes require measurement of operators whose weight increases with block size. This is a major limitation to efficient implementation of measurement circuits for these codes.

Stefanos Kourtis: stefanos.kourtis@usherbrooke.ca
More recently, Gottesman showed that finite-rate quantum low-density parity-check (LDPC) codes can be used to build constant-overhead fault-tolerant quantum computers [21]. Quantum LDPC codes are stabilizer codes with bounded-weight stabilizers. That is, they involve only bounded-weight measurements.
This seminal result together with the plethora of asymptotically optimal classical LDPC code constructions [19,28] led to a surge of research toward quantum LDPC codes inspired by their classical counterparts. This resulted in many quantum LDPC code constructions such as hypergraph product codes [31], homological product codes [11] and fiber bundle codes [24]. Finally, more than 20 years after the introduction of the first quantum error-correcting codes, Panteleev and Kalachev [27] introduced the first family of quantum LDPC codes with optimal asymptotic performance.
These constructions are based on homological products of classical or quantum error-correcting codes. While there is some flexibility in the choice of the input codes, these methods are hard to adapt to precise connectivity limitations of near- to mid-term quantum devices. Furthermore, they are generally more relevant for a larger number of qubits. For example, recent numerical studies of hypergraph product codes [22,33] involve a few thousand to hundreds of thousands of qubits. Other constructions, such as building codes from low-depth random circuits [13,23], have been proposed but, to our knowledge, no procedure generates the stabilizer operators directly while considering arbitrary physical limitations.
In this work, we introduce a methodology for the generation of an arbitrary number of stabilizer operators directly for any given number of qubits. We reformulate code generation as a constraint satisfaction problem (CSP) on random bipartite graphs of varying edge inclusion probability, whose vertices correspond to qubits and stabilizers. The constraints imposed correspond to desired code properties, including, but not limited to, stabilizer commutation, X/Z balancing, finite rate, sparsity, and maximum-degree bounds. The resulting CSP is akin to paradigmatic NP-complete problems like random k-SAT and random 2-coloring of k-uniform hypergraphs. Although these CSPs are hard to solve in the worst case, they often display a transition between a hard phase, where problem instances are hard to solve on average, and an easy phase, within which typical instances can be solved in polynomial time, as a function of some parameter [5,6]. This motivates us to ask whether our code generation CSP on random bipartite graphs also exhibits a transition to an easy phase as a function of some parameter, which in the present case we choose to be the edge inclusion probability of the sampled graphs.
To define our problem instances, we sample bipartite graphs at random with varying edge inclusion probability, define appropriate constraints on qubit and stabilizer vertices that ensure our desired code properties, and then solve the resulting instances using a state-of-the-art CSP solver. We discover a satisfiability threshold at a critical edge inclusion probability that decreases with increasing number of qubits. We therefore show that the CSP has an easy phase, whose extent grows with increasing number of qubits and within which we readily find finite-rate stabilizer codes. Furthermore, we find that the resulting codes are much sparser than the initial graphs. Going a step further, we show that we can guarantee code sparsity via maximum-degree constraints, at the expense of a computational overhead. Finally, we demonstrate that the codes we obtain using our methodology achieve the erasure channel capacity. We therefore establish a complete and customizable pipeline for random quantum code discovery that can be geared towards near- to mid-term quantum processor layouts.
The rest of this paper is organized as follows. In Section 2, we recall notions of quantum coding theory and stabilizer code construction. In Section 3, we introduce our methodology for the generation of stabilizer codes through solving a CSP on random bipartite graphs. In Section 4, we present an explicit construction of the constraints for CSS codes. Finally, we present numerical results on code discovery and near-optimal error correction performance for the erasure channel in Section 5.

Figure 1: The Tanner graph of the 9-qubit Shor code. Purple circles represent qubit vertices and green squares and diamonds represent respectively X and Z stabilizer generators. The top right stabilizer vertex corresponds to the $Z_7 Z_8$ operator and the bottom left vertex corresponds to the $X_0 X_1 X_2 X_3 X_4 X_5$ operator.

Stabilizer codes
Given the n-qubit Pauli group $P_n$, a stabilizer group is a commuting subgroup of $P_n$ that does not include the $-I$ operator. A stabilizer group $\mathcal{S}$ defines the stabilizer code [20]

$$\mathcal{C}(\mathcal{S}) = \{ |\psi\rangle \mid S|\psi\rangle = |\psi\rangle, \ \forall S \in \mathcal{S} \}. \quad (1)$$

That is, a stabilizer code is the common +1 eigenspace of the operators of a stabilizer group. We often define a stabilizer group using a set of generators $g(\mathcal{S}) = \{S_1, S_2, \ldots, S_m\}$ with $S_i \in \mathcal{S}$. A family of codes is a finite-rate family if there exists a constant $c > 0$ such that $(n-m)/n > c$ as $n \to \infty$. Finite-rate code families are crucial for achieving large-scale fault-tolerant quantum computation since they allow encoding a constant ratio of logical qubits to physical qubits with vanishing error rate. This is in contrast with zero-rate code families such as surface codes, which can only encode a small constant number of logical qubits without significantly degrading error-correcting performance [12].
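As a concrete illustration of the commutation requirement on stabilizer groups, the standard test is that two n-qubit Pauli operators commute iff they anticommute on an even number of qubits, which is a symplectic inner product over GF(2) in the usual X/Z bit-vector representation. This sketch is ours, not code from the paper:

```python
def pauli_to_bits(pauli):
    """Map a Pauli string like 'XIZY' to its (x, z) bit vectors."""
    x = [1 if c in 'XY' else 0 for c in pauli]
    z = [1 if c in 'ZY' else 0 for c in pauli]
    return x, z

def commute(p1, p2):
    """True iff the two Pauli strings commute: the symplectic inner
    product (parity of positions where they anticommute) is zero."""
    x1, z1 = pauli_to_bits(p1)
    x2, z2 = pauli_to_bits(p2)
    return sum(a * d + b * c for a, b, c, d in zip(x1, z1, x2, z2)) % 2 == 0
```

For example, `commute('XX', 'ZZ')` holds (two anticommuting positions), while `commute('XI', 'ZZ')` does not (a single anticommuting position).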
A common class of stabilizer codes are Calderbank-Shor-Steane (CSS) codes [14,29]. These are codes for which there exists a set of generators such that each of them is a product of only I and X or only I and Z. While the techniques introduced in this work are applicable to both CSS and non-CSS stabilizer codes alike, in what follows we focus mainly on CSS codes as this simplifies the presentation of our methodology. We discuss the case of non-CSS stabilizer codes in Sec. 6.
Below we make use of the graphical representation of stabilizer codes based on Tanner graphs, illustrated in Figure 1. A Tanner graph T = (Q∪g(S), E) is a bipartite graph that contains edge {q, S} ∈ E if and only if the stabilizer generator S acts as X, Y or Z on qubit q. The degree of a vertex is its number of neighbors. In error-correction terms, the degree of a stabilizer corresponds to its weight and the degree of a qubit corresponds to the number of stabilizers acting on it.
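The Tanner-graph bookkeeping above can be made concrete for the Shor code of Figure 1. The sketch below (our own naming) builds the edge list from the well-known generators and reads off stabilizer weights and qubit degrees:

```python
# Z-type generators of the 9-qubit Shor code: weight-2 checks within
# each block of three qubits.
z_stabs = [(0, 1), (1, 2), (3, 4), (4, 5), (6, 7), (7, 8)]
# X-type generators: weight-6 checks across consecutive pairs of blocks.
x_stabs = [(0, 1, 2, 3, 4, 5), (3, 4, 5, 6, 7, 8)]

# Tanner graph edges {qubit, stabilizer vertex}.
edges = [(q, i) for i, qubits in enumerate(z_stabs + x_stabs) for q in qubits]

# Degree of a stabilizer vertex = weight of the generator.
weights = [sum(1 for q, s in edges if s == i) for i in range(8)]
# Degree of a qubit vertex = number of generators acting on it.
qubit_deg = [sum(1 for q, s in edges if q == j) for j in range(9)]
```

This recovers weights of 2 for the six Z generators and 6 for the two X generators, matching the vertices highlighted in the Figure 1 caption.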

Stabilizing edge coloring
To introduce our algorithmic approach to principled search for finite-rate stabilizer codes, we reformulate the task as a graph coloring problem. Our strategy is to start with a bipartite graph G = (Q ∪ g(S), E_0), called the support graph, then search for an edge coloring l : E_0 → {I, X, Y, Z} for which all stabilizers commute. We call such a coloring a stabilizing edge coloring. The result of this procedure is a Tanner graph T = (Q ∪ g(S), E), where an edge e ∈ E iff l(e) ≠ I, and hence E ⊆ E_0. Deciding whether there exists a stabilizing edge coloring for a given support graph G is equivalent to determining whether there exists a valid stabilizer code whose Tanner graph is a subgraph of G as defined above.
As formulated above, stabilizing edge coloring admits trivial solutions. For example, assigning the same color to all edges is always a valid solution, but is of little interest for quantum error correction. To avoid such trivial solutions, or other undesirable solutions, we must introduce additional constraints. We represent a constraint as a pair $(F, L)$ where $F \subseteq E_0$ and $L \subseteq \{I, X, Y, Z\}^{|F|}$. A coloring $l$ satisfies a constraint $(F, L)$ for $F = \{e_1, \ldots, e_{|F|}\}$ if $(l(e_1), \ldots, l(e_{|F|})) \in L$. We can now define the constrained stabilizing edge coloring problem we are interested in.
Definition 1 ((Constrained) stabilizing edge coloring). Given a bipartite graph $G = (Q \cup g(\mathcal{S}), E_0)$ and a set of constraints $\mathcal{F}$, decide whether there exists a stabilizing edge coloring of $G$ that satisfies every constraint in $\mathcal{F}$.

The problem is constrained whenever $\mathcal{F}$ is not empty. Table 1 presents estimates of $|\mathcal{F}|$ for the instances we study numerically. This defines a rich class of constraint satisfaction problems. In the following section, we provide concrete realizations of unconstrained and constrained stabilizing edge coloring which we solve to generate non-trivial CSS codes.

CSS codes from constraints on boolean variables
For CSS codes, we simplify stabilizing edge coloring by assigning Pauli values $l_p : g(\mathcal{S}) \to \{X, Z\}$ to stabilizer vertices and boolean values $l_a : E_0 \to \{0, 1\}$ to edges. Then, an edge $e \in E_0$ has color $l(e) = I$ when $l_a(e) = 0$ and color $l_p(e)$ otherwise. We say that an edge $e$ is active if $l_a(e) = 1$. In this representation, the active neighborhood of a stabilizer $S$ is the set of qubits $q$ for which $\{q, S\} \in E_0$ and $l_a(\{q, S\}) = 1$. Then, two stabilizers with different Pauli values commute if and only if the intersection of their active neighborhoods has even cardinality.
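The CSS commutation criterion above reduces to a one-line parity check on active neighborhoods. A minimal sketch (the sets below are hypothetical active neighborhoods, using the Shor-code generators of Figure 1 as an example):

```python
def css_commute(nbhd_x, nbhd_z):
    """An X-type and a Z-type generator commute iff their active
    neighborhoods intersect in an even number of qubits."""
    return len(set(nbhd_x) & set(nbhd_z)) % 2 == 0

# Shor-code examples: X stabilizer on qubits 0..5 against Z stabilizers.
assert css_commute({0, 1, 2, 3, 4, 5}, {7, 8})      # disjoint: overlap 0
assert css_commute({0, 1, 2, 3, 4, 5}, {0, 1})      # overlap of 2
assert not css_commute({0, 1, 2, 3, 4, 5}, {5, 6})  # overlap of 1
```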
To solve the stabilizing edge coloring problem numerically, we represent the coloring functions $l_p$ and $l_a$ with boolean variables. To each edge $e \in E_0$, we assign a boolean variable $v_a(e) \in \{0, 1\}$ such that $l_a(e) = v_a(e)$. To each stabilizer $S \in g(\mathcal{S})$, we assign a boolean variable $v_p(S)$ such that $l_p(S) = X$ if $v_p(S) = 1$ and $l_p(S) = Z$ otherwise. We call $v_a$ an activator variable and $v_p$ a Pauli variable. We also introduce some auxiliary boolean variables. The auxiliary variable $v_s(S, S') = 1$ when the action of $S$ and $S'$ is the same. The variable $v_e(S, S') = 1$ when $S$ and $S'$ have an even overlap. Finally, the variable $v_b(q, S, S') = 1$ when both edges $\{q, S\}$ and $\{q, S'\}$ are active.
We are now ready to introduce the boolean constraints that define the stabilizing edge coloring problem for CSS codes. In the rest of this section, we represent a constraint as a boolean function c : {0, 1} * → {0, 1} acting on a subset of the boolean variables v a , v p , v s , v e , and v b . An assignment x ∈ {0, 1} * satisfies a constraint if c(x) = 1. Boolean constraints allow us to represent the stabilizing edge coloring constraints of Definition 1.
Stabilizers $S$ and $S'$ commute if they satisfy at least one of the following conditions: they have the same non-trivial action ($X$ or $Z$) or, as discussed previously, they act non-trivially on an even number of common-neighbor qubits. We can thus write a commutation constraint $c_c(S, S')$ as

$$c_c(S, S') = v_s(S, S') \lor v_e(S, S'). \quad (2)$$

That the variable $v_s(S, S') = 1$ when the action of $S$ and $S'$ is the same is enforced by the constraint

$$v_s(S, S') \oplus v_p(S) \oplus v_p(S') = 1. \quad (3)$$

Similarly, to ensure that $v_e(S, S') = 1$ when $S$ and $S'$ have an even overlap, we add the constraint

$$v_e(S, S') \oplus \bigoplus_{q \in \eta(S) \cap \eta(S')} v_b(q, S, S') = 1, \quad (4)$$

where $\eta(\cdot)$ denotes the neighborhood of a vertex in the support graph, together with

$$v_b(q, S, S') = v_a(\{q, S\}) \land v_a(\{q, S'\}). \quad (5)$$

Equations (2) to (5) define a set of constraints on activator, Pauli and auxiliary variables. We draw the variables and constraints for a small example in Figure 2. When simultaneously satisfied, they ensure that the coloring function $l$ defines a stabilizing edge coloring. Thus, any variable assignment for which all constraints evaluate to 1 yields a valid CSS code. In Section 4.2, we present how we find such assignments.

Extra constraints for good codes
In this section, we introduce extra constraints to restrict the search to better codes. We represent these contraints using integer linear functions c : Z * → Z restricted to D ⊆ Z. That is, an assignment x ∈ Z * satisfies constraint c if c(x) ∈ D. We define these constraints by extending the boolean variables of the previous section to be integer variables.
We first add lower bounds on the number of stabilizers of each kind connected to a qubit. For each edge $\{q, S\} \in E_0$, we add a variable $v_X(\{q, S\})$ with value 1 if the edge is active and the corresponding stabilizer is of the $X$ type. This is enforced by the constraint

$$v_X(\{q, S\}) = v_a(\{q, S\}) \land v_p(S). \quad (6)$$

Then, we impose that the number of variables $v_X(\{q, S\})$ with value 1 for $S \in \eta(q)$ is at least $\delta_q$. That is, we add the constraint

$$\sum_{S \in \eta(q)} v_X(\{q, S\}) \geq \delta_q \quad (7)$$

for each qubit $q$. We use similar constraints to lower bound the number of $Z$ stabilizer generators per qubit. These constraints together with the commutation constraints are the first set of constraints we numerically study in the following section. Subsequently, we add more constraints to search for codes with improved decoding performance.
For each stabilizer $S \in g(\mathcal{S})$, we impose that the number of variables $v_a(\{q, S\})$ with value 1 for $q \in \eta(S)$ is at least $\delta_s$. Then, to keep the codes sparse, we impose that at most $\Delta_s$ edges per stabilizer are active. Both conditions are enforced by constraints akin to Equation (7).
Finally, we impose that the number of stabilizers of each kind is balanced by enforcing that

$$\sum_{S \in g(\mathcal{S})} v_p(S) = \lfloor m/2 \rfloor. \quad (8)$$
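The degree and balancing requirements above are easy to state as a validation routine on a candidate code. The following sketch uses our own (hypothetical) parameter names mirroring δ_q, δ_s, and Δ_s; it checks a finished assignment rather than encoding the constraints for a solver:

```python
def check_code(stabs, n, delta_q, delta_s, Delta_s):
    """stabs: list of (pauli, qubit_set) pairs with pauli in {'X','Z'}.

    Returns True iff every qubit touches at least delta_q stabilizers of
    each kind, every stabilizer weight lies in [delta_s, Delta_s], and
    the numbers of X and Z stabilizers are equal."""
    for pauli in 'XZ':
        for q in range(n):
            deg = sum(1 for p, qs in stabs if p == pauli and q in qs)
            if deg < delta_q:
                return False  # too few stabilizers of this kind on q
    for _, qs in stabs:
        if not (delta_s <= len(qs) <= Delta_s):
            return False      # stabilizer weight out of bounds
    n_x = sum(1 for p, _ in stabs if p == 'X')
    return n_x == len(stabs) - n_x  # balancing: as many X as Z
```

For instance, the two-qubit toy code `[('X', {0, 1}), ('Z', {0, 1})]` passes with `delta_q=1, delta_s=2, Delta_s=2` but fails once `delta_q=2` is demanded.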

Constraint satisfaction problem solver
To obtain CSS codes from the aforementioned constraints, we use the OR-Tools library [3]. We decompose the constraints of Equations (2), (5), and (6) into a constant number of OR constraints and we leave those of Equations (3) and (4) as XOR constraints. Both OR and XOR constraints, as well as the linear constraints used to enforce the minimum and maximum degree and balancing, are native to the library. We provide our implementation in an online repository [4].
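To illustrate what the solver is asked to do, the toy stand-in below exhaustively searches activator and Pauli assignments on a tiny support graph and returns the first stabilizing coloring whose stabilizers all meet a minimum weight (rejecting the trivial all-inactive solution). All names are ours; the actual experiments use OR-Tools CP-SAT on far larger instances where exhaustive search is hopeless:

```python
from itertools import product

def find_coloring(support, n_stabs, min_weight=1):
    """support: list of (qubit, stabilizer) edges of the support graph.

    Returns (activators, paulis) for the first stabilizing edge
    coloring whose stabilizers all have weight >= min_weight, or None."""
    for paulis in product((0, 1), repeat=n_stabs):        # 1 = X, 0 = Z
        for act in product((0, 1), repeat=len(support)):  # 1 = active
            # Active neighborhood of each stabilizer vertex.
            nbhd = [set() for _ in range(n_stabs)]
            for (q, s), a in zip(support, act):
                if a:
                    nbhd[s].add(q)
            if any(len(nb) < min_weight for nb in nbhd):
                continue  # reject too-sparse (incl. trivial) solutions
            # Commutation: same Pauli type, or even active overlap.
            if all(paulis[i] == paulis[j] or len(nbhd[i] & nbhd[j]) % 2 == 0
                   for i in range(n_stabs) for j in range(i + 1, n_stabs)):
                return act, paulis
    return None
```

On a complete bipartite support graph with 4 qubits and 2 stabilizer vertices, a weight-2 stabilizing coloring is found after scanning at most $2^2 \cdot 2^8$ assignments.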

Phase transition
To search for codes with n qubits and m stabilizer generators, we start by building random support graphs with the corresponding numbers of vertices using the Erdős-Rényi model [17]. That is, we sample the graphs with an edge inclusion probability of γ. We use $G_{n,m,\gamma}$ to denote the corresponding random bipartite graph generator. Our goal is to find sparse codes for given n, m, and γ, or obtain a proof that such codes are statistically unlikely.

Figure 3: Satisfiability phase diagram. For each pixel we generate 100 support graphs, then solve the corresponding stabilizing edge coloring instance for each graph. For each support graph, we run the CSP solver on four CPU cores running at 2.4 GHz for up to four hours. A green pixel indicates a combination of edge inclusion probability and number of qubits for which the solver is able to solve less than 10% of the instances within the allocated timeout. In the rest of the phase diagram, where more than 90% of the instances are solved for each combination of parameters, a blue pixel indicates more satisfiable than unsatisfiable instances, whereas an orange pixel indicates the opposite: more unsatisfiable than satisfiable instances.
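The support-graph sampling step can be sketched in a few lines; `sample_support_graph` is our own naming, assuming each of the n·m possible qubit-stabilizer edges is included independently with probability γ:

```python
import random

def sample_support_graph(n, m, gamma, seed=None):
    """Bipartite Erdos-Renyi sample G_{n,m,gamma}: each of the n*m
    possible qubit-stabilizer edges appears independently w.p. gamma."""
    rng = random.Random(seed)
    return [(q, s) for q in range(n) for s in range(m)
            if rng.random() < gamma]
```

For example, `sample_support_graph(10, 9, 1.0)` returns the complete bipartite graph (90 edges), while γ = 0 returns no edges.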
One can see that if $E \subseteq E'$ and $G = (V, E) \in P$, where $P$ denotes the property of admitting at least one stabilizing edge coloring, then $G' = (V, E') \in P$. Thus,

$$\Pr[G_{n,m,\gamma} \in P] \leq \Pr[G_{n,m,\gamma'} \in P] \quad (9)$$

when $\gamma \leq \gamma'$, and there exists a threshold function $\gamma^*(n)$ such that [9]

$$\lim_{n \to \infty} \Pr[G_{n,m,\gamma} \in P] = \begin{cases} 0 & \text{if } \gamma(n)/\gamma^*(n) \to 0, \\ 1 & \text{if } \gamma(n)/\gamma^*(n) \to \infty. \end{cases} \quad (10)$$

Our numerical results, discussed below, suggest this threshold is a non-increasing function of the number of qubits.

We start by imposing only the commutation constraints and a minimum qubit degree. That is, we impose $\delta_q = 3$, $\delta_s = 0$ and $\Delta_s = \infty$ for graphs with fixed ratio $m/n = 9/10$, yielding codes with rate 1/10. If no solution is found and no proof of unsatisfiability is produced within a timeout, we label the instance as unknown. We note that most instances are solved in a fraction of the timeout, except close to the threshold. In Table 1 we give estimates for the number and weight (in number of variables) of each type of constraint in the CSS stabilizing edge coloring for the Erdős-Rényi model.

Figure 3 illustrates that the threshold $\gamma^*$ decreases with the number of qubits. While we cannot definitively claim that the threshold does not start increasing for larger block sizes, we expect that finite-size effects are insignificant for larger numbers of qubits. This implies that $\gamma^*$ either decreases monotonically or plateaus at a value of at most 15%. The unknown region demarcates the parameter regime where typical instances of the CSP become hard to solve and our calculations time out. Crucially, Figure 4 shows that the densities $|E|/(mn)$ of the Tanner graphs of the codes go to zero as the number of qubits increases and are much lower than the edge inclusion probabilities of the support graphs. We observe similar results for different values of $m/n$.

We now restrict the problem further to the regime of quantum LDPC codes. That is, we impose $\delta_q = 3$, $\delta_s = 6$ and $\Delta_s = 20$ together with the stabilizer balancing constraint.
Figure 5 indicates that the threshold function is non-increasing in the number of qubits. We thus once again find a satisfiable phase that expands with increasing number of qubits and within which it is statistically likely that quantum LDPC codes exist and are easy to find.
We also investigated the impact of different degree bounds. However, due to a significant increase in required computing resources, we limited ourselves to much fewer samples, but we still observed similar results (see Appendix A). Also note that, in practice, it is not necessary to perform a large sampling if one is interested only in finding a single good code. Finally, although a maximum degree of 20 could be too large for some practical applications, the codes used in the decoding experiment of the following section have average degrees between 8.4 and 12.7.

Decoding experiment
We have shown numerical evidence that, even though it is generally hard to find commuting sets of random stabilizer operators, it is relatively easy to generate such a set from a random bipartite graph as long as the edge inclusion probability of the graph is above some threshold function. However, this result implies nothing about the decoding performance of the codes generated this way.
In this section, we probe the performance of the codes we find, using the erasure channel as suggested in [16,23]. The erasure channel is particularly useful since there exists a maximum-likelihood decoder that runs in polynomial time. In contrast, we do not yet have a good universal decoder for the depolarizing channel.
The single-qubit erasure channel erases a qubit with probability p by replacing it with a maximally mixed single-qubit state,

$$\mathcal{E}(\rho) = (1-p)\, \rho \otimes |0\rangle\langle 0| + p\, \frac{I}{2} \otimes |1\rangle\langle 1|,$$

where the second register indicates whether the qubit has been erased. The multi-qubit version of the channel can be written as

$$\mathcal{E}_n(\rho) = \sum_{e \in \{0,1\}^n} p^{|e|} (1-p)^{n-|e|} \frac{1}{|P_n(e)|} \sum_{P \in P_n(e)} P \rho P^\dagger \otimes |e\rangle\langle e|,$$

where $P_n(e)$ is the set of n-qubit Pauli operators with a trivial action on every qubit $q_i$ for which $e_i = 0$. Then, after a measurement of the second register identifying erasure positions, the channel reduces to a Pauli channel on the erased qubits where each error is equally likely. Therefore, from a measurement of the syndrome, a maximum likelihood decoder searches for any Pauli operator restricted to the erased region with the appropriate syndrome.
The success probability of this decoder is the inverse of the number of logical operators that cannot be moved out of the region of erased qubits by multiplying with a stabilizer. This probability can be computed by Gaussian reduction as described in [23].
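The Gaussian reduction underlying this decoder is ordinary elimination over GF(2) applied to parity-check matrices restricted to the erased qubits; how the resulting ranks combine into the success probability follows [23]. The core primitive can be sketched as follows (our own implementation, not the paper's code):

```python
def rank_gf2(rows):
    """Rank over GF(2) of a binary matrix given as a list of rows."""
    rows = [list(r) for r in rows]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a pivot row with a 1 in this column, below current rank.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # XOR the pivot row into every other row with a 1 in this column.
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank
```

For instance, the matrix [[1,0,1],[0,1,1],[1,1,0]] has GF(2) rank 2, since the third row is the XOR of the first two.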
The capacity, i.e. the maximum rate of information transmission, of the erasure channel [8] is computed from the probability of erasure. That is, the capacity of the channel with erasure probability p is $R_{\max} = 1 - 2p$. Inverting this relation, we observe that a code family with rate R can be used to protect information as long as the erasure probability is at most $(1-R)/2$. Figure 6 illustrates that the rate-1/10 codes obtained with our methodology achieve this limit. In other words, we have a procedure to construct random sparse stabilizer codes that are essentially capacity-achieving for the erasure channel.

Discussion and outlook
In this work, we reformulate the search for finite-rate sparse stabilizer codes as a constraint satisfaction problem, which we call stabilizing edge coloring, that is amenable to solution with state-of-the-art CSP solvers. We believe future work can exploit the connection between quantum error correction and constraint programming that we establish here in order to guide the search for new families of random stabilizer codes. We note that even though here we limit our analysis to CSS codes for the sake of simplicity, the method we introduce is easily adaptable to more general stabilizer codes. For example, one can assign an extra variable to each edge to represent its Pauli value instead of labelling the stabilizer vertices directly. Furthermore, the method is flexible in the sense that it is simple to incorporate constraints that take into account multiple factors on the same footing, including, for example, hardware limitations like qubit layout and connectivity.
In this work we had to strike a balance between computational resources and a reasonable timeframe for completion of numerical calculations. Targeted searches for larger codes within the satisfiable phase using more resources are a straightforward direction for future work. Impressive progress in the performance of solvers in the last decades [1,2] means that larger codes could soon be discoverable by our methodology with moderate resources. Customized CSP solvers that resolve the types of constraints involved in stabilizing edge coloring could also boost the search for codes with desirable properties.
It is also interesting to study different random graph constructions as input to CSP solvers instead of the uniformly sampled graphs we used. This could lead to sparser initial graphs with structure favorable to the discovery of good quantum codes.
Finally, we were able to use our approach to construct a family of finite-rate codes achieving optimal threshold for the erasure channel. This result, together with many recent stabilizer code constructions, motivates the search for more generally applicable quantum decoders for more complex noise channels such as the depolarizing channel and correlated Pauli noise. This would allow us to probe the performance of random code constructions more thoroughly.

Figure 7: Satisfiability phase diagrams with balancing constraints and different degree bounds. Each pixel is generated by sampling 10 support graphs; all parameters and details are the same as in Figure 3.

A Exploration of different degree bounds
In the main text, we argue that other choices of degree bounds lead to qualitatively similar results. To support that argument, we show in Figure 7 satisfiability phase diagrams for three other sets of degree bounds. Each diagram is generated using the same hardware and time limit as those of Figure 3 and Figure 5.
In all three scenarios, we observe a similar transition between the satisfiable and unsatisfiable phases. Due to computational restrictions, these diagrams are generated using ten samples per pixel. This induces a relatively large unknown region in Figures 7b and 7c, which are both obtained from CSP instances strictly harder than those presented in the main text. However, more samples and a longer time limit would narrow down both unknown regions. This would require significantly more computational resources and is left to further investigation.