Combining hard and soft decoders for hypergraph product codes

Hypergraph product codes are a class of constant-rate quantum low-density parity-check (LDPC) codes equipped with a linear-time decoder called small-set-flip (SSF). This decoder displays sub-optimal performance in practice and requires very large error correcting codes to be effective. In this work, we present new hybrid decoders that combine the belief propagation (BP) algorithm with the SSF decoder. We present the results of numerical simulations when codes are subject to independent bit-flip and phase-flip errors. We provide evidence that the threshold of these codes is roughly 7.5% assuming an ideal syndrome extraction, and remains close to 3% in the presence of syndrome noise. This result subsumes and significantly improves upon an earlier work by Grospellier and Krishna (arXiv:1810.03681). The low complexity and high performance of these heuristic decoders suggest that decoding should not be a substantial difficulty when moving from zero-rate surface codes to constant-rate LDPC codes, and give a further hint that such codes are well worth investigating in the context of building large universal quantum computers.


Introduction
It is imperative to make quantum circuits fault tolerant en route to building a scalable quantum computer. The threshold theorem [1,20,21] guarantees that it will be possible to do so using quantum error correcting codes which encode information redundantly. This redundancy serves as a buffer against errors but we need to be mindful of the trade-offs involved as the number of qubits we can control in the laboratory is limited. A relevant figure of merit to quantify this trade-off is the overhead, defined as the ratio of the number of qubits in a fault-tolerant implementation of a quantum circuit to the number of qubits in an ideal, noise-free implementation.
Low-density parity-check (LDPC) codes are a natural class of codes to consider for implementations. They are families of stabilizer codes $\{\mathcal{C}_n\}_n$, with parameters $[[n, k, d]]$, such that every stabilizer generator acts on a constant number of qubits and every qubit is involved in a constant number of generators [16]. Quantum architectures that do not satisfy these conditions would probably be very difficult to scale up, for instance because of the difficulty of extracting a syndrome fault-tolerantly. A stronger restriction asks for quantum LDPC codes with geometric locality, where interactions only concern neighboring qubits in a 2- or 3-dimensional setup, but it is well known that this requirement severely restricts the ability of these codes to store information [3]. On the other hand, general quantum LDPC codes can display a constant encoding rate $k/n = \Theta(1)$, while maintaining a large minimum distance $d = \Omega(\sqrt{n})$. In a breakthrough paper, Gottesman exploited this favorable encoding rate and described a construction of fault-tolerant quantum circuits with constant space-overhead [17]. This means that if we consider an ideal circuit that processes $m$ qubits, then its fault-tolerant counterpart will only require $\Theta(m)$ qubits.
Constructing good LDPC codes is difficult because we need to balance two competing constraints: on the one hand, we want the weight of the stabilizers to be low, but on the other hand we want the stabilizers to commute. Tillich and Zémor [35] proposed a construction called the hypergraph product code which overcomes this difficulty (see also generalizations [23,37]). This construction takes two good classical LDPC codes and constructs a quantum LDPC code with parameters $k = \Theta(n)$ and $d = \Theta(\sqrt{n})$. In some sense, this construction generalizes the toric code and allows one to obtain a constant encoding rate while keeping the LDPC property as well as a large minimum distance. As shown by Krishna and Poulin, there exists a framework rich enough to perform gates fault tolerantly on this class of codes [25,24]. Devising low-complexity decoding for hypergraph product codes is arguably one of the main challenges in the field right now.
In [26], Leverrier et al. showed the existence of a linear-time decoder for the hypergraph product codes called SSF and proved that it corrects errors of size $O(\sqrt{n})$ in an adversarial setting. In [13], Fawzi et al. showed that the SSF decoder corrects with high probability a constant fraction of random errors in the case of ideal syndromes and later made these results fault tolerant by showing that this decoder is robust to syndrome noise as well [12]. To be precise, they showed that SSF is a single-shot decoder and that in the presence of syndrome noise, the number of residual qubit errors on the state after decoding is proportional to the number of syndrome errors. These works yield a rigorous proof of existence of a threshold for this class of codes, but only provide very pessimistic bounds on the numerical value of the threshold. Beginning with the seminal work of [11], statistical mechanical models have been used to make indirect estimates of the threshold of quantum error correcting codes [2,9]. Exploiting these ideas, Kovalev et al. [22] showed that certain hypergraph product codes can achieve a relatively high threshold (approximately $7 \times 10^{-2}$) with the minimum-weight decoding algorithm. Such an algorithm is too complex to be implemented in practice for general LDPC codes, however.
We note that recent work by Panteleev and Kalachev investigated a quantum version of Ordered Statistical Decoding and obtained promising results for decoding small quantum LDPC codes [30].
Related work: Other families of quantum LDPC codes with constant rate include 2D and 4D hyperbolic codes: while the 2D version has a logarithmic minimum distance [15,10,5], the 4D hyperbolic codes satisfy $d = \Omega(n^c)$ for some $c > 0$ and can therefore be interesting for fault-tolerance [19,28,4,27]. Variants of these codes exhibit very good features and we compare our results to earlier works on hyperbolic codes.
Results and outline: In this paper, we present a new, efficient decoder for hypergraph product codes. We combine the SSF decoder with a powerful decoder for classical LDPC codes called belief propagation (BP). The resulting decoders boast low decoding complexity, while at the same time yielding good performance. The idea behind these algorithms is to first decrease the size of the error using BP, and then correct the residual error using the SSF decoder. This paper subsumes and considerably improves upon an earlier work by Grospellier and Krishna [18]. We first study the performance of SSF by itself, and then study the performance of the hybrid decoder Iterative BP + SSF. Our simulations use a simple error model, that of independent bit-flip and phase-flip errors. When compared to [18], the thresholds of codes are significantly improved (from 4.5% to roughly 7.5%). Furthermore, since the decoder is less demanding on the underlying quantum error correcting code, the weights of the stabilizers are reduced: the stabilizer weights drop from 11 to 7. We then extend this idea to decoding in the presence of syndrome noise. In this model, we assume the syndrome bits are flipped with some probability in addition to qubits being subject to bit-flip and phase-flip noise. We find that our codes perform well using a modified decoder called First-min BP + SSF. The results are compared to the toric code and to 2D and 4D hyperbolic codes.
In Section 2, we begin by providing some background and establishing our notation. We first review classical codes and discuss flip and the sum-product version of BP (simply referred to as BP), and then proceed to review hypergraph product codes, and why naive generalizations of classical decoders flip and BP fail. We introduce the SSF decoder and discuss how it overcomes these issues. Section 3 then presents some results of numerical simulations. Finally in Section 4 we discuss the results of simulations for faulty syndrome measurements.

Classical codes
In this section, we shall review aspects of classical LDPC codes pertinent to quantum LDPC codes. We begin by discussing the association between codes and graphs. We then proceed to discuss expander graphs and the decoding algorithm flip. Finally we discuss a particular version of belief propagation (BP) called the sum-product algorithm.
A classical code family $\{C_n\}_n$, where $C_n = \ker H_n$ is the binary linear code with parity-check matrix $H_n$, is said to be LDPC if the row weight and column weight of $H_n$ are upper-bounded respectively by constants $\Delta_C$ and $\Delta_V$ independent of $n$. The weight of a row (or column) is the number of non-zero entries appearing in the row (or column). In other words, the number of checks acting on any given bit and the number of bits in the support of any given check are bounded by constants independent of the block size. These codes are equipped with iterative decoding algorithms (such as belief propagation) which have low time complexity and excellent performance. Furthermore, they can be described in an intuitive manner using the factor graph associated with the classical code and for this reason these codes are also called graph codes.
The factor graph associated with $C = \ker H$ is the bipartite graph $G(C) = (V \cup C, E)$ where one set of nodes $V$ represents the bits (i.e., the columns of $H$) and the other set $C$ represents the checks (the rows of $H$). For nodes $v_i \in V$ and $c_j \in C$, where $i \in [n]$ and $j \in [m]$, we draw an edge between $v_i$ and $c_j$ if the $i$-th variable node is in the support of the $j$-th check, or equivalently if $H(j, i) = 1$ (rows of $H$ index checks and columns index bits). It follows that a code $C$ is LDPC if the associated factor graph has bounded degree, with left degree (associated with nodes in $V$) bounded by $\Delta_V$ and right degree bounded by $\Delta_C$.
Of particular interest are expander codes, codes whose factor graph corresponds to an expander graph. Let $G = (V \cup C, E)$ be a bipartite factor graph with $|V| = n$, $|C| = m$ and $n \geq m$. We use $\Gamma(c)$ to denote the neighborhood of the node $c$ in the graph $G$. This naturally extends to a set $S$ of nodes; $\Gamma(S)$ includes any nodes connected to nodes in $S$ via an edge in $G$. Furthermore, $\deg(c) = |\Gamma(c)|$ is the degree of a node $c$.
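The graph $G$ is said to be $(\gamma_V, \delta_V)$-left-expanding if for $S \subseteq V$, $|S| \leq \gamma_V n \implies |\Gamma(S)| \geq (1 - \delta_V) \Delta_V |S|$. Similarly, the graph is $(\gamma_C, \delta_C)$-right-expanding if for $T \subseteq C$, $|T| \leq \gamma_C m \implies |\Gamma(T)| \geq (1 - \delta_C) \Delta_C |T|$. It is a bipartite expander if it is both left and right expanding.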
In their seminal paper, Sipser and Spielman [33] studied expander codes and applied an elegant algorithm called flip to decode them. They showed that if the factor graph is a left expander such that $\delta_V < 1/4$, then the flip algorithm is guaranteed to correct errors whose weight scales linearly with the block size of the code. Furthermore, it does so in time scaling linearly with the size of the code block.
flip is a deceptively simple algorithm and it is remarkable that it works. We describe it here as it forms the basis for the quantum decoding algorithm SSF. Let $x \in C$ be a codeword and $y$ be the corrupted word we receive upon transmitting $x$ through a noisy channel. With each variable node $v_i$ in the factor graph, $i \in [n]$, we associate the value $y_i$. With each check node $c_j$ in the factor graph, $j \in [m]$, we associate the syndrome bit $s_j = \sum_{i : v_i \in \Gamma(c_j)} y_i \pmod 2$. We shall say that a check node $c_j$ is unsatisfied if its syndrome bit is 1 and satisfied otherwise. Note that if $y \in C$ is a codeword, then all the checks $c_j$, $j \in [m]$, must be satisfied. Informally, flip searches for a variable node that is connected to more unsatisfied checks than satisfied ones, and flips the corresponding bit. This reduces the number of unsatisfied checks. It is stated formally in Algorithm 1 in Appendix C (comments in blue).
The algorithm can be shown to terminate in linear time. For a detailed analysis, we point the interested reader to the original paper by Sipser and Spielman [33].
flip is not used in practice because it requires large code blocks to be effective [32]; instead we resort to BP. In what follows, we shall use the sum-product algorithm and use BP to refer to this algorithm. This algorithm is presented in Alg. 3.
BP proceeds iteratively with $T$ iterations (described in Alg. 4), each further broken down into two elementary steps. The first step (Alg. 5) involves variable nodes passing messages to checks and the second step (Alg. 6) reverses the direction, with check nodes passing messages to variable nodes. We introduce some notation to refer to these objects. We let:
1. $p$ be the error probability on the variable nodes,
2. $s_j$ be the syndrome value of the check $c_j$ (0 if satisfied, 1 otherwise),
3. $m^t_{v_i \to c_j}$ be the message sent from variable node $i$ to check node $j$ on iteration $t$,
4. $m^t_{c_j \to v_i}$ be the message sent from check node $j$ to variable node $i$ on iteration $t$,
5. $\lambda^t_i$ be the approximate log-likelihood ratio computed at iteration $t$ for the variable node $i$: $\lambda^t_i > 0$ if the $i$-th variable node is more likely to be 0 than 1.
On graphs with cycles, BP can only compute approximate values of the posterior probabilities. However, it turns out to be relatively precise when the length of the smallest cycle (the girth) is big enough. Thus the constraints of BP are weaker than those of flip and do not require expander graphs.
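To make the two message-passing steps concrete, here is a minimal sketch of syndrome-based sum-product decoding in the log-likelihood domain. It is not a transcription of Alg. 3 to 6: the function name `bp_decode`, the use of natural logarithms with the standard tanh rule, the flooding schedule and the numerical clipping are all assumptions of this example.

```python
import numpy as np

def bp_decode(H, syndrome, p, rounds):
    """Syndrome-based sum-product sketch; returns (hard decision, LLRs)."""
    H = np.asarray(H, dtype=int)
    m, n = H.shape
    prior = np.log((1 - p) / p)                 # lambda_i^0 > 0: "no flip" is likelier
    checks = [np.flatnonzero(H[j]).tolist() for j in range(m)]
    edges = [(j, i) for j in range(m) for i in checks[j]]
    c2v = {e: 0.0 for e in edges}               # messages m_{c_j -> v_i}
    v2c = {}                                    # messages m_{v_i -> c_j}
    llr = np.full(n, prior)
    for _ in range(rounds):
        # Step 1: variable -> check, leaving out what came in on the same edge.
        for (j, i) in edges:
            v2c[(j, i)] = llr[i] - c2v[(j, i)]
        # Step 2: check -> variable (tanh rule); the sign encodes s_j.
        for j in range(m):
            sign = -1.0 if syndrome[j] else 1.0
            for i in checks[j]:
                prod = np.prod([np.tanh(v2c[(j, k)] / 2)
                                for k in checks[j] if k != i])
                prod = np.clip(prod, -1 + 1e-12, 1 - 1e-12)  # keep arctanh finite
                c2v[(j, i)] = sign * 2 * np.arctanh(prod)
        # Approximate log-likelihood ratios lambda_i^t after this round.
        llr = np.full(n, prior)
        for (j, i) in edges:
            llr[i] += c2v[(j, i)]
    return (llr < 0).astype(int), llr

# Toy usage (hypothetical example): cyclic repetition code on 5 bits, error on bit 2.
H = np.array([[1 if i in (j, (j + 1) % 5) else 0 for i in range(5)]
              for j in range(5)])
guess, llr = bp_decode(H, syndrome=[0, 1, 1, 0, 0], p=0.05, rounds=3)
```

The dictionary-based messages favour readability over speed; a practical implementation would store messages per edge in arrays.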

Quantum codes
We now review the definition of the hypergraph product. We proceed to discuss quantum expander codes, and the decoding algorithm proposed by Leverrier, Tillich and Zémor called SSF. We then present some earlier results of numerical simulations from [18].
CSS quantum codes are quantum error correcting codes whose stabilizer generators are each either a product of Pauli-X operators (and identity) only or a product of Pauli-Z operators (and identity) only [8,34]. The hypergraph product is a framework to construct CSS codes starting from two classical codes [35]. The construction ensures that we have the appropriate commutation relations between the X and Z stabilizers without resorting to topology. If the two classical codes are LDPC, then so is the resulting quantum code. In general, the construction employs two potentially distinct bipartite graphs, but for simplicity, we shall only consider the product of a graph with itself here. Let $G$ be a bipartite graph, i.e., $G = (V \cup C, E)$. We denote by $n := |V|$ and $m := |C|$ the size of the sets $V$ and $C$ respectively.
This graph defines a pair of codes depending on which set defines the variable nodes and which set defines the check nodes. The graph $G$ defines the code $C = [n, k, d]$ when nodes in $V$ are interpreted as variable nodes and nodes in $C$ are interpreted as checks. Note that $m \geq n - k$ as some of the checks could be redundant. Similarly, the graph serves to define the code $C^T = [m, k^T, d^T]$ if $C$ represents variable nodes and $V$ the check nodes. Equivalently, we can define these codes algebraically. We say that the code $C$ is the right-kernel of a parity check matrix $H$ and the code $C^T$ is the right-kernel of the transpose matrix $H^T$.
We define a quantum code $Q = [[n_Q, k_Q, d_Q]]$ via the graph product of these two codes as follows. The set of qubits is associated with the set $(V \times V) \cup (C \times C)$. The set of Z stabilizers is associated with the set $(C \times V)$ and the X stabilizers with the set $(V \times C)$. Ref. [35] establishes the following:
Lemma 1. The hypergraph product code $Q$ has parameters $[[n^2 + m^2,\ k^2 + (k^T)^2,\ \min(d, d^T)]]$.
Naively generalized to the quantum realm, both flip and BP perform poorly [31]. Unlike the classical setting, we are not looking for the exact error that occurred, but for any error belonging to the most likely error class since errors differing by an element of the stabilizer group are equivalent. In the case of flip, there exist constant size errors (typically half a generator) for which the algorithm gets stuck, which implies that flip will not work well even in a random error model.
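The product of a graph with itself can be written down explicitly with Kronecker products. The sketch below is one common matrix form and is only meant as an illustration: the function names, the ordering of the qubits (first $V \times V$, then $C \times C$) and the orientation of the two blocks are assumptions, chosen so that Z stabilizers are indexed by $C \times V$ and X stabilizers by $V \times C$.

```python
import numpy as np

def hypergraph_product(H):
    """Return (HX, HZ) for the product of the factor graph of H with itself."""
    H = np.asarray(H, dtype=int) % 2
    m, n = H.shape
    I_n, I_m = np.eye(n, dtype=int), np.eye(m, dtype=int)
    # Columns: n*n qubits in V x V followed by m*m qubits in C x C.
    HZ = np.hstack([np.kron(H, I_n), np.kron(I_m, H.T)])   # rows indexed by C x V
    HX = np.hstack([np.kron(I_n, H), np.kron(H.T, I_m)])   # rows indexed by V x C
    return HX, HZ

def gf2_rank(M):
    """Rank over GF(2) by Gaussian elimination."""
    M = np.array(M, dtype=int) % 2
    rank = 0
    for col in range(M.shape[1]):
        if rank == M.shape[0]:
            break
        pivots = np.flatnonzero(M[rank:, col])
        if pivots.size == 0:
            continue
        pivot = rank + int(pivots[0])
        M[[rank, pivot]] = M[[pivot, rank]]          # move the pivot row up
        below = rank + 1 + np.flatnonzero(M[rank + 1:, col])
        M[below] ^= M[rank]                          # clear the column below the pivot
        rank += 1
    return rank

# Toy usage (hypothetical example): 3-bit repetition code, n = 3, m = 2, k = 1, k^T = 0.
H = np.array([[1, 1, 0], [0, 1, 1]])
HX, HZ = hypergraph_product(H)
n_Q = HX.shape[1]
k_Q = n_Q - gf2_rank(HX) - gf2_rank(HZ)
commute = not ((HX @ HZ.T) % 2).any()
```

For this toy input the numbers agree with Lemma 1: $n_Q = 3^2 + 2^2 = 13$ and $k_Q = 1^2 + 0^2 = 1$, and every X stabilizer commutes with every Z stabilizer.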
Overcoming the failure of flip: Leverrier et al. [26] devised an algorithm called small-set-flip (SSF) obtained by modifying flip. This algorithm is guaranteed to work on quantum expander codes, which are the hypergraph products of bipartite expanders. The algorithm is sketched out in Alg. 2 in Appendix C (comments in blue). For a detailed analysis of the algorithm, we point the reader to [26]. Note that this is not the full decoding algorithm: it has to be run separately for both X and Z type errors.
Let $\mathcal{F}$ denote the union of the power sets of all the Z generators in the code $Q$. For $E \in \mathbb{F}_2^{n^2 + m^2}$, let $\sigma_X(E)$ denote the syndrome of $E$ with respect to the X stabilizers. The syndrome $\sigma_X(E) \in \mathbb{F}_2^{nm}$ is defined as $H_X E$; the $j$-th element of this vector is 0 if and only if the $j$-th X stabilizer commutes with the error $E$. Given the syndrome $\sigma_0$ of a Z type error chain $E$, the algorithm proceeds iteratively. In each iteration, it searches within the support of the Z stabilizers for an error $F \in \mathcal{F}$ that reduces the syndrome weight. The case of X errors follows in a similar way by swapping the role of X and Z stabilizer generators.
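The search over $\mathcal{F}$ can be sketched by brute force, since each generator has constant weight. This is an illustration rather than Alg. 2: the function name `ssf_decode` is hypothetical, and the selection rule (pick the set $F$ maximizing the syndrome-weight decrease per flipped qubit) is the one we assume here, following the analysis in [26].

```python
import itertools
import numpy as np

def ssf_decode(HX, HZ, syndrome):
    """Small-set-flip sketch for Z errors; returns (correction, success)."""
    HX = np.asarray(HX, dtype=int)
    HZ = np.asarray(HZ, dtype=int)
    s = np.array(syndrome, dtype=int) % 2
    correction = np.zeros(HX.shape[1], dtype=int)
    while s.any():
        best_ratio, best_F = 0.0, None
        for generator in HZ:                         # supports of the Z generators
            support = np.flatnonzero(generator).tolist()
            for size in range(1, len(support) + 1):
                for F in itertools.combinations(support, size):
                    sF = HX[:, list(F)].sum(axis=1) % 2          # sigma_X(F)
                    decrease = int(s.sum() - ((s + sF) % 2).sum())
                    if decrease / size > best_ratio:
                        best_ratio, best_F = decrease / size, list(F)
        if best_F is None:
            break                                    # no small set lowers the syndrome weight
        correction[best_F] ^= 1
        s = (s + HX[:, best_F].sum(axis=1)) % 2
    return correction, not s.any()

# Toy usage (hypothetical example): product of the 3-bit repetition code with itself.
H = np.array([[1, 1, 0], [0, 1, 1]])
I3, I2 = np.eye(3, dtype=int), np.eye(2, dtype=int)
HZ = np.hstack([np.kron(H, I3), np.kron(I2, H.T)])
HX = np.hstack([np.kron(I3, H), np.kron(H.T, I2)])
error = np.zeros(13, dtype=int)
error[4] = 1                                         # single Z error on a bulk qubit
correction, ok = ssf_decode(HX, HZ, (HX @ error) % 2)
```

With generators of weight $w$, each pass examines $O(2^w)$ candidate sets per generator, which is a constant for LDPC codes; a linear-time implementation would additionally avoid rescanning generators untouched by the last flip.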
Ref. [26] proceeds to show that SSF is guaranteed to work if the graphs corresponding to classical codes are bipartite expanders. They prove the following (Theorem 2 in [26]): if $G$ is a sufficiently good bipartite expander whose parameters remain constant as $n$ and $m$ grow, then the decoder SSF for the quantum code $Q$ obtained via the hypergraph product of $G$ with itself runs in time linear in the code length $n^2 + m^2$, and it decodes any error of weight below a bound, given explicitly in [26], that scales as the square root of the code length.
Overcoming the failure of BP: While BP can be adapted to decode quantum LDPC codes, it does not perform very well. The most common behaviour when BP fails at decoding quantum LDPC codes is that it does not converge: the likelihood ratios of some nodes keep oscillating. This can be explained by the existence of some symmetric patterns in the Tanner graph which prevent BP from settling on a precise error. To circumvent this, Poulin and Chung suggested some workarounds [31] such as fixing the value of some qubits whose probabilities keep oscillating, or running BP again on a slightly modified Tanner graph where we randomly change the initial error probability of one of the qubits linked to an unsatisfied check. The idea behind both of these solutions is to break the symmetry of the code. While these workarounds do bring improvements, the results are still far from the performance of BP in the classical case.
Another approach to improve the performance of BP is to feed its output to a second decoder, with the hope that it will converge to a valid codeword if BP cannot. This idea was recently investigated by Panteleev and Kalachev [30] who considered a quantum version of the Ordered Statistical Decoding algorithm OSD [14]. This algorithm was imported from the classical case where it is either used alone or after BP. The idea of this algorithm is to sort the different qubits by their log-likelihood ratios, a measure of their reliability, before proceeding with a brute-force approach. Once OSD has sorted the qubits, it enumerates all valid corrections on the $w$ least reliable qubits, where $w$ is some tunable parameter, and then chooses the most probable of these valid corrections (or fails if there are none). If $w$ is proportional to the block length, the time complexity is no longer polynomial. Instead we can use the OSD-0 algorithm, which is a simplified version that reduces the error floor of BP. In practice, this appears to work almost as well as OSD-$w$. The time complexity is then polynomial, but may remain inappropriate for large codes.
In this work, we present some heuristic algorithms where the output of BP is fed to SSF. The idea behind these algorithms is to first decrease the size of the error using BP before correcting the residual error with the SSF decoder. In practice, if BP manages to sufficiently decrease the error weight, then SSF will often reach a valid codeword without making a logical error. We highlight that these hybrid algorithms have a time complexity far lower than that of OSD.

Ideal syndrome extraction
In this section, we study a decoding algorithm called Iterative BP + SSF. To this end, we consider hypergraph product codes subject to a simple noise model. We use classical codes generated with the configuration model, briefly described in Appendix B. We work with an independent bit-flip and phase-flip noise model, where each qubit is afflicted independently by an X error with probability $p$ and by a Z error with probability $p$. The advantage of studying such an error model with CSS codes is that it is sufficient to try to correct X errors only to understand the performance of the whole decoding algorithm. We focus here on ideal syndrome measurements. We will remove this assumption in the next section. To establish a baseline, we begin by describing the performance of SSF as defined in [26].
Grospellier and Krishna [18] studied the performance of SSF on quantum codes obtained as the hypergraph product of two (5, 6)-regular graphs and of two (5, 10)-regular graphs. Fig. 1 plots the logical error rate of these codes as a function of the physical error rate. In this context, the logical error rate refers to the word error rate (WER), i.e., the probability that any logical qubit fails.
Figure 1: Variation of word error rate (WER) with the physical error rate for hypergraph product codes formed as the product of $(\Delta_V, \Delta_C)$-regular graphs with the SSF decoder from [18]. Logarithms are base 10 throughout the paper. The error bars indicate 99% confidence intervals, i.e., approximately 2.6 standard deviations. (a) Codes obtained as the product of (5, 6)-regular graphs (encoding rate of 1/61 ≈ 0.016): we observe a threshold of roughly 4.5%. (b) Codes obtained as the product of (5, 10)-regular graphs (encoding rate of 0.2): we observe a threshold of roughly 2%.
In numerical benchmarks, we found a correlation between the performance of the classical codes under flip and the performance of the resulting quantum codes under SSF. The best among these codes were chosen as representatives for the quantum case and correspond to the different curves in the figure. The (5, 6)-regular codes have a threshold of roughly 4.5%, whereas the (5, 10)-regular codes have a threshold of roughly 2%. In this context, the threshold is the physical error rate below which we find that the logical error rate decreases as we increase the block size. The error bars represent the 99% confidence intervals, i.e., approximately 2.6 standard deviations. At first glance, it appears that the (5, 10)-regular codes perform much worse. However this can be attributed to a much higher encoding rate compared to the first code family (1/5 versus 1/61).
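As a rough illustration of how such error bars can be obtained, the sketch below estimates the WER from a number of Monte Carlo trials and attaches a 99% interval. The normal approximation to the binomial distribution and the function name are assumptions of this example; the text only states that the bars span approximately 2.6 standard deviations.

```python
import math

def wer_with_interval(failures, trials, z=2.576):
    """WER estimate and half-width of a normal-approximation confidence interval."""
    wer = failures / trials
    std_error = math.sqrt(wer * (1 - wer) / trials)   # standard deviation of the estimate
    return wer, z * std_error                         # z = 2.576 gives a 99% interval

# Hypothetical numbers: 50 decoding failures observed in 10,000 trials.
wer, half_width = wer_with_interval(50, 10_000)
```

The approximation degrades when very few failures are observed, which is the regime of the lowest points on a WER curve.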
Although these results are promising, we note that SSF by itself requires large block sizes before it becomes effective. This is unsurprising considering that its classical counterpart exhibits the same behaviour. As mentioned in the previous section, this shortcoming of flip is addressed in the classical case by using soft decoding such as BP instead. However, used naively, BP fails in the quantum realm. In practice, it fails at finding a valid codeword, but still manages to get rather close in the sense that the syndrome weight can be reduced by an order of magnitude.
Our idea to exploit this property is to start the decoding procedure with BP and switch to SSF after a certain number of rounds. For such an approach to work for a noisy syndrome, we need to specify a criterion to switch between the two decoders. Here, however, we consider a noiseless syndrome extraction and can apply the following simple iterative procedure: try decoding the error with SSF only; if this does not work, then perform a single round of BP followed by SSF; if this still does not work, then perform 2 rounds of BP before switching to SSF; and so on until a codeword is finally found, or when a maximum number T max of BP rounds is reached. In the latter case, we say that the decoder failed. This heuristic defines the hybrid decoder Iterative BP + SSF which is presented in Alg. 7 in Appendix E.
We now make some remarks concerning the time complexity of this algorithm. As described in Alg. 7 in Appendix E, each iteration computes one additional round of BP. To avoid redundancy in our computations, we save the output of the last step of each block of BP rounds. When we proceed to the next iteration, we only need to compute the new round. In practice most of the computation time is due to SSF. In our simulations, we chose $T_{\max} = 100$.
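The control flow of this heuristic, including the reuse of saved BP messages, can be sketched as follows. This is not Alg. 7 itself: the callables `bp_step`, `bp_estimate`, `ssf` and `syndrome_of` are hypothetical interfaces, and we assume that SSF is run on the residual syndrome left after applying BP's hard decision.

```python
import numpy as np

def iterative_bp_ssf(syndrome, bp_step, bp_estimate, ssf, syndrome_of, state, t_max=100):
    """Return (correction, BP rounds used), or (None, t_max) if decoding failed."""
    syndrome = np.asarray(syndrome, dtype=int)
    for t in range(t_max + 1):
        if t > 0:
            state = bp_step(state)            # saved messages: only one new round is computed
        guess = bp_estimate(state)            # hard decision after t rounds (zero for t = 0)
        residual = (syndrome + syndrome_of(guess)) % 2
        fix, ok = ssf(residual)               # try to finish the job with SSF
        if ok:
            return (guess + fix) % 2, t
    return None, t_max

# Stub decoders (purely illustrative): the syndrome equals the error, "BP" recovers
# one more bit per round, and "SSF" succeeds only on residuals of weight <= 1.
target = np.array([1, 1, 1, 0])
correction, rounds = iterative_bp_ssf(
    syndrome=target,
    bp_step=lambda t: t + 1,
    bp_estimate=lambda t: np.array([1 if i < t else 0 for i in range(4)]) * target,
    ssf=lambda r: (r, r.sum() <= 1),
    syndrome_of=lambda e: e,
    state=0,
)
```

With these stubs the loop fails with SSF alone and after one BP round, then succeeds after two rounds, which mirrors the "0 rounds, then 1, then 2, and so on" schedule described above.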
This hybrid decoder significantly improves upon the earlier results using only SSF. Fig. 2 shows the variation of the WER versus the physical error rate using the hybrid decoder Iterative BP + SSF. The threshold appears to be at roughly 7.5%: below this value of physical error, the WER reduces as we increase the block size. The log-log plot facilitates extrapolation to low noise rates.
Interestingly, the better performance of the hybrid Iterative BP + SSF decoder compared to the SSF decoder also comes with additional features such as an increased encoding rate (from 1.6% to 4%) and a reduction of the stabilizer weight. The SSF decoder used in [18] indeed required classical codes generated from bipartite biregular factor graphs of degrees (5, 6). The resulting quantum codes therefore had qubit degrees 10 and 12, and stabilizer weight 11. With the hybrid decoder, it suffices to use classical codes whose bipartite biregular factor graphs have degrees (3, 4). The resulting quantum codes have qubit degrees 6 and 8, and stabilizer weight 7. This is surprising: Theorem 2 only guarantees performance of the SSF decoder if the graphs are sufficiently good expanders, which would require factor graphs with larger degrees than those we have considered. Our hybrid decoder seems to be able to get away with a much lower expansion, and therefore smaller degrees. This is important for physical implementations as higher degrees require more connectivity between different parts of the circuit.
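The degrees, weights and rates quoted above follow from simple counting, which the sketch below reproduces. It assumes that the classical parity-check matrix has full rank (so that $k = n - m$ and $k^T = 0$) and uses Lemma 1 for the rate; the function name is hypothetical.

```python
from fractions import Fraction

def product_code_figures(dv, dc):
    """Figures for the product of a (dv, dc)-biregular graph with itself (dc > dv)."""
    n, m = dc, dv                       # n * dv = m * dc edges, so n : m = dc : dv
    k = n - m                           # assumption: H has full rank, hence k^T = 0
    return {
        "qubit_degrees": (2 * dv, 2 * dc),        # qubits in V x V and in C x C
        "stabilizer_weight": dv + dc,             # one row and one column of the product
        "rate": Fraction(k * k, n * n + m * m),   # k_Q / n_Q from Lemma 1
    }

ssf_family = product_code_figures(5, 6)       # family used with SSF alone in [18]
hybrid_family = product_code_figures(3, 4)    # family used with Iterative BP + SSF
```

This gives rate 1/61 ≈ 1.6% with stabilizer weight 11 for the (5, 6) family and rate 1/25 = 4% with stabilizer weight 7 for the (3, 4) family.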
Lastly, the word error rate is improved by several orders of magnitude for a given block size. Compare the $[[24400, 400]]$ code generated from the (5, 6)-regular family in Fig. 1 and the $[[22500, 900]]$ code generated from the (3, 4) family in Fig. 2. The encoding rate is more than twice as large in the second case and the code performance is significantly better for a given noise rate. For instance at $p = 2\%$, the WER is $10^{-1}$ for the $[[24400, 400]]$ code but only $10^{-3}$ for the $[[22500, 900]]$ code.
Where do the (3, 4)-regular codes with Iterative BP + SSF stand with respect to other codes? In Table 1, we compare their performance with the toric code, the (4, 5)-hyperbolic surface code from [5], and the 4D hyperbolic code from [4]. We find that the (3, 4)-regular codes have a competitive threshold of roughly 7.5%, second only to the toric code. While their rate is not as good as that of [4], they have a higher threshold and lower stabilizer weights.

Dealing with syndrome noise
Although promising, the results of the previous section focus on an unrealistic problem since they assume perfect syndrome extraction. We now move on to the more relevant setting where the syndromes themselves are error-prone. In addition to independent bit-flip and phase-flip noise each occurring with probability $p$, each of the syndrome bits is independently flipped with the same probability $p$. We choose the same probability for qubit and syndrome errors for simplicity. Let us immediately note that we will not be able to use the Iterative BP + SSF decoder here since it requires knowledge of whether decoding has succeeded or not (i.e., whether the syndrome is null or not) in order to stop. In the case where the syndrome is noisy, there is in general no way to know whether all qubit errors have been corrected.
Analyzing the performance of decoding algorithms with a noisy syndrome is not as straightforward as in the noiseless syndrome case, and we will in particular need to adapt our metrics. We will follow the approach of Breuckmann and Terhal [5]. When the syndrome is itself prone to error, we do not expect the output of the decoding algorithm to be an error-free code state. We consider the following scenario, corresponding for instance to a quantum computation with $T$ layers of logical gates, and are interested in whether the final output is correct. At each of these $T$ time steps, the qubits are subject to noise (independent X-Z noise with error rate $p$) and we observe a noisy syndrome (the ideal syndrome, with each bit further independently flipped with probability $p$). After each time step, we use some efficient decoder $\mathrm{Dec}_1$ that returns some candidate error and apply the corresponding correction. After the $T$ steps, we want to verify whether we are close to the correct codeword. To this end, we perform error correction with the assumption that the syndrome can be noiselessly extracted. This is because we are typically interested in a classical result: we simply measure the qubits directly and compute the value of the syndrome from the outcomes (no need to measure ancilla qubits in the last round). We then perform a final decoding procedure with a potentially different decoder $\mathrm{Dec}_2$. We can then estimate the threshold as a function of $T$, and its asymptotic value corresponds to the so-called sustainable error rate [7]. The idea is that if the physical error rate is below that threshold, then one can perform arbitrarily long computations (or equivalently increase the lifetime of encoded information) by increasing the block length of the quantum codes.
As alluded to above, Iterative BP + SSF is not a valid option for $\mathrm{Dec}_1$ since we would not know when to stop the decoding in general. We have experimentally tried a number of heuristics for $\mathrm{Dec}_1$ and the one that performed the best is the First-min BP decoder (described in Alg. 8 in Appendix E). This decoder simply implements BP and terminates when the syndrome weight stops decreasing. To be precise, it performs a round of BP, computes the estimated error and checks whether correcting for this error leads to a decrease of the syndrome weight. If so, it continues with another round; otherwise, it returns the guess made at the previous round. We have numerically considered a number of variations for the stopping criterion but this one was consistently the best option. Another possibility that we investigated for $\mathrm{Dec}_1$ is to add SSF after First-min BP. This however leads to worse performance (as can be observed in Fig. 3) and increases the decoder complexity.
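The stopping rule of First-min BP can be sketched as follows. This is an illustration, not Alg. 8: the callables `bp_step`, `bp_estimate` and `syndrome_of` are hypothetical interfaces, and we assume that the guess available before any BP round is the one returned when the very first round fails to lower the syndrome weight.

```python
import numpy as np

def first_min_bp(syndrome, bp_step, bp_estimate, syndrome_of, state, max_rounds=100):
    """Run BP until the residual syndrome weight stops decreasing; return the last good guess."""
    syndrome = np.asarray(syndrome, dtype=int)
    best_guess = bp_estimate(state)                   # guess before any BP round
    best_weight = int(((syndrome + syndrome_of(best_guess)) % 2).sum())
    for _ in range(max_rounds):
        state = bp_step(state)                        # one more round of BP
        guess = bp_estimate(state)
        weight = int(((syndrome + syndrome_of(guess)) % 2).sum())
        if weight >= best_weight:
            break                                     # first minimum reached: stop here
        best_guess, best_weight = guess, weight
    return best_guess

# Stub BP (purely illustrative): a fixed sequence of estimates, syndrome equals the error.
estimates = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1]]
result = first_min_bp(
    syndrome=[1, 1, 1, 1],
    bp_step=lambda t: t + 1,
    bp_estimate=lambda t: np.array(estimates[t]),
    syndrome_of=lambda e: e,
    state=0,
)
```

In the stub run the residual weights are 4, 3, 2, 3, so the decoder stops at the fourth estimate and returns the third, even though a later estimate would have been better: the rule commits to the first local minimum.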
Having described the algorithm, we now discuss what parameters are fed to First-min BP. Recall from Alg. 3 that BP is initialized with prior information on how likely it is for each qubit to have been flipped. This is done by specifying the log-likelihood ratios (LLRs) $\{\lambda_i^0\}_{i=1}^{n}$ for every qubit. If the syndromes are perfect, this algorithm proceeds to update these LLRs over several iterations. To adapt BP to a fault-tolerant setting where it is employed $T$ times, we need to specify (1) how to initialize the LLRs at each round, and (2) how to process potentially incorrect syndrome information.
In the fault-tolerant setting, qubits are subjected to i.i.d. noise only for the first round. From the second round onwards, the noise is complicated and no longer Markovian; in addition to the potential noise at a given round, the state of the qubits depends on the results of the corrections in all the previous rounds. We make a simplifying assumption: at every round, BP is only fed approximate LLRs. The LLRs are computed as if the qubits were subject to i.i.d. bit-flip and phase-flip errors with rate $p$, thereby ignoring all sources of noise from the previous rounds. For each qubit, these LLRs correspond to a probability $1 - p$ of not being flipped, and $p$ of being flipped.
Secondly, although BP was described for perfect syndromes, it can easily be adapted to take the syndrome error into account. We can simulate a Tanner graph with noisy checks by modifying the original Tanner graph. For each check node in the graph, we add a unique variable node which represents whether or not the check is erroneous. We highlight some useful properties of this construction. These nodes do not create cycles, and therefore should not affect the performance of BP. They are treated like any other variable nodes, making this procedure easy to implement. Since they are linked to only one node, they always send the same message. In particular, this adaptation of BP to the noisy syndrome case does not increase its time complexity.
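In matrix terms, adding one degree-one variable node per check amounts to appending an identity block to the parity-check matrix. The sketch below illustrates this on a toy matrix; the function name and the choice of giving the new nodes the same prior LLR $\log((1-p)/p)$ as the qubits (natural logarithm, syndrome error rate $p$) are assumptions of this example.

```python
import numpy as np

def augment_with_syndrome_nodes(H):
    """Append one degree-one variable node per check: H -> [H | I_m]."""
    H = np.asarray(H, dtype=int)
    return np.hstack([H, np.eye(H.shape[0], dtype=int)])

# Toy usage (hypothetical example).
H = np.array([[1, 1, 0], [0, 1, 1]])
H_aug = augment_with_syndrome_nodes(H)

qubit_error = np.array([0, 1, 0])
syndrome_error = np.array([1, 0])
noisy_syndrome = (H @ qubit_error + syndrome_error) % 2

# The noisy syndrome is the ideal syndrome of the joint error on the augmented graph.
joint_error = np.concatenate([qubit_error, syndrome_error])
augmented_syndrome = (H_aug @ joint_error) % 2

p = 0.03                                               # same rate for qubits and syndrome bits
priors = np.full(H_aug.shape[1], np.log((1 - p) / p))  # assumed initial LLRs for all nodes
```

BP can then be run unchanged on `H_aug` with the observed noisy syndrome; the first $n$ entries of its output estimate the qubit error and the last $m$ entries estimate which syndrome bits were flipped.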
For Dec 2 on the other hand, the Iterative BP + SSF decoder would be admissible in principle since we assume that the syndrome is perfect at the last step. For convenience, we chose to implement First-min BP + SSF instead, but either option is expected to yield the same threshold. A detailed review of the performance of First-min BP + SSF is presented in Appendix A, and specifically in Fig. 4. The noisy-sampling algorithm which gives the prescription to estimate the sustainable error rate is described in Alg. 10 in Appendix F. For brevity, we denote by µ(p) the Bernoulli distribution with bias p, i.e., a random variable X drawn from this distribution satisfies Pr{X = 1} = p and Pr{X = 0} = 1 − p.
Figure 3: Evolution of the threshold as a function of T (horizontal axis: T; vertical axis: threshold; curves: Dec 1 = First-min BP + SSF and Dec 1 = First-min BP). Surprisingly, the simpler procedure First-min BP performs better than First-min BP + SSF for Dec 1, with a sustainable error rate which seems to be around 3% (compared to 2.5%).
The simulations in the noisy syndrome case are far more time consuming than in the noiseless case since we need to plot the threshold as a function of T to estimate its asymptotic limit. To simplify the analysis, we again consider the independent X−Z noise model: as before, we only need to simulate X-errors. This simplification does not affect the threshold. Fig. 3 presents an estimate of the sustainable error rates for these algorithms. The "asymptotic" values seem to converge around 2.5% when Dec 1 = First-min BP + SSF and slightly above 3% when Dec 1 = First-min BP. A plausible explanation for the worse performance of the a priori better algorithm First-min BP + SSF is that BP does not manage to correct all syndrome errors, which is an issue for the SSF decoder.
As shown in Table 2, we obtain good results in comparison with the other main families of quantum LDPC codes. Indeed, the threshold is roughly the same as that of the toric code (2.9% versus approx. 3%). While the stabilizer weights of the toric code are lower, recall that it has a zero asymptotic rate. When compared to a positive-rate code such as the (4, 5)-hyperbolic codes, we cannot point to a clear winner: hyperbolic codes come with a better rate and lower stabilizer weights, but yield a lower threshold (1.3%).
We conclude by pointing out that the sustainable error rate does not tell the whole story. While not shown in Fig. 3, the WER is typically very high in the vicinity of the threshold. Similarly to the noiseless-syndrome case, very large codes are probably needed for the decoder to reach its full potential.

Conclusion
While previous studies showed that LDPC hypergraph product codes display good error suppression properties in the asymptotic regime, it is really the finite block length regime that matters for applications such as fault-tolerant quantum computation. We addressed this problem here by introducing new heuristic decoders for hypergraph product codes that combine soft information (BP) and hard decisions (SSF). Our main motivation was that BP typically fails to converge when applied to quantum LDPC codes and that SSF typically only performs well when the syndrome has low weight. Combining both decoders to let BP reduce the weight of the syndrome as much as possible, before turning to SSF to finish the decoding, leads to surprisingly good results. In the noiseless syndrome case, this combination yields an improvement from 4.6% to 7.5% for the threshold and much lower WER when compared to SSF alone, while at the same time relying on codes with higher encoding rate and lower stabilizer weights. In the noisy syndrome case, we studied a combination of BP and SSF where we use BP after each syndrome measurement to try to reduce the error and only rely on the hybrid decoder at the very last step of the procedure. The sustainable error rate that we observe in simulations is competitive with the toric code as well as hyperbolic codes.
LDPC codes are among the most versatile classical codes and come with efficient decoders with essentially optimal performance. For those codes, the hard-decision decoder flip was successfully replaced by decoders such as BP that exploit soft information. It is tempting to believe that the same approach should also work in the quantum case and that soft-information decoders will convincingly replace decoders such as SSF in the future. While our results are a first step in this direction, they also call for a better understanding of how to process soft information in the case of quantum LDPC codes.

A First-min BP + SSF
In the noisy syndrome setting, we cannot expect SSF to know whether it has succeeded and therefore cannot apply Iterative BP + SSF anymore. Rather, we want to find a heuristic criterion to stop BP after the right number of rounds, and then feed the result to the SSF decoder. The simplest possibility is to observe the evolution of the syndrome weight through the successive rounds of BP. This evolution is usually approximately periodic, displaying oscillations with the weight reaching a local minimum before increasing again.
We have empirically investigated several choices of stopping criterion, e.g., first minimum of the syndrome weight, global minimum in the first 100 rounds, and found that the best option was the first one. Stopping BP when the weight reaches its first minimum gives rise to our heuristic decoder First-min BP. The First-min BP + SSF decoder then corresponds to the case where the output of First-min BP is given to SSF. These algorithms are described in Alg. 8 and Alg. 9 in Appendix E. Fig. 4 shows the variation of the WER versus the physical error rate for independent bit-flip and phase-flip noise, and ideal syndrome measurements. It is interesting to note that while the WER is degraded compared to Iterative BP + SSF, the threshold behaviour is essentially identical for both decoding algorithms since they both yield a value around 7.5%. We had initially investigated thoroughly the First-min BP + SSF decoder but the error floor occurring when the physical error rate approaches 1% led us to switch to Iterative BP + SSF, for which this behaviour disappears. We found the good performance of First-min BP + SSF remarkable given the simplicity of the heuristic, and it would be worthwhile studying other variants to decide reliably when to switch from BP to SSF.
Figure 4: Word Error Rate (WER) as a function of the physical error rate for a hypergraph product code formed as a product of (3,4)-regular graphs with the First-min BP + SSF decoder (horizontal axis: physical error rate; vertical axis: log(WER); one curve per code). The threshold is similar to that of Iterative BP + SSF. The WER is worse, however, and displays odd patterns at low error rates.
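The first-minimum stopping rule can be illustrated on a recorded sequence of syndrome weights. This is a minimal sketch, not a transcription of Alg. 8: it assumes that "first minimum" means the last iteration before the weight strictly increases, and the name `first_min_index` is hypothetical.

```python
def first_min_index(weights):
    """Index of the first local minimum of the syndrome weights.

    weights[t] is the syndrome weight after BP iteration t. We stop at
    the last iteration before the weight strictly increases (assumed
    reading of "first minimum"); plateaus do not trigger a stop.
    """
    if not weights:
        raise ValueError("need at least one syndrome weight")
    for t in range(1, len(weights)):
        if weights[t] > weights[t - 1]:
            return t - 1
    # The weight never increased: keep the final iterate.
    return len(weights) - 1


# Oscillating weights: the first minimum is at iteration 2, even though
# a lower weight is reached later on.
stop = first_min_index([10, 7, 5, 6, 3])
```

In a decoder, the same test would be applied online so that BP halts as soon as the weight goes up, and the iterate at the returned index would be passed to SSF.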

B Code construction
We describe here how we construct the regular bipartite graphs we use in our simulations. Specifically we start with the configuration model and then apply some post-processing to increase the girth of the factor graphs. This algorithm generates graphs as follows:
1. We first create an 'empty' graph with the desired number of nodes. A vertex of degree ∆ has ∆ ports. In our case, check nodes will have ∆_C ports and variable nodes will have ∆_V ports.
2. We randomly draw edges between nodes by connecting a check port with a variable port. In the event we have two edges between a given pair of nodes, we randomly swap the double edges.
This algorithm could potentially generate bipartite graphs with small cycles. To avoid this, we use a post-processing algorithm that randomly swaps edges to increase the girth [29]. Although time consuming, this process yields good bipartite graphs.
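The port-matching steps of the configuration model can be sketched as follows. This is an illustrative sketch only: the function name, the `seed` and `max_swaps` parameters, and the particular swap rule (exchanging check endpoints with a uniformly random edge) are assumptions, and the girth-increasing post-processing of [29] is not included.

```python
import random


def configuration_model(n_var, deg_var, deg_chk, seed=None, max_swaps=10_000):
    """Random (deg_var, deg_chk)-regular bipartite graph as (var, chk) edges."""
    if (n_var * deg_var) % deg_chk != 0:
        raise ValueError("n_var * deg_var must be divisible by deg_chk")
    rng = random.Random(seed)
    n_chk = n_var * deg_var // deg_chk

    # Step 1: every node gets one port per unit of degree.
    var_ports = [v for v in range(n_var) for _ in range(deg_var)]
    chk_ports = [c for c in range(n_chk) for _ in range(deg_chk)]

    # Step 2: match variable ports to check ports uniformly at random.
    rng.shuffle(chk_ports)
    edges = list(zip(var_ports, chk_ports))

    # Remove double edges by swapping check endpoints with a random edge.
    # A swap keeps every degree unchanged.
    for _ in range(max_swaps):
        seen, dup = set(), None
        for i, edge in enumerate(edges):
            if edge in seen:
                dup = i
                break
            seen.add(edge)
        if dup is None:
            return edges
        j = rng.randrange(len(edges))
        (v1, c1), (v2, c2) = edges[dup], edges[j]
        edges[dup], edges[j] = (v1, c2), (v2, c1)
    raise RuntimeError("could not remove all double edges")


# Example: a (3,4)-regular graph on 40 variable nodes and 30 check nodes.
edges = configuration_model(40, 3, 4, seed=1)
```

A swap can occasionally create a new double edge, which is why the scan is repeated until the edge list is simple or the swap budget is exhausted.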
end
return E_i if σ_X(E_i) + σ_0 is zero and FAIL otherwise.

E Heuristics
Algorithm 7: Iterative BP + SSF
Input: Syndrome σ_0, maximal no. of BP iterations T_max
Output: Deduced error E if algorithm converges and FAIL otherwise.
is in the stabilizer group and FAIL otherwise