Blueprint for a Scalable Photonic Fault-Tolerant Quantum Computer

Photonics is the platform of choice to build a modular, easy-to-network quantum computer operating at room temperature. However, no concrete architecture has been presented so far that exploits both the advantages of qubits encoded into states of light and the modern tools for their generation. Here we propose such a design for a scalable and fault-tolerant photonic quantum computer informed by the latest developments in theory and technology. Central to our architecture is the generation and manipulation of three-dimensional hybrid resource states comprising both bosonic qubits and squeezed vacuum states. The proposal enables exploiting state-of-the-art procedures for the non-deterministic generation of bosonic qubits combined with the strengths of continuous-variable quantum computation, namely the implementation of Clifford gates using easy-to-generate squeezed states. Moreover, the architecture is based on two-dimensional integrated photonic chips used to produce a qubit cluster state in one temporal and two spatial dimensions. By reducing the experimental challenges as compared to existing architectures and by enabling room-temperature quantum computation, our design opens the door to scalable fabrication and operation, which may allow photonics to leap-frog other platforms on the path to a quantum computer with millions of qubits.


Introduction
On the path to building a scalable fault-tolerant quantum computer, photonic technologies promise important advantages over other approaches. These include (i) the possibility of room-temperature computation, which allows for full miniaturization, mass manufacturing, the use of inexpensive off-the-shelf components, faster operation, and a more rapid scaling to large numbers of qubits by adopting existing silicon electronics and photonics technology; (ii) intrinsic compatibility with communication technology, which enables high-fidelity connections between multiple modules without noisy transduction steps required in other platforms; and (iii) flexibility in the choice of error-correcting codes, including both the mode-to-qubit encodings and high-dimensional qubit codes using the temporal degrees of freedom of light. These advantages motivate serious consideration of architectures for photonic quantum computation.
Current architectures for scalable and universal photonic quantum computing live on two extremes. The first type leverages the impressive scalability of continuous-variable (CV) entangled resource states to implement computation on discrete-variable (DV) information (specifically, qubits) encoded in bosonic modes [1,2]. While the type of CV resource required for this first approach can be produced deterministically and scalably, these architectures require DV resources also to be generated on-demand and deterministically, which imposes infeasible hardware requirements. The second type of architectures involves generating entangled resource states made entirely out of the bosonic qubits themselves [3][4][5][6][7]. Resources that exclusively comprise high-quality bosonic qubits are endowed with a degree of resilience to noise, but are more difficult to make, as either the scalable generation of the qubits or the operations required to combine them into larger states are probabilistic. In light of this, it is important to devise a scheme that combines the best of both worlds: using CV resources to ease the burden on the preparation of bosonic qubits, but retaining a sufficiently high concentration of bosonic qubits to ensure low-noise operations whenever the CV modes are consumed. Here we present such a scheme and analyze how the robustness to error depends on the relative concentration of bosonic qubits in the entangled resource state.

Overview of Architecture
Photonics is currently the only platform that enables building room-temperature, modular, and easily-networked quantum computers. The advantages of photonics are augmented by using qubits that are encoded into the state of light using a method proposed by Gottesman, Kitaev and Preskill (GKP). These so-called GKP qubits are a leading candidate for optical quantum computation because: (i) an important class of gates, operations, and measurements on these states can be performed with Gaussian resources, which are natively available and easy to implement on integrated photonic devices, and (ii) they are inherently robust to noise and optical losses. Moreover, computation with GKP qubits can be performed at room temperature, which makes them especially attractive for the scalable fabrication and operation of quantum computers.
Current proposals for quantum computation with optical GKP qubits rely on continuous-variable (CV) cluster states, which are entangled states of many modes of light. These states have been demonstrated experimentally on vast numbers of optical modes. Measuring a CV cluster state can be used to perform Gaussian operations, but non-Gaussian components are required for a fully-fledged, universal computer. This non-Gaussian component is provided by the GKP qubits, whose generation requires a resource beyond those already mentioned. Though the optical properties of integrated photonic platforms are insufficient to provide such a non-Gaussian resource, recent technological developments in photon-counting detectors can save the day.
Specifically, GKP qubits can be produced by Gaussian boson sampling (GBS) devices. These devices displace, squeeze, and interfere light (all Gaussian operations) and then guide it toward photon-counting detectors. When photons in all but one mode of the light are counted, the light in the unmeasured mode emerges in something approaching a GKP qubit, as long as a specific photon-number pattern is observed in the detectors. Although such a process is probabilistic, that is, conditioned on the observation of this pattern, many GBS devices can be run simultaneously to boost the likelihood of making a GKP qubit. But even with this approach, termed multiplexing, creating a GKP qubit with near certainty requires very many GBS devices. This requirement hinders existing photonic architectures, which require that GKP qubits be available on-demand.
We propose a scalable architecture for fault-tolerant measurement-based quantum computation that overcomes this severe limitation of GKP qubit production. Our method exploits a hybrid resource state comprising GKP qubits at some modes and squeezed states of light at others. Multiplexed GBS devices are still used to generate GKP qubits; however, when these devices fail, the mode is instead guaranteed to be prepared in a squeezed state. This mode becomes entangled with the others, as it would in a CV cluster state. Computation can still be performed on this squeezed-state mode, but now the number of GBS devices needed is no longer prohibitive.

Figure: Photonic quantum computation using hybrid resource states. A planar chip (top) generates a resource state for fault-tolerant quantum computation. The optical modes comprising the lattice are either GKP states of light (red dots) or squeezed light (blue dots); whenever the former is unavailable (its generation is probabilistic), the latter is guaranteed to be there. The light is measured at homodyne detectors (bottom), whose output is carefully decoded. Measurement settings are changed accordingly to perform measurement-based quantum computation.

Introducing Gaussian neighbours to the GKP modes of the cluster state leads to one complication, though. When a GKP mode is measured to perform a gate, its intrinsic structure helps reduce the noise in the quantum state through the well-known process of GKP error correction. But when squeezed-state modes are measured, a known amount of random noise is injected into the neighbouring modes, which might degrade the quality of the computation if not accounted for. To tackle this problem, we introduce a novel decoding procedure for the hybrid cluster state. Our decoder takes the noisy measurement values and uses the knowledge of the squeezed-state locations in order to produce better-informed qubit readout values. Then, usual qubit decoding techniques can be applied to correct any errors that arise in the computation.
Thus our architecture enables scalable fault-tolerant quantum computation with optically-generated GKP states or squeezed states of light. More than that, it uses a room-temperature, moderately-sized planar photonic chip, which drastically reduces the difficulties in scalable fabrication and running of the computer. Our planar architecture also satisfies a crucial requirement towards fault tolerance: ensuring that any noise does not extend beyond a qubit's neighbours. These advantages may allow photonic quantum computation to leap-frog other platforms in the quest to build a scalable fault-tolerant universal quantum computer operating on millions of qubits.
The first type of architectures aims to use a division of labor between Gaussian and non-Gaussian resources (see Table 1). The Gaussian resource is provided by easy-to-generate CV cluster states, which are multi-mode Gaussian states [8]. There has been substantial progress in designing and deterministically generating CV cluster states in one [9][10][11], two [12][13][14][15][16], and higher dimensions [17][18][19]. In each of these architectures, the quantum information is encoded in a bosonic qubit introduced by Gottesman, Kitaev and Preskill (GKP) [20]. Clifford circuits, which make up the majority of operations required for a fault-tolerant quantum computer, can be implemented via measurement-based quantum computation (MBQC) on the CV cluster state. By circumventing the need for an entangled resource state made entirely out of encoded qubits (as required by the second class of architectures), this approach partially alleviates the burden on the GKP state sources. However, a truly regular supply of GKP-encoded states is still required; they provide the necessary non-Gaussianity, implement non-Clifford gates, and correct CV errors. Thus far, prior work has required that such qubits can be supplied and coupled to the cluster state deterministically at regular intervals.
The second type of architectures includes the schemes developed for the cat-basis encoding [4,21], the GKP encoding [22,23], and the dual-rail encoding [24]. These approaches must contend with the non-deterministic generation of individual qubit states, particularly in the first two cases, where the states have a complicated structure. The last case has the added challenge of non-deterministic entangling or "fusion" gates, which are required to grow a cluster state. Each gate is eventually implemented by consuming probabilistically generated photons, which imposes formidable multiplexing requirements for cluster state generation, unlike schemes for generating CV cluster states.
Any reliance of either type of architecture on deterministic sources of optical GKP qubits is at odds with the current state of theory and technology. The numerous procedures for generating optical GKP states that have been proposed tend either to be non-deterministic, as they rely on post-selected measurements directly [25][26][27][28][29][30] or indirectly [31][32][33]; or require the experimentally challenging conditions of coherent interactions with matter [34,35] or extremely strong optical nonlinearity [35]. Recent advances in photon-number-resolving (PNR) detectors [36][37][38][39] have substantially improved the viability of the post-selection approach in the near term, with methods based on Gaussian boson sampling (GBS) [27][28][29][30] now within reach of state-of-the-art optical devices. Low-probability sources can be improved with the help of multiplexing at the cost of an increased overhead. That is, in order to generate states with near-unit probability $1 - p_0$, the number of required state generation devices scales as $\log(1/p_0)$, which is prohibitively large as $p_0 \to 0$.
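To make the multiplexing overhead concrete, the following is a minimal sketch under a simplifying assumption that is ours and not part of the architecture: each of N sources heralds a state independently with the same probability, so that all of them fail with probability (1 - p_success)^N. The function name `devices_needed` and the single-device probability used in the example are purely illustrative.

```python
import math


def devices_needed(p_success: float, p0: float) -> int:
    """Smallest N such that (1 - p_success)**N <= p0.

    Assumes N independent, identical heralded sources (an idealization).
    p0 is the tolerated probability that no source succeeds.
    """
    if not (0.0 < p_success < 1.0 and 0.0 < p0 < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(p0) / math.log(1.0 - p_success))


if __name__ == "__main__":
    # Illustrative single-device heralding probability of 2.1%.
    for p0 in (0.25, 0.1, 0.01, 0.001):
        print(f"p0 = {p0}: N = {devices_needed(0.021, p0)}")
```

The device count grows as log(1/p0), so tolerating a larger failure probability p0 reduces the number of sources that must be multiplexed.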
In this work, we propose an architecture for measurement-based quantum computing that possesses the advantages of CV-based schemes and yet is compatible with probabilistic GKP qubit sources. We consider a hybrid CV cluster state where each mode is substituted with a GKP qubit at random with probability $(1 - p_0)$ or, said the other way, a qubit cluster state where each node is substituted with a squeezed state with swap-out probability $p_0$. The precise state we consider is the lattice from the Raussendorf, Harrington, Goyal (RHG) model [40][41][42], but our scheme can readily accommodate other error-correcting codes. Our use of CV resources affords us an important alternative to existing approaches, wherein a qubit that failed to be produced must be erased from the lattice. Instead, we replace the no-show qubit with a squeezed vacuum state: it can still encode logical information (albeit not as well as a GKP state [43]) but has the distinction of being Gaussian and thus easily producible. (We cannot replace all modes with squeezed vacua, because that would negate the error-correction benefits of the encoded qubits [44].) This approach, one of the main innovations in this work, propels us beyond existing fault tolerance methods, such as those that rely on lattice renormalization to deal with defects [6,45,46,47]. To characterize the robustness of our architecture as a function of $p_0$, we perform Monte Carlo simulations of our architecture operating as a quantum memory. We observe a minimum required squeezing of 10.5 dB or a maximum tolerable swap-out probability of $p_0 \approx 0.236$; for an experimentally accessible squeezing value of 15 dB [48], our simulations suggest a swap-out threshold of $p_0 \approx 0.133$, which translates to substantially reduced multiplexing requirements for GKP generation. In part, these results stem from the tailored decoding procedures that we present, which allow us to perform fault-tolerant computation on our hybrid resource state.
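The random substitution described above can be sketched as follows. This is only an illustration, assuming each mode is swapped out independently with probability p_0; the function name `sample_hybrid_lattice` and the use of a plain 3D array in place of the actual RHG lattice geometry are our assumptions.

```python
import numpy as np


def sample_hybrid_lattice(shape, p0, seed=None):
    """Return a boolean array: True marks a GKP mode, False a squeezed-state mode.

    Each mode is independently a squeezed state with swap-out probability p0.
    """
    rng = np.random.default_rng(seed)
    return rng.random(shape) >= p0


if __name__ == "__main__":
    lattice = sample_hybrid_lattice((10, 10, 10), p0=0.133, seed=1)
    print("fraction of squeezed-state modes:", (~lattice).mean())
```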
A key feature of our architecture is that it promises full scalability to a large number of qubits, as required for fault tolerance. Working in one temporal and two spatial dimensions ensures that each mode traverses a path of constant optical depth that is independent of the number of qubits in the computer. This is in contrast to existing schemes for CV computation, where increasing the number of qubits requires either longer time delays [13,16], longer measurement integration times, or more precise spectral resolution [17,18]. Such an increase will tend to result in exponentially growing losses, so these architectures cannot be scaled indefinitely. Moreover, the components of our architecture can be arranged as a planar graph: each qubit is connected only to a small and constant number of neighboring qubits, and the layout of the computational chip consequently requires no 'swaps' or intersecting waveguides. This planar structure not only opens the possibility of scalable fabrication but also allows for preserving the local structure of the noise. An uncorrelated noise structure like the one enabled by our architecture is critical, as it allows exploiting the full machinery of fault tolerance.
Finally, our architecture poses modest experimental requirements as compared to other architectures for photonic quantum computing. This is because the individual modules involved in our architecture are specialized. Consider as an example the challenge of low-loss and fast reconfigurable optical switching in cryogenic conditions [49]. For our architecture, the state-generation modules are required to be low-loss, but not reconfigurable; the multiplexing modules pose less severe loss requirements and are more easily made programmable; and the computational modules allow for fairly lossy reconfigurable switches. Furthermore, the computational module allows for operation at room temperature and without requiring vacuum, thus promising favorable scalability in manufacture via modern lithographic fabrication processes with minimal change. Thus, a hybrid resource state for quantum computing along with its accompanying decoder and a scalable and hardware-friendly architecture that computes with this state are the main results of this work.
The paper is structured as follows. The inset of the second page provides an overview of the main results, and Section 2 provides the necessary background. Following this, the planar architecture is detailed in Section 3, and the method to implement quantum error correction, including the specialized decoder for the hybrid lattice, is presented in Section 4. Section 5 details the fault-tolerant logical-level quantum computation and Section 6 presents the fault-tolerance thresholds for our architecture. We discuss open challenges and technological advantages in Sections 7 and 8.

Background on Quantum Computation Using CV Systems
In this section, we first briefly introduce the field of photonic CV quantum computing; then, starting with Section 2.1, we present the relevant background for our architecture. The physical systems that our architecture computes with, modes of light, are infinite-dimensional. The generalization from a qubit to a qudit computational model, that is, from two-level to d-level quantum registers, is relatively straightforward. But the jump to formally infinite-dimensional systems introduces a few technical complications.
The original CV computational model was proposed in Ref. [50]. In analogy to the qubit and qudit stabilizer formalisms [51][52][53], CV quantum computation also possesses an efficiently simulable sub-theory, Gaussian computation [44]. Unlike in the discrete-variable case, however, attempting to encode data in a way that uses the full Hilbert space available to a bosonic mode is physically impossible. This is because infinite-dimensional data registers are extremely sensitive to noise. Since every mode can be expected to be exposed to (at least) weak noise sources, entangled modal states will be corrupted by high-weight errors that are beyond the capabilities of quantum error correction.
Nevertheless, consideration of the CV paradigm has been fruitful on at least two fronts: first, CV cluster states can be generated deterministically, on a large scale, and with constant-depth, local, linear-optical networks; second, the CV state space can house bosonic codes, which are rich families of wave-functions that can be used to encode discrete-variable quantum information with desirable properties such as robustness to decoherence and experimental convenience in the generation of ancillae and the implementation of logic gates and measurements. Using bosonic codes solves the above conundrum: encoded qubits convert high-weight weak noise sources to low-weight qubit-level noise that is compatible with conventional fault-tolerant architectures for quantum computation. The next section provides relevant background on bosonic encodings.

Qubits Encoded Into Bosonic Modes
Bosonic qubit encodings are two-dimensional subspaces within the infinite-dimensional Hilbert space of a bosonic mode. Good choices of this two-dimensional subspace allow for experimentally convenient ways of preparing the encoded qubit states, implementing desired unitary gates, and faithfully performing measurement readout. In some cases, the redundancy of the full infinite-dimensional Hilbert space can even be leveraged to detect and correct CV errors (random Gaussian displacements, rotations, and photon loss, to name a few) without destroying the encoded information. Examples of bosonic codes include GKP [20], dual-rail [3,54], cat [55,56], hypercat [57,58], binomial [59], and general rotation-symmetric codes [60].
For reasons we will describe, our architecture exploits the GKP encoding. For notational convenience we restrict our discussion to the square-lattice GKP encoding, but the results can be generalized to GKP states on other lattices. In their ideal form, the GKP qubit states $|0\rangle_{\rm gkp}$ and $|1\rangle_{\rm gkp}$ are defined as Dirac delta combs with a spacing of $2\sqrt{\pi}$ in position space [20]:

$$|\mu\rangle_{\rm gkp} = \sum_{n=-\infty}^{\infty} |(2n+\mu)\sqrt{\pi}\rangle_q, \qquad \mu \in \{0, 1\}.$$

The chief advantage of GKP states is that qubit Clifford operations map to CV Gaussian operations, which can be implemented deterministically and easily using linear optical elements, homodyne detection, and Gaussian states of light [20]. In Appendix B, we provide optical circuits for the application of GKP qubit gates in detail. Briefly, Pauli $X$ and $Z$ gates on GKP qubits correspond to displacements along $q$ and $p$ by $\sqrt{\pi}$, respectively. The Hadamard gate is simply a $\pi/2$ rotation in phase space, implementable by a phase-shifter. The qubit phase gate $\sqrt{Z}$ corresponds to a phase-space shear, which can be implemented by a single-mode squeezer sandwiched between two phase-shifters. The qubit CNOT and CZ gates correspond to CX and CZ gates in the CV domain, respectively, which in turn can be broken down into a pair of single-mode squeezers between two beam-splitters. Deterministic all-optical entangling gates are a distinct advantage of the GKP encoding over dual-rail encoding schemes [3]. Moreover, qubit Pauli measurements correspond simply to homodyne measurements, which operate at faster speeds and higher efficiencies than photon-counting detectors [48,61]. Finally, non-Clifford operations require a non-Gaussian resource. Unitary implementations of the $T$ gate can be achieved using a cubic-phase interaction, though this is difficult in practice [62] and does not perform well for finite-energy states [63]. As an alternative, $T$ gates can be performed through gate teleportation by preparation of a GKP magic state [20], which we also review in Appendix B.
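The correspondence between GKP Clifford gates and Gaussian operations listed above can be checked numerically in a small sketch. The conventions here are assumptions for illustration (quadrature ordering (q1, ..., qn, p1, ..., pn), a particular sign for the π/2 rotation, and hypothetical names such as `is_symplectic`); the optical circuits of Appendix B are not reproduced.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)


def omega(n_modes):
    """Symplectic form for the ordering (q_1..q_n, p_1..p_n)."""
    eye = np.eye(n_modes)
    zero = np.zeros((n_modes, n_modes))
    return np.block([[zero, eye], [-eye, zero]])


def is_symplectic(S):
    """Gaussian unitaries act on quadratures via S with S Omega S^T = Omega."""
    n = S.shape[0] // 2
    return np.allclose(S @ omega(n) @ S.T, omega(n))


# Hadamard <-> pi/2 phase-space rotation (sign convention assumed).
FOURIER = np.array([[0.0, -1.0], [1.0, 0.0]])
# Phase gate <-> shear: q -> q, p -> p + q.
SHEAR = np.array([[1.0, 0.0], [1.0, 1.0]])
# CZ on two modes: p1 -> p1 + q2, p2 -> p2 + q1, positions unchanged.
CZ = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 1, 1, 0],
               [1, 0, 0, 1]], dtype=float)


def gkp_peaks(mu, n_max):
    """Positions of the delta-comb peaks of the ideal GKP state |mu>."""
    return (2 * np.arange(-n_max, n_max + 1) + mu) * SQRT_PI


if __name__ == "__main__":
    print(all(is_symplectic(S) for S in (FOURIER, SHEAR, CZ)))
    # Pauli X: a displacement of q by sqrt(pi) maps |0> peaks onto |1> peaks.
    print(np.allclose(gkp_peaks(0, 3) + SQRT_PI, gkp_peaks(1, 3)))
```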
Using bosonic codes as the physical qubits for a fault-tolerant quantum computing architecture provides two tiers of protection from noise. The first comes from the bosonic code itself. GKP qubits possess a degree of intrinsic robustness to those bosonic noise channels that result in small displacements (relative to the $\sqrt{\pi}$ lattice spacing) in phase space. This includes weak levels of photon loss, the dominant error mode for quantum communication [64,65]. Any shift that is less than half the lattice spacing ($\sqrt{\pi}/2$) can be corrected by non-destructively measuring the GKP stabilizers, which are $2\sqrt{\pi}$ shifts in either position or momentum. This CV error-correction procedure outputs continuous syndrome data, which can be used to undo the displacement with the help of a decoder. Noise that leads to larger displacements can result in errors that are undetectable by measuring only the GKP qubit stabilizers. In the fault-tolerant regime, these larger displacements are much less likely to occur, and so occasional errors on GKP qubits can be corrected by applying the second layer of protection: a qubit quantum error-correcting code. Implementing these codes requires only Clifford gates and Pauli measurements, both of which are easy (Gaussian) for the GKP qubit encoding.
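As an illustration of the first tier of protection, the sketch below implements the simplest binning rule for a position-homodyne outcome: round to the nearest multiple of $\sqrt{\pi}$ and read the qubit value from its parity. This is a textbook-style rule assumed here for illustration, not the tailored inner decoder of Section 4.2, and the function name `bin_homodyne` is hypothetical.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)


def bin_homodyne(value):
    """Return (bit, residual) for a q-homodyne outcome on a GKP qubit.

    bit      : parity of the nearest multiple of sqrt(pi)
    residual : estimated displacement, always within +/- sqrt(pi)/2
    """
    n = int(np.round(value / SQRT_PI))
    return n % 2, value - n * SQRT_PI


if __name__ == "__main__":
    # A |1> peak at sqrt(pi), shifted by 0.5 < sqrt(pi)/2: read out correctly.
    print(bin_homodyne(SQRT_PI + 0.5))
    # A |0> peak at 0, shifted by 1.0 > sqrt(pi)/2: misread as 1, a
    # qubit-level error left for the outer code to correct.
    print(bin_homodyne(1.0))
```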
An essential part of any quantum error correction procedure is the decoder, which specifies the recovery operation that has to be applied for given syndrome data. The two-layer structure of the error correction described above requires two stages of decoding. The first of these stages translates continuous GKP-stabilizer syndrome data into the operations required to return to the GKP code subspace housed in each mode, up to qubit-level errors. It can also provide some detailed information about the relative likelihood of different discrete qubit-level errors [22]. The second stage maps syndrome data obtained from measuring the higher-level qubit-code stabilizers to a qubit-level recovery operation. To avoid confusion, we refer to the former as the inner decoder, and the latter as the outer decoder. Decoders that are tailored for our architecture are described in more detail in Sections 4.2 and 4.3, respectively.
Unfortunately, ideal GKP states are non-normalizable states with infinite energy. Many related methods can be used for defining finite-energy versions of these states [30,66]. A common approximation is to replace each delta function in the GKP wave-function with a Gaussian of width $\Delta$, in addition to an overall Gaussian envelope of width $1/\Delta$ that damps the peak weighting further from the origin [20]:

$$|\mu^{\Delta}\rangle_{\rm gkp} = N_\mu \sum_{n=-\infty}^{\infty} e^{-\Delta^2 [(2n+\mu)\sqrt{\pi}]^2/2} \int dq \, e^{-[q-(2n+\mu)\sqrt{\pi}]^2/(2\Delta^2)} \, |q\rangle_q,$$

where $N_\mu$ is a normalization constant. While finite-energy effects will be ever-present in any real implementation, the noise they introduce does not preclude GKP states from being useful for error correction and fault-tolerant quantum computation [1,64,67]. Next, we review a method for preparing approximate GKP states using Gaussian resources and PNR detectors. We follow this with a review of how GKP states interface with CV cluster states and qubit quantum error-correcting codes.
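The finite-energy approximation described above (Gaussian peaks of width Δ under a Gaussian envelope of width 1/Δ) can be evaluated numerically. The sketch below assumes this standard form, a truncation of the sum to |n| ≤ n_max, and normalization on a finite grid; the function name `gkp_wavefunction` and the grid parameters are illustrative choices.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)


def gkp_wavefunction(q, mu, delta, n_max=20):
    """Position wavefunction of an approximate GKP state, normalized on grid q.

    Peaks sit at (2n + mu) * sqrt(pi); each has width delta and is weighted
    by a Gaussian envelope of width 1/delta. The sum is truncated at n_max.
    """
    n = np.arange(-n_max, n_max + 1)
    centers = (2 * n + mu) * SQRT_PI
    envelope = np.exp(-0.5 * (delta * centers) ** 2)
    peaks = np.exp(-0.5 * ((q[:, None] - centers[None, :]) / delta) ** 2)
    psi = peaks @ envelope
    dq = q[1] - q[0]
    return psi / np.sqrt(np.sum(psi**2) * dq)


if __name__ == "__main__":
    q = np.linspace(-20.0, 20.0, 8001)
    dq = q[1] - q[0]
    delta = np.sqrt(0.1)
    psi0 = gkp_wavefunction(q, 0, delta)
    psi1 = gkp_wavefunction(q, 1, delta)
    # At finite delta the two logical states are only approximately orthogonal.
    print("overlap <0|1>:", np.sum(psi0 * psi1) * dq)
```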

Generating Bosonic Qubits via GBS
Recent theoretical and experimental breakthroughs have made post-selected schemes based on GBS the most promising candidates for generating non-Gaussian states. GBS devices are capable of preparing highly non-Gaussian states, including GKP qubits, contingent on the observation of a specific detection pattern in the PNR detectors. This preparation of non-Gaussian states of light, including single bosonic qubits, using GBS devices has been developed in Refs. [27][28][29][30], and we summarize the relevant portions briefly here.
GBS state preparation consists of sending $N$ displaced squeezed vacuum states into a general interferometer on $N$ modes, followed by PNR detectors on $N-1$ of the modes, as depicted in Fig. 1 [27][28][29][30]. The displacement, squeezing, and interferometer parameters, as well as the photon number pattern at the PNR detectors, can all be tuned so that the device can herald the desired high-fidelity non-Gaussian output state. This procedure exploits the non-Gaussianity of PNR detectors and can generate arbitrary logical single-qubit states for a variety of bosonic encodings, including the GKP and cat encodings. As the generation of the desired state requires a particular pattern of photon number detection outcomes to be observed, the generation is non-deterministic but heralded.

Figure 1: The emitted light from one output port is in a chosen non-Gaussian state subject to obtaining the correct click pattern $\{n_i\}$ at the PNR detectors connected to the remaining output ports. The double purple lines represent classical logic, which is used to trigger a switch on the emitted port. (Right) A simplified representation of a single GBS device.
As a concrete example, consider that small-scale GBS devices made up of 3-mode interferometers, two PNR detectors registering up to 7 photons, and three momentum-squeezed vacuum states with up to 12 dB of squeezing have the potential of producing $|0^{\Delta}\rangle_{\rm gkp}$ states with $\Delta^2 = 0.1$ ($\Delta_{\rm dB} = 10$ dB, where $\Delta = 10^{-\Delta_{\rm dB}/20}$) with a fidelity of 76% (92% and 96%) and a heralding success probability of 2.1% (0.4% and 0.1%, respectively) [30]. We see here a fidelity-probability trade-off for a fixed number of modes. Comparable results are observed for the preparation of finite-energy GKP magic states.
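The conversion between the peak width Δ and its value in decibels, as used in the example above, is a one-line calculation. The helper names below are hypothetical.

```python
import math


def delta_from_db(delta_db: float) -> float:
    """Peak width Delta for a given squeezing in dB: Delta = 10**(-dB/20)."""
    return 10.0 ** (-delta_db / 20.0)


def db_from_delta(delta: float) -> float:
    """Inverse conversion: dB = -20 * log10(Delta)."""
    return -20.0 * math.log10(delta)


if __name__ == "__main__":
    # 10 dB corresponds to Delta**2 = 0.1, matching the example in the text.
    print(delta_from_db(10.0) ** 2)
```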
Although high-fidelity state generation from a single GBS device is non-deterministic, these sources can be multiplexed to obtain higher rates of generation, as we detail in Section 3.1 for our architecture. Additional hardware resources, both in the individual GBS devices and in the multiplexing, can be used to increase the rates and fidelities of the generated states.

Measurement-Based Quantum Computing With Canonical CV Cluster States and GKP Qubits
Modes of light are not well suited to serving as stationary quantum data registers. Optical modes can interact with only a few optical elements before they must be measured or else lost. Fortunately, this constraint is compatible with the measurement-based model for quantum computing, where each quantum data register is entangled to a constant number of others and then measured at a detector, with the entire computation specified by which measurements are chosen. Here we review relevant details and terminology on CV photonic measurement-based quantum computation.
Measurement-based quantum computation in the qubit setting involves preparing an entangled resource state (most commonly, a cluster state [68]), and performing a sequence of single-site adaptive measurements [69]. These cluster states are specified by a graph; for each node a qubit is prepared in the $|+\rangle$ state and for each edge a CZ gate is applied. Cluster states are said to be universal resources if they enable universal quantum computation when given access to adaptive single-site measurements.
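The graph-based construction above can be verified on a small example. The following sketch builds the state vector of a qubit cluster state by brute force and checks the standard stabilizer condition, namely that X on a node times Z on each of its neighbours leaves the state unchanged. The function names and the four-node ring graph are illustrative assumptions; this is only feasible for a handful of qubits.

```python
import numpy as np


def cluster_state(n, edges):
    """State vector of |+>^n followed by a CZ gate on every edge."""
    psi = np.full(2**n, 2.0 ** (-n / 2))
    for idx in range(2**n):
        # Qubit 0 is the most significant bit, matching np.kron ordering.
        bits = [(idx >> (n - 1 - k)) & 1 for k in range(n)]
        phase_count = sum(bits[a] * bits[b] for a, b in edges)
        psi[idx] *= (-1) ** phase_count
    return psi


def stabilizer(n, edges, i):
    """K_i = X_i times Z on every graph neighbour of node i."""
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    Z = np.diag([1.0, -1.0])
    I = np.eye(2)
    nbrs = {b for a, b in edges if a == i} | {a for a, b in edges if b == i}
    ops = [X if k == i else Z if k in nbrs else I for k in range(n)]
    K = ops[0]
    for op in ops[1:]:
        K = np.kron(K, op)
    return K


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    psi = cluster_state(4, ring)
    print(all(np.allclose(stabilizer(4, ring, i) @ psi, psi) for i in range(4)))
```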
This paradigm can be generalized to the CV degrees of freedom present in a bosonic mode: CV cluster states are Gaussian entangled states that enable CV measurement-based quantum computation via local measurements [8]. The simplest of these are referred to as canonical CV cluster states [70], which are constructed by applying controlled-Z gates $e^{i\hat{q}\otimes\hat{q}}$ to momentum-squeezed vacuum states. CV cluster states with any graph can be generated on demand, since both the controlled-Z gates and the preparation of momentum-squeezed vacuum states can be implemented deterministically.
Ideal CV cluster states cannot be normalized and correspond to unphysical infinite-energy states. In a similar way to the GKP qubit, the approximate nature of physical CV cluster states can be captured by a finite-width Gaussian envelope structure of the state's position-space wavefunction:

$$\psi(\mathbf{q}) \propto \exp\left[\frac{i}{2}\mathbf{q}^{T} V \mathbf{q} - \frac{\epsilon}{2}\mathbf{q}^{T}\mathbf{q}\right],$$

where $V$ is a real symmetric adjacency matrix corresponding to the cluster state's graph and $\epsilon/2$ is the variance of each momentum-squeezed vacuum state in the momentum quadrature. The case of $\epsilon \to 0$ corresponds to the infinite-squeezing limit. Note that the CV controlled-Z gate $e^{i\hat{q}\otimes\hat{q}}$ is common to both the GKP qubit encoding and canonical CV cluster state generation. Therefore it is possible to generate a hybrid cluster state with nodes comprising momentum-squeezed states, GKP qubits, and their common CZ gates [1]. As we detail in the results sections, this is one of the key concepts that enables computation with our hybrid resource state.
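The Gaussian structure of a canonical CV cluster state can be illustrated through its covariance matrix. The sketch below assumes units with ħ = 1, writes the momentum variance of each squeezed vacuum as eps/2 (so the position variance is 1/(2 eps)), and uses the quadrature ordering (q1, ..., qn, p1, ..., pn); the function names are hypothetical. It checks that each nullifier, p_i minus the sum of V_ij q_j over j, has variance eps/2, which vanishes in the infinite-squeezing limit.

```python
import numpy as np


def cluster_covariance(V, eps):
    """Covariance matrix of a canonical CV cluster state with graph V.

    Starts from momentum-squeezed vacua and applies the CZ network,
    which acts on quadratures as q -> q, p -> p + V q.
    """
    n = V.shape[0]
    cov0 = np.diag(np.concatenate([np.full(n, 1.0 / (2.0 * eps)),
                                   np.full(n, eps / 2.0)]))
    S = np.block([[np.eye(n), np.zeros((n, n))],
                  [V, np.eye(n)]])
    return S @ cov0 @ S.T


def nullifier_variances(V, eps):
    """Variances of the nullifiers p_i - sum_j V_ij q_j."""
    n = V.shape[0]
    rows = np.hstack([-V, np.eye(n)])
    return np.diag(rows @ cluster_covariance(V, eps) @ rows.T)


if __name__ == "__main__":
    # Four-mode ring graph, chosen only as an example.
    V = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    print(nullifier_variances(V, eps=0.1))
```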

Cluster States and Fault Tolerance
That quantum information can be processed reliably in the presence of noise is a key achievement of quantum error correction [71,72]. Given a logical circuit to be implemented, the idea is to redundantly encode the information content of the logical qubits into larger collections of physical qubits and perform computation on these collections. If physical qubits of sufficient quality are available and if sufficiently precise operations and measurements can be performed on these qubits such that the noise strength remains below the threshold of the specific code used, then the logical quantum circuit can, in principle, be applied with arbitrarily high precision [73][74][75]. The study of these thresholds and the physical-qubit overheads associated with different error-correcting codes is an active area of research.
In practice, it is often desirable to choose a quantum error-correcting code capable of tolerating a high error rate and not requiring long-ranged connectivity between physical qubits. The surface code [76,77] is a commonly used code because it has both these properties, enjoying a high threshold of ∼ 1% [78][79][80] and only needing nearest-neighbor connectivity of qubits in two spatial dimensions. In addition, it is highly resistant to erasure errors, even up to 50% qubit loss [45]. However, optical quantum computing architectures are better suited to measurement-based implementations of quantum error-correcting codes. In the case of the so-called Calderbank-Shor-Steane (CSS) codes [81,82], this can be implemented using cluster states corresponding to foliated (i.e., layered) lattice sheets that implement CSS code stabilizer measurement gadgets [83].
Perhaps the best-studied cluster-state-based error-correcting code is the RHG lattice. In the foliated picture, it can be thought of as alternating so-called primal and dual sheets of 2D cluster states that encode the surface code. This special topology is what leads to fault tolerance: error detection and decoding involve multiple mutually entangled layers at a time. The RHG lattice serves as a good first candidate to study for our architecture because not only is it universal for MBQC, but it also has high fault-tolerant computational error thresholds (on the order of 1%). A full fault-tolerant computation can be performed on the RHG lattice with the help of the RHG scheme [40][41][42][84]. A schematic of the RHG lattice is presented in Fig. 2. The details of the full computation on this lattice (state initialization, gate application, measurement, and error correction) will be reviewed and adapted to our architecture in Sections 4 and 5.
Other lattices such as non-foliated lattices [85][86][87] or non-CSS codes [88] can also be considered, including possibly new lattice designs tailored to errors in CV quantum computing.

Bosonic Codes Concatenated With Topological Codes for Quantum Computation
The discussion in the previous section deals with cluster states composed of qubits, which can include qubits encoded in bosonic modes. This two-layer encoding can be seen as a bosonic code concatenated with a topological code, a subject of growing interest [7,22,23][89][90][91][92].
The main motivation here is that CV errors larger than those the inner bosonic code can handle are picked up and corrected by the outer qubit code.
A gate-based model for the concatenated GKP-surface code catering to a superconducting platform was considered in Refs. [7,89,91]. In these works, each GKP qubit in the surface code first undergoes a round of GKP error correction, after which the stabilizer measurements for the surface code are performed. The noise considered in Refs. [89,91] is a Gaussian classical noise channel applied to all GKP qubits and prior to all homodyne measurements in both the GKP error correction and the parity check measurements. Ref. [7] considers the same noise model but applied now to the circuit level, and explicitly includes error feedback. An improved threshold for the surface-GKP code was obtained in Ref. [90] by designing bias in the noise.
Measurement-based topological quantum computation using GKP qubits, compatible with a photonic architecture, was considered in Refs. [22,23]. The approach was to generate a 3D cluster state to implement a measurement-based analogue of the surface code using GKP qubits. The cluster state was generated using a post-selection (fusion-based) approach that is nondeterministic by nature. This work also introduced an analogue quantum-error-correction scheme, where the real-valued measurement outcome from the homodyne measurement was explicitly used in the decoding procedure. The errors considered in Ref. [22] are finite squeezing effects in the GKP state preparation along with a Gaussian random displacement noise. More realistic noise, such as loss in the entangling gates and the homodyne measurements, was also considered in Ref. [23]. Furthermore, the GKP state preparation is considered to be deterministic and the final state generated is a connected cluster populated only by GKP states.

An Architecture for Photonic Quantum Computing With Hybrid Resource States
This section describes the different components of our architecture. The architecture has information encoded in GKP qubits because of the advantages previously discussed. The encoded information is processed in a measurement-based setting using a hybrid resource state comprising GKP qubits and squeezed states using the components that we now describe.
The architecture comprises four modules: three modules together generate a computational resource state in one temporal and two spatial dimensions using a planar photonic chip, and the final module performs measurement-based quantum computing on this state. First, the state-preparation module generates high-quality GKP qubits, albeit with low probability. The multiplexing module boosts the qubit generation rates and, in the event of a qubit generation failure, substitutes in a momentum-squeezed vacuum mode. The computational module implements the deterministic entangling operations, thereby enabling universal and fault-tolerant measurement-based quantum computation. The final module, the photonic quantum processing unit (photonic QPU), performs homodyne measurements on the generated resource state in order to perform the required computation. We note that the first two modules are entirely dedicated to the preparation of single-mode states; entanglement and measurement are relegated to the third and fourth modules.
The resulting quantum state is a hybrid resource, made up of both GKP qubits (Pauli eigenstates and magic states) and squeezed-state (or CV) modes. This structure is compatible with the probabilistic nature of the encoded qubit sources; given access to hypothetical deterministic sources, a resource state could be constructed entirely out of GKP qubits. Though probabilistic sources of qubits can be treated through heralded erasure errors (modelled as the application of a maximal-strength depolarizing channel), we find a better strategy, namely to use readily available momentum-squeezed vacuum states as a substitute for missing GKP states. We call this replacement a 'swap-out'. Momentum-squeezed states preserve the entanglement structure provided by the CZ gates and do not introduce as much noise as qubit-level erasure channels. We describe the multiplexed state generation in Section 3.1, the computational module in Section 3.2, and the photonic QPU in Section 3.3.

Figure 3: Multiplexed state generation. Multiplexed GBS devices for increased rate of state preparation. The multiplexer consists of a binary tree of 2 × 2 switches that either implements an identity or SWAP gate on each optical mode, moving a successfully generated GKP state to the correct output port. If no GBS device produces a GKP state, we swap the output of the multiplexing device for a deterministically generated momentum-squeezed state (depicted by the ellipse on the bottom left). The right-hand side shows the simplified diagram (labelled MUX) for the hybrid quantum light source. Note that the classical information wire is suppressed.

Multiplexed GBS Devices for High-Probability GKP Generation
GBS devices (see Fig. 1) can be exploited as probabilistic sources of GKP qubits, as reviewed in Section 2.2. The success probability of these sources can be boosted at the cost of increased overhead by using multiplexing, i.e., redundantly running multiple sources in parallel. For a fixed required fidelity, the generation rate for GKP states can be boosted to an arbitrary desired probability 1 − p_0 by using spatial multiplexing, as illustrated in Fig. 3. Specifically, if a single GBS device can prepare a GKP qubit with probability p_GBS, we need that

$$\left(1 - p_{\mathrm{GBS}}\right)^{N_{\mathrm{GBS}}} \le p_0,$$

where N_GBS is the number of GBS devices. Multiplexing requires active feed-forwarding of PNR detector outcomes, which can be implemented using 2 × 2 "crossbar" switches. These switches operate in two modes: a bar that effects the identity and a cross that effects the SWAP gate. This operation can be realised through variable-transmissivity beam-splitters, or Mach-Zehnder interferometers with variable phase shifters. A binary tree of these kinds of switches is sufficient to move a successfully prepared state into the correct output port [93].
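The multiplexing requirement can be checked numerically. The following minimal sketch assumes that the GBS devices succeed independently, so that all N_GBS devices fail simultaneously with probability (1 − p_GBS)^N_GBS, which must not exceed p_0. The function name and the numerical values of p_GBS and p_0 are illustrative assumptions, not values taken from this work.

```python
import math


def gbs_devices_needed(p_gbs: float, p0: float) -> int:
    """Smallest number N of devices such that (1 - p_gbs)**N <= p0.

    Assumes independent devices with identical success probability p_gbs.
    """
    if not (0.0 < p_gbs < 1.0 and 0.0 < p0 < 1.0):
        raise ValueError("p_gbs and p0 must lie strictly between 0 and 1")
    # Closed-form estimate, then a short loop to absorb floating-point error.
    estimate = math.log(p0) / math.log(1.0 - p_gbs)
    n = max(1, math.floor(estimate))
    while (1.0 - p_gbs) ** n > p0:
        n += 1
    return n


# Hypothetical numbers: a 1% single-device success rate and a 1% swap-out rate.
print(gbs_devices_needed(p_gbs=0.01, p0=0.01))
```

The logarithmic dependence on p_0 means that lowering the swap-out probability by a constant factor only adds a fixed number of devices, whereas the device count scales roughly as 1/p_GBS for small p_GBS.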
If D is the depth of this tree, then the total number of switches is given by 2^D − 1. For a fixed number of GBS devices N_GBS, the depth should be D = ⌈log_2 N_GBS⌉. In the event that no GBS device successfully produces a GKP state, we include an additional switch at the output of the multiplexing device which can swap in a momentum-squeezed state to replace the output of the multiplexed GBS devices, as shown in Fig. 3.

While the GBS devices can be operated at high repetition rates by designing sufficiently high-bandwidth squeezers, the maximum repetition rate supported by state-of-the-art photon-number-resolving (PNR) detectors based on transition-edge sensors (TESs) is likely limited to a few MHz. While other detectors based on superconducting nanowires are available, these do not meet the number-resolving capabilities required for the suitable operation of the GBS device. To boost the acceptable operating clock frequency, i.e., the pulse repetition rate, time-to-space demultiplexers can be used to step down the repetition rate seen by each PNR channel from that used to pump the GBS devices [94]. For example, to step down a 1024 MHz pulse train to 8 MHz, each block of 128 clock cycles on one output is mapped to 128 separate spatial outputs; this requires 7 layers in a binary tree configuration, with 127 active elements. As the TES readout and data acquisition process takes some time, optical delay lines can be used to buffer the heralded outputs of the GBS device before they reach the space-to-space multiplexers, to provide enough time for the multiplexer routing configuration to be actuated. Assuming this readout time is limited by the rise time of the TES output voltage traces, a few nanoseconds of buffer delay can be used, corresponding to several metres of low-loss optical fiber or integrated waveguide length.

An alternative to demultiplexing is to interleave pulses arriving from multiple GBS devices, which are operating at the timescale of the TES detectors.
Since the light pulses themselves are short enough, pulses arriving from k different devices can be switched into a single spatial mode but at different arrival times that are separated by the faster clock cycle. This alternative does away with the requirement of fast feed-forward switches at the cost of introducing more squeezers and linear optical elements in GBS devices. Either of these alternatives, or a combination thereof, can be chosen based on the actual hardware considerations. In a nutshell, the multiplexing module is responsible for boosting the GKP-generation probabilities and for stepping up the generation rates from the PNR speeds to the computational clock speeds.
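The switch-count arithmetic for the binary trees discussed above can be reproduced with a few lines of code. This is a minimal sketch assuming a full binary tree of 2 × 2 elements, with one element per internal node; the function names are hypothetical. It uses the 1024 MHz to 8 MHz demultiplexing example quoted in the text.

```python
def tree_depth(n_ports: int) -> int:
    """Depth D of the smallest binary tree with at least n_ports leaves."""
    if n_ports < 1:
        raise ValueError("n_ports must be a positive integer")
    # Integer equivalent of ceil(log2(n_ports)), free of rounding error.
    return (n_ports - 1).bit_length()


def switch_count(depth: int) -> int:
    """Number of 2x2 elements in a full binary tree of the given depth."""
    return 2 ** depth - 1


def demux_outputs(pump_rate_mhz: int, detector_rate_mhz: int) -> int:
    """Number of spatial outputs needed to step the pulse rate down."""
    if pump_rate_mhz % detector_rate_mhz != 0:
        raise ValueError("rates must be related by an integer factor")
    return pump_rate_mhz // detector_rate_mhz


outputs = demux_outputs(1024, 8)
depth = tree_depth(outputs)
print(outputs, depth, switch_count(depth))  # prints: 128 7 127
```

For a number of ports that is not a power of two, the count 2^D − 1 is an upper bound, since some branches of the full tree are unused.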
The multiplexed sources therefore emit GKP states [Eq. (2)] with probability 1 − p_0 and momentum-squeezed vacuum states with probability p_0. The outputs of the multiplexed state-generation module are fed into the computational module, which we now describe.
The first step in the generation of the resource states is to create one-dimensional hybrid cluster states that extend in the temporal direction. Recall from Sections 2.1 and 2.3 that both GKP qubit cluster states and CV cluster states require CZ = e^{i q⊗q} gates. Given a single physical CZ gate (implemented via beam-splitters and squeezers as shown in Fig. 16), a linear cluster state can be generated in the time domain using optical delay lines [95][96][97]. Fig. 4a depicts the setup for the generation of this 1D cluster state. The circuit receives as input the states generated using GBS state preparation. The first mode is swapped into the optical delay line, whose length is set equal to the distance between subsequent optical pulses. This mode returns to the interferometer and interacts with the next mode at the CZ gate. At the final step of operation, the cycling light is kicked out of the delay line by the same physical swap gate. This interaction repeats for each of the incoming modes. In this way, a one-dimensional cluster state is generated.
In terms of on-chip implementations, this cluster-state generation involves a pair of fast actively switchable beam-splitters, controllable phase shifters, a delay line, and inline squeezers. The last of these requirements can be eliminated, in principle, by moving to a macro-node approach (see Section 7). The delay line is set to one clock period, and is required to be phase stable; therefore, integrated implementations of this module are preferable. We also note that the clock speeds are ultimately limited by the speeds of the final detections; in our case, these are homodyne detections, which can be faster than the measurements used in other photonic and non-photonic platforms for quantum computation.
Next, additional CZ gates are implemented in the two spatial dimensions to generate the 3D structure of the RHG lattice. Consider a 2D spatial array of 1D time-domain cluster state sources, interspersed by additional state-preparation modules and connected in the spatial domain by a nearest-neighbor array of optical CZ gates, as shown in Fig. 5a. These extra state-preparation modules are broken into two sets, indicated by the green and yellow coloring in Fig. 5a. Half emit states at even clock cycles, and the other half emit at odd clock cycles. The CZ gates are also divided into two sets (indicated by green and yellow coloring in Fig. 5a) and are applied during even and odd clock cycles, respectively. Thus, the additional spatial connectivity of the lattice for even and odd clock cycles is as shown in Fig. 5b. The resulting cluster state has a lattice structure in (2+1) dimensions, as shown in Fig. 5c. After traversing the CZ gates, all modes are sent to homodyne detectors.

Measurement-Based Quantum Computation With a Photonic QPU
The actual computation is performed in the photonic quantum processing unit (QPU), which includes an array of homodyne detector cells and fast classical control. By merely changing the phases of the local oscillator, the QPU can correct the errors that are detected by the decoders (Section 4) and perform the logical computation (Section 5). The homodyne cells implement quadrature measurements on each of the modes at each clock cycle, with the measured quadrature angle controlled by the local-oscillator (LO) phase (footnote 3). In commonly employed implementations, each homodyne detector cell consists of a vertical coupler, an LO channel, a fast phase shifter, a 50/50 beam-splitter, and a pair of photodiodes. This functionality can all be accomplished on a planar silicon photonic chip, e.g., using silicon-germanium for the photodiodes and a standard silicon-photonic phase modulator. We note that modulator loss is not important here, since the phase shifter acts on classical LO light. The photocurrents are subtracted in order to implement balanced detection and suppress LO noise. Each cell of the homodyne layer sends its corresponding photocurrent difference output to an electronic quadrature discriminator layer, which amplifies (via an integrated transimpedance amplifier) and digitizes (via an analog-to-digital converter) the signal, extracting a value for the quadrature measurement. These readout values are sent as inputs to the classical QPU controller.
The QPU controller is a fully classical digital-electronic system responsible for calculating each set of quadrature angles to be implemented on the subsequent clock cycle. This calculation takes as its inputs the quadrature measurement readout values from the photonic QPU on the previous clock cycle, the input state record from the state generation module, and the program instructions (encoding the user's compiled quantum program). After the subsequent clock cycle's quadrature settings are calculated, the information is passed to the photonic QPU to actuate the LO phase shifters before the arrival of the pulses in the subsequent clock cycle. The QPU controller also records the results of the computation by storing the quadrature readout values in (classical) memory, to be decoded and passed back to the user. Though fully classical, the performance requirements of the QPU controller are likely to be substantial, which motivates the development of dedicated digital electronic application-specific integrated circuits operating at very high clock speeds.
Algorithm 1: Quantum error correction procedure for a quantum memory
1. Initialization. Prepare a resource state on N quantum modes corresponding to the nodes of the RHG lattice. With probability 1 − p_0 and p_0, the state of each node is either a noisy GKP state or a finitely-squeezed momentum eigenstate, respectively. Both node states are characterized by a noise variance parameter δ.
2. Measurement. Obtain a list of real-valued outcomes corresponding to p-homodyne measurements on all the modes.
3. Inner decoder. Map the real-valued homodyne outcomes to binary qubit measurement outcomes using local and global information via Algorithm 3.
4. Outer decoder. Apply qubit decoding techniques for the RHG lattice such as those in Algorithm 4 to obtain a recovery operation which has a corresponding CV implementation.
5. Error correction. Perform CV feed-forward operations based on the outcomes obtained and processed in steps 2 and 3. These, combined with the qubit recovery operation obtained in step 4, return the complete CV recovery operation, which can be tracked in software.
At a high level, quantum error correction in the architecture consists of performing homodyne measurements on a subset of nodes of the RHG lattice, followed by processing of the measurement data to output a recovery operation to be applied on the remaining active nodes of the lattice. In our case, the data processing procedure consists of two decoders, the first of which is an inner (CV) decoder that converts the real-valued homodyne measurements into qubit outcomes and probabilities of Z-type qubit-level errors. This information, in turn, is fed into an outer (qubit-level) decoder, which returns an outer recovery operation. As described below, our outer recovery operation can exploit analog information from the inner decoder, resulting in a suitable inner recovery operation to be applied on the physical modes of the system.
Thus, the full error correction procedure is specified by the choice of inner decoder (applied to the GKP code) and the outer qubit code (applied to the RHG lattice). We first introduce a noise model for our hybrid lattice in the next subsection. The inner and the outer decoders are tailored to both the noise model and the hybrid GKP/squeezed-state structure of our architecture. The step-wise procedure for implementing the quantum error correction procedure on a quantum memory is overviewed in Algorithm 1.

Error Model
In order to motivate our choice of inner decoder and check its efficacy, we first construct and analyze a simple noise model for our hybrid RHG lattice. This section summarizes the noise model and main conclusions that we draw from it, with full details available in Appendix A.
A reasonable model, which is standard in the CV literature, for capturing part of the noise effect of finite-energy GKP states is obtained by the application of a Gaussian noise channel [98] to the ideal GKP states with noise of variance δ/2 [1,7] in both quadratures, with δ = ∆² from Eq. (2). While this noise model does not capture the peak-damping envelope of Eq. (2), it captures the finite width added to each delta function in phase space. In our case, we find that the same noise model framework can be used to model the replacement of |+⟩_gkp states with p-squeezed states, setting the corresponding parameter in Eq. (3) equal to ∆² = δ. In particular, we notice that adding Gaussian noise of variance δ/2 (1/(2δ)) in the p (q) quadrature makes the |+⟩_gkp state mimic the Wigner function of a mixture of p-squeezed states. In the context of Eq. (6), the noise matrices for GKP and p-squeezed states are given by

$$Y_{\mathrm{gkp}} = \begin{pmatrix} \delta/2 & 0 \\ 0 & \delta/2 \end{pmatrix}, \qquad Y_{p} = \begin{pmatrix} 1/(2\delta) & 0 \\ 0 & \delta/2 \end{pmatrix},$$

with entries ordered as (q, p). More concretely, the |+⟩_gkp Wigner function has rows of positive peaks periodically arranged in phase space along even integer multiples of √π in the p quadrature, and alternating positive and negative peaks for odd multiples (see Fig. 1a of [99]). The broad distribution in q from Y_p causes the rows of positive and negative peaks to cancel, and the rows of positive-only peaks to add, washing away the Wigner negativity and yielding a distribution mimicking a mixture of p-squeezed states spaced by even multiples of √π in p. While this is not a true p-squeezed state, we do not expect it to provide an underestimation of the error probability of the quantum memory, especially since a mixture of states (as opposed to a pure p-squeezed state) would only add more noise and hence make the decoding problem more difficult.
Given these two types of initial states, both modelled as GKP states having undergone independent and different Gaussian noise channels, we then model the encoding into the RHG lattice, which simply consists of repeated applications of CZ gates. Propagating the initial state noise through the CZ gates results in a correlated Gaussian noise channel, where the correlations depend on the locations of p-squeezed states and on the lattice-dependent pattern of CZ gates applied to the nodes. We assume that the dominant source of noise is the noise in the input states. Additional noise sources include photon loss and noise introduced in CZ gates, which we leave to analyze or improve in future work.
From our model, we can formally write down the distribution of p-homodyne data. Since all the modes are measured in the p quadrature when the computer is operating as a quantum memory, we can use this model for the distribution to inform our choice of inner decoder. In the case of no initial-state noise, sampling from the distribution of p-homodyne outcomes would simply correspond to sampling a lattice point n√π in p-space, where n is dictated by the qubit state of the RHG lattice. However, under the correlated Gaussian noise channel in our model, we find that each lattice point in p-space is converted into a correlated Gaussian distribution centered at the same point with covariance matrix Σ_p. Here, Σ_p is the momentum part of the covariance matrix for the Gaussian peaks of the Wigner function in the phase space for the state of our hybrid lattice, as we show in Appendix A. Σ_p contains the aforementioned correlations and can be used to our advantage in the inner decoder, as we show in the next section.
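To make the propagation of input-state noise through the CZ gates concrete, the following sketch computes the momentum block of the noise covariance for a small graph. It rests on several assumptions that are not fixed by the text above: quadratures are ordered as (q_0, ..., q_{n−1}, p_0, ..., p_{n−1}); the CZ network with adjacency matrix A acts as q → q, p → p + A q; the noise covariance transforms as Y → S Y Sᵀ; and each node carries independent noise of variance δ/2 in p, with variance δ/2 (GKP) or 1/(2δ) (p-squeezed) in q. The function name, the five-node star graph and the value of δ are illustrative.

```python
import numpy as np


def momentum_noise_covariance(adjacency, is_p_squeezed, delta):
    """Momentum block of S Y S^T for a CZ network with the given adjacency."""
    A = np.asarray(adjacency, dtype=float)
    squeezed = np.asarray(is_p_squeezed, dtype=bool)
    n = A.shape[0]
    # Input noise: p-squeezed nodes are broad in q, GKP nodes are not.
    var_q = np.where(squeezed, 1.0 / (2.0 * delta), delta / 2.0)
    var_p = np.full(n, delta / 2.0)
    Y = np.diag(np.concatenate([var_q, var_p]))
    # Symplectic matrix of the CZ network in (q, p) block ordering.
    S = np.block([[np.eye(n), np.zeros((n, n))], [A, np.eye(n)]])
    sigma = S @ Y @ S.T
    return sigma[n:, n:]


# Star graph: node 0 (p-squeezed) connected to four GKP neighbours 1-4.
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0
sigma_p = momentum_noise_covariance(star, [True, False, False, False, False], delta=0.1)
print(np.round(sigma_p, 3))
```

Under these assumptions, node 0 decouples from its neighbours in the momentum block, while the broad q noise of the p-squeezed node appears as a common shift on the momenta of all four neighbours, i.e., as a large eigenvalue along the direction (0, 1, 1, 1, 1).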

Inner Decoder
As described above, an inner decoder T is a function that takes real-valued homodyne data and outputs binary data interpreted as qubit measurement outcomes, i.e., $T : \mathbb{R}^N \to \{0, 1\}^N$. These qubit outcomes can then be combined into stabilizer measurement outcomes and used in the subsequent decoding procedure of the outer code [42]. Additionally, we use our model for noise and the inner decoder strategy to calculate (marginal or correlated) probabilities of qubit error in our readout, which in turn can then be used to inform our outer decoder strategy that we outline in the following subsection. The standard map from homodyne measurement outcomes to qubit measurement outcomes is a binning function derived from the translational symmetry of the original GKP state, i.e., the perfect periodicity in the q and p directions.

Algorithm 2
1. Apply the integer-valued change of basis T p = p′. We note that the column vectors of this transformation are eigenvectors of Σ_p in this case.
2. Bin the first component of p′ to the nearest integer multiple of √π to return n_0 √π, since the p quadrature outcome of mode 0 is uncorrelated from the others.
3. Of the last three components of p′, find the component i that is closest to an integer multiple n_i of √π. Round p′_i to n_i √π. We only choose the last three components since we do not trust the second component, which corresponds to homodyne results along (0, 1, 1, 1, 1) and has excessive noise of order 1/(2δ).
4. If n_i is even (odd), round the remaining two components other than p′_0, p′_1 and p′_i to the nearest even (odd) integer multiples of √π. This yields √π v′ = √π (n_0, p′_1/√π, n_2, n_3, n_4), because on applying the change of basis T to an integer vector, the last four components of the new vector should either all be even or all be odd.
5. If (n_2 + n_3 + n_4) mod 4 = 0, 1, 2, 3, then round p′_1 to the nearest n_1 √π with the constraint that n_1 mod 4 = 0, 3, 2, 1, respectively. This yields √π n′ = √π (n_0, n_1, n_2, n_3, n_4). Again, this is because on applying the change of basis T to an integer vector, the second component and the last three components respect this rule, so this guess should respect it too.
6. Undo the change of basis on the integer-valued vector: T⁻¹ n′ = n.
7. Take n mod 2 = s to be the five-component binary string output.
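The steps of Algorithm 2 can be sketched in code as follows. The change-of-basis matrix used here is an assumption: it is a Hadamard-like integer matrix chosen because it reproduces the parity and mod-4 consistency rules quoted in steps 4 and 5, and its row order and signs are illustrative. The function names are likewise hypothetical.

```python
import numpy as np

SQRT_PI = np.sqrt(np.pi)

# ASSUMPTION: illustrative integer change of basis; rows 2-5 are orthogonal
# and row 2 points along the noisy direction (0, 1, 1, 1, 1).
T = np.array([
    [1, 0,  0,  0,  0],
    [0, 1,  1,  1,  1],
    [0, 1,  1, -1, -1],
    [0, 1, -1,  1, -1],
    [0, 1, -1, -1,  1],
])


def nearest_with_residue(x, residue, modulus):
    """Nearest integer n to x satisfying n % modulus == residue."""
    return int(residue + modulus * np.rint((x - residue) / modulus))


def decode_five_modes(p):
    """Map five p-homodyne outcomes (mode 0 = p-squeezed node) to five bits."""
    x = T @ np.asarray(p, dtype=float) / SQRT_PI  # step 1, in units of sqrt(pi)
    n = np.zeros(5, dtype=int)
    n[0] = int(np.rint(x[0]))                     # step 2: mode 0 is uncorrelated
    # Step 3: among the three low-noise components, trust the one that is
    # closest to an integer.
    tail = x[2:]
    i = 2 + int(np.argmin(np.abs(tail - np.rint(tail))))
    n[i] = int(np.rint(x[i]))
    parity = int(n[i] % 2)
    # Step 4: the other two low-noise components must share that parity.
    for j in (2, 3, 4):
        if j != i:
            n[j] = nearest_with_residue(x[j], parity, 2)
    # Step 5: the noisy component is fixed modulo 4 by the other three.
    residue = int(-(n[2] + n[3] + n[4]) % 4)
    n[1] = nearest_with_residue(x[1], residue, 4)
    # Steps 6 and 7: undo the change of basis and reduce modulo 2.
    n_original = np.rint(np.linalg.solve(T.astype(float), n.astype(float)))
    return n_original.astype(int) % 2


n_true = np.array([3, -2, 5, 0, 7])
ring = np.array([0, 1, 1, 1, 1])
print(decode_five_modes(SQRT_PI * (n_true + 0.3 * ring)))
```

With this choice of matrix, a common shift of all four neighbouring momenta only moves the second transformed component, so the decoder returns either the unshifted bit string or the same string with bits 1 to 4 all flipped, never a partial flip.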

Algorithm 3: Inner decoder
Input: Vector p = (p_0, ..., p_N) of homodyne measurement outcomes, with p_i ∈ ℝ, and the noise model.
1. Identify directions that are noisy and those that are not using the noise matrix.
2. Perform a suitable change of basis to the homodyne data to obtain CV results for joint quadratures, a smaller number of which have reduced noise. In particular, an integer-valued transformation would allow for certain consistency checks (e.g. parity) when making a guess for the p-space lattice point n.
3. Apply binning along the new directions to round results to nearest ideal peak position, taking into account self-consistency of the results.
4. Undo the change of basis to return a candidate lattice point n √ π.
5. Obtain a binary string by taking n mod 2.

Output: Interpreted qubit measurement outcome
The |+⟩_gkp and |−⟩_gkp states are each 2√π-periodic in momentum but shifted relative to each other by √π. Therefore we can place the homodyne outcomes into bins of width √π that are centred at integer multiples of √π, associating with |+⟩_gkp (|−⟩_gkp) the outcomes that fall in bins centred about even (odd) integer multiples of √π. We refer to this procedure as "standard binning". While this binning procedure uses the original symmetry of the GKP states, it does not account for the correlations in the covariance matrix introduced by the CZ gates and the presence of p-squeezed states, as described in the error model.
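As a minimal illustration, standard binning amounts to rounding each outcome to the nearest integer multiple of √π and keeping the parity of that integer. The function name and the sample outcomes below are hypothetical.

```python
import math

SQRT_PI = math.sqrt(math.pi)


def standard_binning(outcomes):
    """Return 0 (1) for outcomes nearest an even (odd) multiple of sqrt(pi)."""
    return [round(p / SQRT_PI) % 2 for p in outcomes]


# Each mode is treated independently; correlations between modes are ignored.
print(standard_binning([0.1, 1.7, -1.9, 3.6]))  # prints: [0, 1, 1, 0]
```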
As a key proof-of-concept improvement to illustrate the importance of taking correlations into account, consider the example of a momentum-squeezed state at the centre of a primal face of the RHG lattice, which we denote as node 0, surrounded by four neighboring GKP states on nodes 1-4. For simplicity, in this example we assume that all the continuous-variable CZ gates are the same, but this trivially generalizes if the signs of the CZ gates change. The joint quadrature p_0 + Σ_{j=1}^{4} q_j has a large variance on the order of 1/(2δ). Without using the correlations, the naïve inner decoder described above would result in a high-strength dephasing channel on the four neighboring GKP qubits, since the marginal distributions along p_j would be broadened by 1/(2δ) and standard binning does not leverage correlations between nodes. On the other hand, by taking correlations into account, the high covariance along the joint quadrature will result in either the identity gate, or a correlated four-body ring of Z operators on the neighboring qubits, which acts trivially on the code space.
More explicitly, consider the binning strategy that makes use of the correlations between optical modes, based on the momentum part of the noise matrix resulting from the application of the CZ gates, where we label the modes 0, 1, ..., 4, with the momentum-squeezed state corresponding to mode 0. This noise matrix is non-diagonal, i.e., the CV noise is correlated, but it has a specific structure that can be exploited. Two immediate observations are that mode 0 is uncorrelated from the other modes, meaning we can simply apply standard binning to it; and that there is correlated noise along the direction (0, 1, 1, 1, 1) in p-space. Algorithm 2 presents a strategy for dealing with this correlated noise, taking into account consistency checks that our guesses for modes 1-4 must respect.
In general, the problem of finding a better inner decoder for our hybrid architecture is to find a decoder that takes into account the location of GKP and p-squeezed states, and knowledge of the structured CZ gates that have been applied to form the cluster state.
The distribution of p-homodyne outcomes consists of a periodic arrangement of Gaussian distributions, all with covariance Σ_p on N modes, each Gaussian centred at a point n√π, where n are integer-valued vectors from a set that corresponds to the ideal state of the qubits. Suppose we obtain the values p after the homodyne measurements. If we assume p could have resulted from a Gaussian distribution centered at any of the lattice points n, then the so-called responsibility [100] of a given lattice point for the result p is given by

$$r(\mathbf{n}\,|\,\mathbf{p}) = \frac{\exp\!\left[-\tfrac{1}{2}(\mathbf{p} - \mathbf{n}\sqrt{\pi})^{T}\,\Sigma_p^{-1}\,(\mathbf{p} - \mathbf{n}\sqrt{\pi})\right]}{\sum_{\mathbf{n}'}\exp\!\left[-\tfrac{1}{2}(\mathbf{p} - \mathbf{n}'\sqrt{\pi})^{T}\,\Sigma_p^{-1}\,(\mathbf{p} - \mathbf{n}'\sqrt{\pi})\right]}.$$

The responsibility is directly related to the Gaussian distributions at each lattice point and provides a relative way of ordering which lattice points were most likely to have generated p. Specifically, the lattice point which was most likely to have produced the point p is

$$\mathbf{n}_{\mathrm{IQP}} = \operatorname*{arg\,min}_{\mathbf{n}}\;(\mathbf{p} - \mathbf{n}\sqrt{\pi})^{T}\,\Sigma_p^{-1}\,(\mathbf{p} - \mathbf{n}\sqrt{\pi}),$$

where we have chosen the subscript IQP to indicate that this is an integer quadratic program, i.e., a minimization of a quadratic function over an integer domain. As mentioned above and for simplicity, we are using the standard approximation that all peaks in the GKP state have equal weight [1]. However, one could also include an envelope that weights peaks differently, in which case this information could also be included in the calculation of the responsibility. In general, integer quadratic programs are NP-hard [101,102], so we will require a heuristic strategy that is computationally tractable. Our approach for a generalized version of Algorithm 2 is summarized in Algorithm 3, with the case of more complicated configurations of p-squeezed states left to future study.
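For a handful of modes, the integer quadratic program can be solved by exhaustive search, which is useful as a reference when testing heuristics. The sketch below makes simplifying assumptions that go beyond the text: it searches all integer vectors within a fixed radius of the standard-binning guess, rather than only the lattice points allowed by the ideal qubit state, and it normalizes the responsibility over the enumerated candidates only. The function name, the two-mode covariance and the outcomes are illustrative.

```python
import itertools

import numpy as np

SQRT_PI = np.sqrt(np.pi)


def iqp_brute_force(p, sigma_p, radius=1):
    """Exhaustive search for the lattice point minimizing the quadratic form.

    The cost grows as (2 * radius + 1) ** len(p), so this is only a
    reference implementation for very small examples.
    """
    p = np.asarray(p, dtype=float)
    precision = np.linalg.inv(np.asarray(sigma_p, dtype=float))
    center = np.rint(p / SQRT_PI).astype(int)
    steps = range(-radius, radius + 1)
    candidates = np.array(
        [center + np.array(offset) for offset in itertools.product(steps, repeat=len(p))]
    )
    residuals = p - SQRT_PI * candidates
    cost = np.einsum("ki,ij,kj->k", residuals, precision, residuals)
    # Responsibilities restricted to the enumerated candidates.
    weights = np.exp(-0.5 * (cost - cost.min()))
    responsibilities = weights / weights.sum()
    best = int(np.argmin(cost))
    return candidates[best], float(responsibilities[best])


# Two strongly correlated modes: independent rounding would give (1, 0).
sigma = np.array([[1.0, 0.9], [0.9, 1.0]])
outcomes = SQRT_PI * np.array([0.7, 0.45])
print(iqp_brute_force(outcomes, sigma))
```

In this example the positively correlated noise makes a common shift of both modes far more likely than opposite shifts, so the minimizer differs from the point obtained by rounding each mode independently.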
A few comments are in order. The weights of the matching graph edges in Algorithm 4 are derived from the homodyne measurement outcomes, as well as the positions of the p-squeezed states in the lattice. An example of such weights is presented in Section 6. Furthermore, using the homodyne measurement outcomes to calculate matching graph weights has been explored in the context of the toric code [7,89], but the knowledge of the locations of the p-squeezed states gives us additional information that can be used to improve the performance of the decoder. We discuss this point in more detail in Section 6.
As mentioned earlier, due to the measurement-based computation model, feed-forward operations based on the outcomes obtained from the homodyne measurements and the inner decoder are combined with the qubit-recovery operation obtained from the outer decoder. Together, these inform the complete CV recovery operation that needs to be applied to the active computation layers. In practice, the combined recovery operation need not actually be applied on the qubits; instead, we would keep track of the recovery operations in classical control programming by updating the Pauli frame [80,124].

Fault-Tolerant Universal Quantum Computation
The hybrid architecture from Section 3 allows for the generation of a (2+1)D cluster state suitable for performing scalable quantum computation. For completeness, we present a possible scheme for the fault-tolerant implementation of logical algorithms on such a state. The scheme we choose and review here is based on lattice surgery for the surface code [125][126][127][128], particularly its measurement-based version [88,129]. Note that while the schemes considered here have lower overheads [127] than other surface-code-based approaches, they are heuristics, since finding an optimal implementation is NP-hard [130]. This section overviews the fundamental components needed to perform fault-tolerant quantum information processing in our architecture. We use as examples codes of distance 5, the distance being the weight of the smallest representative of a logical operator. Inner (GKP) code states will be denoted by a subscript, whereas outer (RHG) code states will have a bar. The reader may refer to Fig. 2 for clarity on the terms primal, dual, rough, and smooth. We use the data and ancilla qubit nomenclature from the foliated picture [83] where useful; see Fig. 6 for more details. While much of this section is meant as a review for a photonics audience and might be familiar to an expert in fault tolerance, it does highlight specific points about how these logical operations could be implemented in our architecture.

Figure 6: Top view of the measurement pattern defining a patch encoding two logical qubits with distance 5. The black nodes, which are measured in the q basis to disentangle them from the cluster, define a patch consisting of the red, green and blue nodes. The grey nodes either belong to other patches or are measured in the q basis as well. The red, blue, and green nodes are eventually all measured in the p basis. The red nodes are present in both primal and dual sheets of the foliated surface code, while the blue (green) ones are only present in the primal (dual) sheet. The red nodes represent data qubits, while the blue and green ones are ancillas. Representatives of the Pauli logical operators X̄ (Z̄) are highlighted by shaded (lighter unshaded) grey boxes. They consist of strings of Pauli X's or Z's applied to the circled data qubits. The numbers next to the boxes identify the affected logical qubit. For simplicity, the Pauli operators shown above are applied on a primal sheet. For dual sheets, they would be conjugated by Hadamard gates. Legend: mode belonging to a different patch; Q-measured mode, all sheets; P-measured mode, primal and dual sheets; P-measured mode, primal sheets; P-measured mode, dual sheets; representative X̄; representative Z̄.
Logical Qubits. In lattice surgery, quantum information is encoded by way of patches. After the chip produces a large RHG lattice, homodyne measurements in the q quadrature disentangle certain nodes from the lattice, effectively creating holes. We refer to these q-measured regions as gaps, and we define a patch as a continuous part of the lattice completely surrounded in the spatial dimensions by a gap (see footnote 4). Fig. 6 shows an example of a patch and how two logical qubits are encoded with it. A patch can be deformed to move around the logical information and minimize overheads. Patch deformations can be achieved by changing the q homodyne measurements to p in the gap surrounding the patch (without connecting it to other patches), followed by a sufficient number of rounds of error correction.
Footnote 4: The gap region is referred to as the vacuum in [41], but we forgo this terminology to avoid confusion with the more common notion of vacuum. Note also that the patches we consider here do not contain defects, that is, internal gaps.
State Initialization. The state of a single logical qubit can be initialized in either |+⟩ or |0⟩ by measuring the ancillary qubits of the first temporal layer in the p or q quadrature, respectively, while the rest of the measurements are all performed in the p quadrature. The initialization can be performed fault-tolerantly by measuring a number of layers that scales linearly in the code distance and then performing error correction as described in Section 4. Alternatively, the error correction can be performed during a subsequent patch deformation [128]; the latter is more resource-efficient, and is thus the approach we favour.

Logical Z̄ and X̄ Operators and Measurements.
Logical Z̄ (X̄) gates in the code are effected through chains of physical Z (X) operations connecting the appropriate borders, that is, √π displacements along the p (q) quadrature, in a primal sheet. The Z and X physical operations are reversed when applied on a dual sheet. While we do not apply logical operators in that manner in this work, they are helpful to understand multi-qubit Pauli measurements.
Destructive logical Pauli measurements can be effected through homodyne measurements on the active layer of data qubits of the patch. Measuring a primal sheet in the q (p) quadrature will result in a logical Z (X) measurement; for dual sheets the measurement basis is swapped. The actual measurement outcome depends on the parity of the results along the corresponding chain from the previous paragraph. This procedure can be viewed as the inverse of state initialization. For non-destructive logical Pauli measurements, a set of ancilla qubits can be coupled to a single data qubit each, followed by homodyne measurements in the p basis. Once again, the clock cycle at which the data and ancilla qubits are coupled determines whether Z̄ or X̄ is measured. Performing error correction is required to reliably infer the logical measurement outcome.

Multi-Qubit Operations.
Merging and splitting different patches allows one to perform logical entangling gates and multi-qubit Pauli measurements. Merging (splitting) is achieved by changing the measurement pattern from q to p (p to q) in the gap between patches, as illustrated in Fig. 7. As in the case of deformation, error correction must be performed after the merging for a fault-tolerant implementation (see footnote 5).
Figure 7: Two neighboring patches before merging (or after splitting) in (a), and the patches after merging (or before splitting) in (b). The legend used is the same as in Fig. 6. Measurements of the nodes in between two patches in (a) are switched from the q quadrature to p, resulting in the measurement of the new surface code stabilizers. The merged patch encodes a single logical qubit. After five rounds of error correction, corresponding to the code distance, the measurement outcomes of the logical operator X_1 ⊗ X_2 can be inferred, as indicated by the shaded boxes shown in (b), which corresponds to the product of the new stabilizers, shown in yellow. The splitting of the patch in (b) is achieved by measuring the data nodes in between the two patches in the p quadrature and the ancillas in the q quadrature for a number of rounds of error correction sufficient for fault tolerance, after which the data nodes can also be measured in the q quadrature. After the splitting process, the logical state of the system transforms according to α|0⟩ + β|1⟩ → α|00⟩ + β|11⟩.
An adjacent ancillary patch can be used to measure the tensor product of a Pauli operator P associated with logical qubits living on different patches. This ancilla patch does not encode logical information; it simply consists of a tensor product of physical |+⟩ states. In order to perform the measurement, the ancilla patch is merged via the relevant boundaries of the patches encoding the logical qubits. Measuring the additional stabilizers along the concerned boundaries associated with P in the ancilla patch and performing error correction allows one to fault-tolerantly measure P. A specific example of this process is illustrated in Fig. 8.

Magic State Injection and Distillation.
Homodyne detection on GKP qubits alone does not give us access to the non-Pauli measurements used in the original proposal [41]. It is therefore necessary for us to inject physical magic states |m⟩ = |0⟩_gkp + e^{iπ/4} |1⟩_gkp created by the GKP factories into the RHG lattice. This process, illustrated in Fig. 9, requires some classical post-selection (which can be performed in parallel to ensure a high probability of success) and a subsequent d rounds of error correction [132]. In our case, the error correction is performed after the patch merging step of the multi-qubit Pauli measurement, as described in the previous paragraph.
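As a small numerical check on the state being injected, the following sketch writes |m⟩ as a two-component vector in the {|0⟩, |1⟩} qubit basis and confirms that it coincides with T|+⟩, where T = diag(1, e^{iπ/4}). The 1/√2 normalization and the helper names `magic_state`, `apply_t`, and `close` are assumptions introduced for illustration; the text writes the state unnormalized and does not model the GKP wavefunction here.

```python
import cmath
import math

PHASE = cmath.exp(1j * math.pi / 4)  # the relative phase e^{i pi/4}

def magic_state():
    """Normalized |m> = (|0> + e^{i pi/4}|1>)/sqrt(2) as [amp0, amp1].

    The normalization factor is an assumption added for this sketch.
    """
    norm = 1 / math.sqrt(2)
    return [norm, norm * PHASE]

def apply_t(state):
    """Apply T = diag(1, e^{i pi/4}) to a single-qubit state vector."""
    return [state[0], PHASE * state[1]]

def close(a, b, tol=1e-12):
    """Component-wise comparison of two small state vectors."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # the |+> state
is_t_plus = close(apply_t(plus), magic_state())
```

This only illustrates why consuming |m⟩ gives access to a T-type rotation; it says nothing about the noise of the injected state, which is what distillation addresses.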
One can distill multiple copies of noisy magic states into a single high-fidelity state. This process is necessary in our architecture in order to obtain an encoded magic state with low enough noise to be able to implement logical gates. Several different magic state distillation procedures exist [133–138]. The details of the noise afflicting the hardware as well as the logical algorithm specified by the user will inform the preferred choice.

Running a Logical Quantum Computation.
Before running the algorithm on the hardware, it ought to be compiled in such a way that all logical Clifford operations are commuted through to the end of the circuit and absorbed into the logical measurements. This turns the single logical Pauli measurements into a sequence of multi-qubit Pauli measurements.
The remaining non-Clifford multi-qubit rotations are performed by consuming a (logical) magic state [127,140], as shown in Fig. 10. Since logical T-gates are not native to the surface code, a high-quality logical magic state must be injected into the computation. Magic state distillation [133] can be used to produce a high-quality encoded magic state starting from several lower-quality ones. As the distillation is resource intensive, several algorithms for the minimization of the number of non-Clifford operations can be used during compilation [141–147]. Running a compiled logical quantum algorithm essentially decomposes into two stages. First, the non-Clifford logical rotations are implemented. Each one of these consumes a distilled logical magic state and requires the measurement of a multi-qubit logical Pauli operator and of single-qubit logical X operators. Depending on the measurement outcomes, some additional Pauli operations may need to be implemented. These can be commuted to the end of the circuit, in a procedure known as modifying the Pauli frame [124]. Second, once all the non-Clifford operations have been performed, the multi-qubit logical Pauli measurements are performed.
Footnote 5: For the case of patch merging, it is sometimes useful to introduce twist defects in the process [126,131]. This allows for the measurement of a multi-qubit logical Pauli operator involving a logical Y, alleviating the cost of distilling the eigenstates of the Pauli-Y operator |y⟩ to implement logical P gates with the surface code [80,104].
Figure 8: [...] Fig. 6 for the legend. In (a), an ancilla region is prepared by initializing the data nodes into a tensor product of |+⟩ states (on a primal sheet) by measuring the ancilla nodes in the q quadrature. A dislocation region, within the ancillas, is aligned with the meeting point of the rough and smooth boundaries of the two-qubit patch. This specific geometry is chosen so that the product of stabilizers, identified by yellow nodes, gives the desired product of logical operators, represented by the grey boxes in (b). Note that a Ȳ_2 representative has support on the qubits in both the shaded and unshaded horizontal boxes. After five rounds of error correction (5 being the distance of the code), the measurement outcomes can be inferred from the measurement results.
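The Pauli-frame bookkeeping mentioned above can be sketched in a few lines. In this illustration, pending Pauli corrections are recorded rather than applied, and a later Pauli measurement outcome is sign-flipped whenever the measured operator anticommutes with the recorded frame. The dictionary representation of Pauli strings and the names `anticommutes` and `PauliFrame` are hypothetical choices for this sketch, and global phases are ignored.

```python
def anticommutes(p, q):
    """Return True if Pauli strings p and q anticommute.

    Paulis are dicts {qubit_index: 'X' | 'Y' | 'Z'}; missing qubits are identity.
    Two strings anticommute iff they differ on an odd number of shared qubits.
    """
    clashes = sum(1 for qubit, op in p.items() if qubit in q and q[qubit] != op)
    return clashes % 2 == 1

class PauliFrame:
    """Tracks deferred Pauli corrections instead of applying them."""

    def __init__(self):
        self.frame = {}  # accumulated Pauli string, phases ignored

    def record(self, pauli):
        """Multiply a new correction into the frame (up to a phase)."""
        for qubit, op in pauli.items():
            current = self.frame.get(qubit)
            if current is None:
                self.frame[qubit] = op
            elif current == op:
                del self.frame[qubit]  # P * P = identity
            else:
                # The product of two distinct Paulis is the third one.
                self.frame[qubit] = ({"X", "Y", "Z"} - {current, op}).pop()

    def interpret(self, measured_pauli, raw_outcome):
        """Reinterpret a raw +1/-1 outcome given the deferred corrections."""
        if anticommutes(self.frame, measured_pauli):
            return -raw_outcome
        return raw_outcome

frame = PauliFrame()
frame.record({0: "X"})                            # deferred X on logical qubit 0
flipped = frame.interpret({0: "Z", 1: "Z"}, +1)   # Z0 Z1 anticommutes with X0
unchanged = frame.interpret({0: "X"}, +1)         # X0 commutes with X0
```

The design choice here is that no physical operation is ever triggered by `record`; only the classical interpretation of later outcomes changes.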
Compatibility With Hybrid Architecture. Our proposed architecture is well-suited to accommodate modifications to these general steps. The different layouts for the magic state factories, the data and the ancilla blocks, and the boundary conditions of the data patches can be easily modified by an appropriate selection of the homodyne measurement quadratures. Note that as long as the fraction of swap-outs is below a critical value (which depends on the squeezing ∆), the general structure of our computing scheme carries through in the presence of swap-outs: both the GKP states in the lattice and p-squeezed states encode |+⟩ and act appropriately under the physical operations we discussed [43]. However, the swapped-out nodes add correlated noise to their neighborhood, an effect we deal with in the error-correction procedure, which we address in the following section.

Threshold Estimation for a Quantum Memory
The operation of the architecture as a fault-tolerant quantum computer is underpinned by the concept that the logical error rate of an encoded computation can be arbitrarily lowered by increasing the size of the code. This concept is based on the idea of fault-tolerance thresholds. The existence of such thresholds for qubit-based architectures has been a subject of extensive research for over twenty years [51,73,74,104,148–156], but the existence of thresholds for CV-based architectures [1] is less well understood. Furthermore, the question of whether hybrid architectures remain fault-tolerant with probabilistic sources of GKP qubits is not obvious.
Figure 9: The three main steps for state injection of a physical magic state |m⟩ into a logical magic state |m̄⟩ of distance d_2. A physical magic state and enough data nodes to prepare a distance d_1 < d_2 code are initialized according to the pattern shown in (a). The stabilizers are then measured for two rounds, shown in (b). If an error is detected, the process is restarted. If no errors are detected, the patch is deformed with data states appropriately chosen until reaching a larger code with distance d_2, illustrated in (c), and followed by error correction. To ensure a high probability of success, provided that d_2 is large enough, the first stage of the process can be performed in parallel, keeping a single successful instance.
Figure 10: In the compilation method we use, non-Clifford gates as in (a) can be implemented by consuming a logical magic state |m̄⟩, as in (b). The first gate in circuit (b) represents the measurement of P̄ ⊗ Z̄, with P̄ a Pauli operator. Since both the gates e^{−i(π/4)P̄} and P̄ belong to the Clifford group, they can be commuted to the end of the circuit and absorbed in the multi-qubit Pauli measurements. These circuits (drawn using Quantikz [139]) can be straightforwardly generalized to an arbitrary number of qubits.
Here we provide numerical evidence that our architecture does indeed have a threshold in the presence of errors arising from finite squeezing and for a range of swap-out probabilities. As we detail in this section, in order to calculate the threshold, we simulate the hybrid architecture operating as a quantum memory and run a complete error-correction procedure [157]. We detail the various steps involved in the simulation of the thresholds in Algorithm 5.
We now briefly review the numerical procedure for estimating the error threshold of a quantum memory. Consider a family of codes of growing size, parameterized by d. In the case of the RHG lattice, d is the code distance (the weight of the minimal-weight nontrivial logical operator) and the number of qubits is n = O(d^3). Another parameter is the noise channel, which in our case is described by two numerical parameters: the noise variance δ and the swap-out probability p_0. To estimate the error threshold, we run many trials of Monte Carlo simulations to determine the logical error rates as a function of our physical noise parameters. This is done for different lattice sizes d. In each trial, we generate homodyne measurement outcomes according to the noise parameters, then we run our error correction procedure (inner decoder followed by outer decoder as described in Section 4), and finally check if error correction has been successful. Let us assume that we fix p_0 and vary δ. Then if a threshold, δ_c, exists, we expect to see the following behaviour. For δ > δ_c, increasing the size of the code (increasing d) increases the logical error rate. But for δ < δ_c, increasing the size of the code exponentially decreases the logical error rate. We note that the largest code sizes we consider involve n ≈ 5000 qubits. Simulation of such a large number of qubits is possible due to the fact that we use a classical noise channel to model approximate GKPs and the circuits we simulate belong to the Clifford group, which makes them efficiently classically simulable [158].
Figure 11: X correlation surface in the RHG lattice. The blue circles are the primal qubits, the green circles are the dual qubits, and the pink circles are the primal qubits in the X correlation surface. The yellow highlighted edges represent Z operators, i.e. primal qubits on yellow highlighted edges have a Pauli Z applied to them. In (a) we show a logical identity operator that commutes with all the stabilizers and the correlation surface, whereas in (b) we show a non-trivial logical operator that commutes with all the stabilizers but does not commute with the correlation surface.
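The loop structure of such a threshold study can be illustrated with a deliberately simple stand-in. The sketch below is not the RHG-GKP simulation: it assumes a classical repetition code of odd size d with independent bit flips of probability p and a majority-vote decoder, whose threshold is p = 0.5. The function names, trial count, and seed are hypothetical. What it shares with the procedure above is the pattern of sampling noise, decoding, checking success, and comparing logical failure rates across code sizes.

```python
import random

def logical_failure_rate(d, p, trials, rng):
    """Monte Carlo estimate of the failure rate of a size-d repetition code.

    Toy stand-in: each of the d bits flips independently with probability p,
    and majority-vote decoding fails when more than half of the bits flipped.
    """
    failures = 0
    for _ in range(trials):
        flips = sum(rng.random() < p for _ in range(d))
        if flips > d // 2:
            failures += 1
    return failures / trials

def sweep(distances, noise_values, trials=20000, seed=7):
    """Failure rates for every (d, p) pair, keyed by d."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return {
        d: [logical_failure_rate(d, p, trials, rng) for p in noise_values]
        for d in distances
    }

# One noise value below the toy threshold (0.1) and one above it (0.6).
rates = sweep([3, 5, 7], [0.1, 0.6])
below_threshold_improves = rates[7][0] < rates[3][0]
above_threshold_worsens = rates[7][1] > rates[3][1]
```

In the full simulation the sampling step would generate homodyne outcomes from δ and p_0, and the decoding step would be the inner and outer decoders of Section 4.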
While the other steps of Algorithm 5 are relatively straightforward, we explain the success-check step of Algorithm 5. After applying the recovery operation, all the cluster state stabilizers are guaranteed to be satisfied. Therefore, error correction is successful if the product of the qubit error and the recovery operator is a stabilizer (logical identity operator) and error correction fails if the product of the qubit error and the recovery operator is a non-trivial logical operator. Such operators anti-commute with at least one of the correlation surfaces of the cluster state [40]. Fig. 11 shows the X correlation surface (the Z correlation surface is analogous). To summarize, if the product of the qubit error and the recovery operator anti-commutes with either correlation surface, then error correction has failed.
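The success check can be phrased as a parity computation. In the sketch below, a Z-type operator is represented by the set of primal-qubit indices it acts on, the residual operator is the symmetric difference of the error and the recovery, and an X correlation surface is the set of qubits in its support. The residual anticommutes with a surface exactly when the two overlap on an odd number of qubits. The set representation, the function name `correction_failed`, and the example indices are assumptions made for illustration; the sketch also assumes, as stated above, that the residual already commutes with all stabilizers.

```python
def correction_failed(error, recovery, correlation_surfaces):
    """Return True if error * recovery is a non-trivial logical operator.

    error, recovery: iterables of qubit indices carrying a Pauli Z.
    correlation_surfaces: iterable of qubit-index collections, one per
    X correlation surface (hypothetical toy data in the example below).
    """
    residual = set(error) ^ set(recovery)  # product of two Z-type operators
    return any(
        len(residual & set(surface)) % 2 == 1  # odd overlap: anticommutes
        for surface in correlation_surfaces
    )

surface = {3, 7, 9}                                      # toy correlation surface
exact_undo = correction_failed({1, 2}, {1, 2}, [surface])    # residual is identity
crossing = correction_failed({1}, {2, 3}, [surface])         # residual hits qubit 3 once
```

The same function would be called once for each Pauli type, with the corresponding correlation surfaces.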
The remainder of this section is structured as follows. In Section 6.1, we describe our simulations in detail and compare the performance of different inner and outer decoding strategies. Then, in Section 6.2, we present the threshold simulation results for our architecture operating as a quantum memory.
Figure 12: Performance comparison for the various inner decoders considered for p_0 = 0.06. For (a) and (b), standard binning is used as the inner decoder for every node, and the weights assigned to the edges of the matching graph are either (a) all equal, or (b) assigned following Eq. (12). In (c), Algorithm 2 is first used on GKP nodes connected to isolated momentum states, standard binning is used for the remainders, and weights described by Eq. (12) are used in the matching graph.

Simulation Details
Here we provide some details on the simulations performed to find the thresholds. First, we note that we only simulate error correction of the primal lattice nodes, as the error correction problem for the dual lattice nodes is the same and each problem can be solved independently. We consider RHG lattices with size parameterized by d, where the left and right boundaries are equivalent to distance d surface codes, and there are d layers of nodes in between these two boundaries (see Fig. 11).
We now return to the calculation of the matching graph weights (Step 2 in Algorithm 4). The first step is to construct a decoding graph based on the RHG lattice. For the sake of brevity, we will describe this construction for an RHG lattice with periodic boundary conditions (see [159] for the case of lattices with boundaries). We refer to the elements of the decoding graph as vertices and arcs, to avoid confusion with the nodes and edges of the RHG lattice. The decoding graph has a vertex for each six-body X stabilizer acting on the primal qubits of the RHG lattice. These stabilizers are formed from products of cluster state stabilizers surrounding a dual cell. Vertices are connected by arcs if their corresponding stabilizers share a qubit. As each qubit is in the support of two such stabilizers, the arcs of the decoding graph are in a one-to-one correspondence with the primal qubits of the RHG lattice, and hence with a subset of the modes of the cluster state. We assign weights to the arcs of the decoding graph as follows. Consider the mode q corresponding to an arc in the decoding graph. Let m be the number of swapped-out modes neighboring q and let z be the outcome of the homodyne measurement of q. We assign to this mode a heuristic error probability as follows: (12) If a mode has one swapped-out neighbor, then there are no errors due to swap-outs as the net effect of a single swap-out after applying the CZ gates is a stabilizer. In this case, the error probability is the probability of incorrectly binning the state [7], using the standard binning function and assuming a classical noise channel with parameterδ, which we derive fromΣ p . For m ≥ 2, we derive the weights in Eq. (12) from simulations which we detail in Appendix C. The weight of the corresponding arc in the decoding graph is then − log w(z, m,δ) [105].
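For the case without a swap-out contribution, the conditional probability of incorrectly binning an outcome can be sketched as follows. This assumes one common form of the estimate: a Gaussian noise channel, lattice spacing √π, and a comparison of the likelihood of lattice points whose parity differs from the binned value against all lattice points. It is not a reproduction of Eq. (12); in particular, the values used for m ≥ 2 come from the simulations of Appendix C and are not modeled. The names `binning_error_probability`, `arc_weight`, the parameter name `delta`, and the truncation `n_max` are hypothetical.

```python
import math

SQRT_PI = math.sqrt(math.pi)

def binning_error_probability(z, delta, n_max=10):
    """Probability that standard binning of outcome z gives the wrong bit.

    Assumes Gaussian noise of variance delta around multiples of sqrt(pi).
    n_max truncates the sum over lattice points (an assumption of this sketch).
    """
    n0 = round(z / SQRT_PI)  # standard binning: nearest lattice point
    exponents = {
        n: -((z - n * SQRT_PI) ** 2) / (2 * delta)
        for n in range(n0 - n_max, n0 + n_max + 1)
    }
    shift = max(exponents.values())  # subtract the maximum for stability
    same = sum(math.exp(e - shift) for n, e in exponents.items()
               if (n - n0) % 2 == 0)
    other = sum(math.exp(e - shift) for n, e in exponents.items()
                if (n - n0) % 2 == 1)
    return other / (same + other)

def arc_weight(error_probability):
    """Arc weight as the negative log of the error probability.

    The floor avoids log(0) when the probability underflows.
    """
    return -math.log(max(error_probability, 1e-300))

p_center = binning_error_probability(0.0, 0.05)          # outcome on a lattice point
p_edge = binning_error_probability(SQRT_PI / 2, 0.05)    # outcome on a bin boundary
```

Outcomes near a bin boundary are the least reliable, so they receive error probabilities close to 1/2 and correspondingly small arc weights, which is how the analog information enters the outer decoder.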
Given the decoding graph, we construct the matching graph weights as follows. For each pair of vertices in the matching graph, we compute the total weight of the minimum weight path between the corresponding vertices in the decoding graph using Dijkstra's algorithm [160].
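The construction of the matching-graph weights can be sketched with a standard heap-based Dijkstra search. The adjacency-list format, the function names, and the small example graph are assumptions for illustration; in the actual simulation the vertices would be the six-body X stabilizers and the arc weights would come from Eq. (12).

```python
import heapq

def shortest_path_weights(graph, source):
    """Dijkstra's algorithm: minimum total weight from source to each vertex.

    graph: dict mapping a vertex to a list of (neighbor, weight) pairs,
    with non-negative weights.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def matching_graph(graph, syndrome_vertices):
    """Pairwise minimum path weights between the given syndrome vertices."""
    weights = {}
    for i, u in enumerate(syndrome_vertices):
        dist = shortest_path_weights(graph, u)
        for v in syndrome_vertices[i + 1:]:
            weights[(u, v)] = dist.get(v, float("inf"))
    return weights

# Toy undirected decoding graph (hypothetical vertices and weights).
decoding_graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 1.5)],
    "c": [("a", 4.0), ("b", 1.5), ("d", 2.0)],
    "d": [("c", 2.0)],
}
pair_weights = matching_graph(decoding_graph, ["a", "c", "d"])
```

The resulting dictionary is the complete weighted graph on the syndrome vertices that a minimum-weight perfect matching routine would take as input.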
Many variants are possible for the inner decoder introduced in Section 4.2. In this work, we considered two simple ones. The first is performing standard binning of the homodyne outcomes, irrespective of the presence of momentum-squeezed states in their vicinity. Second, for those momentum-squeezed states which are isolated from others, in the sense that no connected node is also connected to another squeezed state, a variant of Algorithm 2 is used. The modifications are required because of the variable number of neighbors and signs present in the physical application of the CZ gates. We emphasize that, as mentioned in Section 4.2, more complex strategies can be devised and are likely to improve the overall decoding performance.
Simulation results for both possibilities are shown in Fig. 12, for p_0 = 0.06. Fig. 12 (a) shows results for naïve binning using uniform weights in the matching graphs, while (b) uses weights as described in Eq. (12). In (c), Algorithm 2 is used, and weights are given by Eq. (12); we chose p_0 = 0.06 as a representative example to test-drive Algorithm 2, as it is best suited to cases of isolated swap-outs, which are common for this value of p_0. Incorporating the analog information into building the matching graph clearly improves the performance, the threshold decreasing from ∼15.5 dB to ∼12.2 dB, with both variants of the inner decoder. We note that modifying the inner decoder to leverage Algorithm 2 did not result in any significant differences for the thresholds themselves, but the failure rates below threshold are equal or lower using Algorithm 2, as we show in Fig. 13. Quantifying and understanding the origin of this effect is left for future work.

Threshold Results
Now we are ready to present the thresholds of our hybrid architecture. Our first result is the error threshold of the RHG-GKP code with approximate GKP states, which we model as ideal states suffering a random displacement with noise variance δ, as discussed in Section 4.1. In our noise model, this corresponds to the limit of no swap-outs, i.e., p_0 = 0. Similar simulations have been carried out in previous works for the toric-GKP code [89] and the surface-GKP code [7,90]. We use standard binning and matching graph weights derived from Eq. (12). We observe an error threshold of Δ_dB = −10 log_10(δ) ≈ 10.5 dB, which is comparable with results for similar noise models in the aforementioned works. The data are shown in Fig. 14.
Figure caption (partial): We use the fitting procedure described in [122] to systematically obtain our threshold estimate (dashed vertical line).
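The conversion between the noise variance δ and the squeezing value in dB follows directly from the definition quoted above, Δ_dB = −10 log_10(δ). The helper names below are hypothetical; the two example values are the thresholds discussed in this section.

```python
import math

def variance_to_db(delta):
    """Squeezing in dB for noise variance delta: -10 * log10(delta)."""
    return -10 * math.log10(delta)

def db_to_variance(db):
    """Inverse conversion: delta = 10 ** (-dB / 10)."""
    return 10 ** (-db / 10)

delta_at_10p5_db = db_to_variance(10.5)  # roughly 0.089
delta_at_15_db = db_to_variance(15.0)    # roughly 0.032
```

Higher squeezing in dB therefore corresponds to smaller δ, which is why a threshold quoted in dB is a lower bound on the required squeezing.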
As described above, the full noise model we use involves two noise parameters, the noise variance δ and the swap-out probability p_0; the error threshold is a line in (δ, p_0) parameter space rather than a single point. To estimate this error threshold, we run Monte Carlo simulations as described in Algorithm 5 for different values of δ, p_0 and d (the lattice size). For a particular value of p_0, we can extract the corresponding threshold δ value by plotting the logical error probabilities, p_fail, for a range of values of δ and d. The error threshold is then the point where curves for different d intersect. Equivalently, we can instead fix a value of δ and vary p_0 and d. In the inner decoder we use standard binning, and we use matching graph weights derived from Eq. (12) in the outer decoder. Fig. 15 shows the below-threshold region in (δ, p_0) parameter space, alongside an example threshold plot for p_0 = 0.1. We find a high tolerance to swap-outs, with a maximum swap-out threshold of p_0 ≈ 0.236 (for δ = 0). For p_0 = 0, the noise variance error threshold is ≈ 10.5 dB, where the dB value is given by −10 log_10 δ. As expected, an increase in the swap-out probability leads to an increase in the squeezing thresholds. For an experimentally accessible [48] squeezing value of 15 dB, our simulations suggest a swap-out threshold of p_0 ≈ 0.133. We note that the noise variance (δ) tolerance of our decoder is markedly better for p_0 ≲ 0.19 than for values nearer the swap-out threshold. Understanding this behaviour is an open problem, with one possible reason being that the inner and outer decoders we are using for the current simulations might be sub-optimal for this regime. Therefore, to investigate this phenomenon further, we should compare our decoding strategy with e.g. maximum-likelihood decoding, in order to ascertain whether the sharp decrease in performance is a fundamental property or an artifact of our decoding strategy.
We leave this analysis for future work.
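The statement that the threshold is the point where curves for different d intersect can be illustrated with a simple crossing finder. This is a simplified stand-in based on linear interpolation between grid points, not the fitting procedure of [122] used for the reported estimates; the function name and the synthetic data are hypothetical.

```python
def crossing_point(xs, ys_small_d, ys_large_d):
    """Estimate where two failure-rate curves sampled on the grid xs cross.

    Returns the interpolated x value of the first sign change of the
    difference between the curves, or None if they do not cross.
    """
    diffs = [a - b for a, b in zip(ys_small_d, ys_large_d)]
    for i in range(len(xs) - 1):
        g0, g1 = diffs[i], diffs[i + 1]
        if g0 == 0:
            return xs[i]
        if g0 * g1 < 0:
            t = g0 / (g0 - g1)  # fraction of the interval where the gap vanishes
            return xs[i] + t * (xs[i + 1] - xs[i])
    if diffs and diffs[-1] == 0:
        return xs[-1]
    return None

# Synthetic curves y = x and y = 2x - 1, which cross at x = 1.
grid = [0.0, 0.5, 1.5, 2.0]
small_d = [x for x in grid]
large_d = [2 * x - 1 for x in grid]
estimate = crossing_point(grid, small_d, large_d)
```

With noisy Monte Carlo data, pairwise crossings for different d scatter around the true threshold, which is one reason a joint fit over all curves is generally preferred in practice.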
Previous works [6,46,47] have studied the error threshold of the RHG cluster state model when qubits are erased with some probability. This is a natural noise model in optics and bears some resemblance to our model, as one assumes that the locations of the erasures are known. The relationship between the erasure threshold and the Z error threshold was found to be approximately linear [46], and there is a fundamental erasure threshold of 0.249, which is set by percolation theory [161]. It is difficult to directly compare our results with those of [46] because of the differences in the noise models. However, our swap-out threshold is close to the percolation theory erasure threshold, and it is natural to ask whether we can increase the swap-out threshold beyond the erasure threshold by further optimizing our decoder. There are many ways we could improve our current decoder (see Section 7), so, unless there is a fundamental limit due to percolation of swap-outs, we are hopeful that we can surpass the erasure threshold. In addition, the question as to whether our decoder has an advantage over the equivalent erasure decoder for finite values of Δ_dB remains open and could be a subject of future work.

Open Problems
Thus far, we have discussed the theoretical advances made in this architecture. As we detail in the next section, this architecture can provide important advantages over other architectures and platforms. However, there are many ways in which the current architecture can be improved. Here we list some open questions and suggested routes for further improvement of the architecture.
The open problems belong broadly in three categories: hardware-focused improvements to the architecture, better encoding, and better decoding strategies. From the hardware perspective, one of the important open problems is to devise a passive implementation of the hybrid architecture. In the current architecture, the CZ gates on the computational module are active transformations, i.e., they require in-line squeezing and displacements. If such active transformations can be replaced by passive transformations, then this could further simplify the computational module and thus reduce the experimental requirements. Possible paths to explore in this direction could be to obtain a hybrid version of macronode-based architectures reviewed in [14] or perhaps to use techniques demonstrated in [162]. Another challenge is that of reliable state preparation. In particular, a key experimental goal highlighted by this work is to gain access to high-quality GKP qubits whose teeth are squeezed at or exceeding the 15 dB level. This motivates the improvement of existing methods for state generation with GBS devices, which previously considered states with up to 10 dB of per-peak squeezing [27,29,30,163]. Reaching 15 dB states with the techniques from Refs. [27,29,30,163] will involve going to higher-order truncations of the Fock basis, which is computationally more demanding. This motivates the development of new numerical simulation techniques, or the application of breeding or distillation protocols to lower quality states to reach the 15 dB level. Progress in these directions is critical to accurate resource estimates for photonic quantum computing. Though the present architecture only couples GBS devices to optical switches, we leave open the possibility of using any surplus GKP qubits to check and improve the quality of outputs, for example, by using them as flag qubits [164]. 
Overhead, particularly the required number of bosonic qubit states, can be reduced by using foreknowledge about the quantum computation. For example, we expect there to be little (if any) benefit to having bosonic qubits in place of squeezed states at the q-measured nodes described in Section 5.
From the hardware perspective, yet another important task is to develop more realistic yet tractable noise models. Our main focus here was on a Gaussian noise model for state preparation, in tandem with the non-deterministic nature of the bosonic qubit sources. Further analysis is required to incorporate the non-Gaussian features of approximated states output by the GBS devices, and to model other noise sources arising, for example, from the CZ gates, the homodyne detectors, and transmission losses. We anticipate that this task might be simplified in moving to a macronode-based approach, wherein the noise from CZ gates does not enter into the picture at all.
The second category of open problems deals with better strategies for encoding logical qubits in our architecture. This includes strategies for choosing the best qubit encodings, i.e., about the choice of the inner encoding. The current architecture relies heavily on features of the GKP encoding, including the fact that GKP qubits can be entangled with squeezed-light modes via the same optical elements as those that entangle them with other GKP qubits, thus opening the possibility of performing swap-outs. The choice of GKP encoding is also motivated by its near-optimal loss tolerance. That said, it might be possible to use other encodings if they could provide these advantages and overcome some of the shortcomings of GKP qubits (the primary being the low probabilities of state generation or equivalently large multiplexing requirements). In particular, other bosonic encodings might be considered if they are compatible with swap-outs. A more general question that has yet to be explored is the viability of employing other hybrid resource states comprised of different types of bosonic encoded qubits.
While the current work focuses on the RHG lattice as a paradigmatic example of an outer qubit code, significant benefit can be expected by moving to other outer encodings. In particular, as we learn more about the structure of noise in realistic GKP-based computation, this opens the possibility of devising better outer encodings that are tailored for the specific noise structure. Furthermore, noise-tailoring along the lines of [90] may provide substantial enhancement to the thresholds obtained in our hybrid architecture.
The final set of open problems that we discuss are related to better methods for decoding in the current architecture. One question is about the possibility of obtaining a further advantage from accounting for real-valued homodyne outcomes. Although our inner decoder is exploiting the structure of the CV noise, there is still more information that could in principle be exploited, for example, at the level of the outer decoder. It may be possible to use ideas from analog quantum error correction [23] and maximum-likelihood decoding [89] to further reduce our squeezing thresholds.
The question of optimal methods for decoding is also closely related to more fundamental questions related to swap-out-based resource states. Specifically, what is the fundamental swap-out threshold of our architecture? An alternative to swap-outs (i.e., replacing the GKP-node no-shows with squeezed light modes) is to treat the no-shows directly as erasure errors, but for these the threshold is set by percolation theory to be around 24.9%. Is the swap-out threshold higher than the erasure threshold set by percolation theory [46,161], and in which regions of (δ, p_0) parameter space is it beneficial to have swap-outs rather than erasures? With the current decoders, we obtained a swap-out threshold of around 23.6%, which likely can be improved substantially as numerous upgrades can be made to our inner and outer decoders. We expect that development of an inner decoder that can treat more complicated arrangements of p-squeezed states in the lattice will provide fewer errors in the readout of qubit outcomes. Furthermore, we expect improvements to the outer decoder by using a more sophisticated method for assigning weights that takes the structure of the inner decoder into account.
A direction that becomes especially relevant to our photonics-based approach is that of developing fast decoders. As we detail in the next section, the clock speeds of our architecture could be as fast as GHz. However, to exploit these fast time scales, we need to develop efficient methods of decoding real-valued homodyne signals that need to be processed in order to change the homodyne local-oscillator phase as required by the logical computation. We are confident that solving these open problems will further augment the advantages offered by this architecture and bring us closer to a scalable fault-tolerant quantum computer.

Summary and Technological Advantages
We have proposed a concrete and scalable architecture for quantum computing with light. By using a hybrid resource state that can be generated and manipulated using near-future photonic technology, our architecture synthesizes modern techniques in scalable entangled resource state generation and bosonic codes. This "best of both worlds" hybrid approach comes with a novel error structure that arises from the Gaussian model of state imperfections and the use of probabilistic bosonic qubit sources. Numerical results show that such errors can be handled by our tailored two-tier decoder that makes use of continuous- and discrete-variable syndrome data. We find that fault-tolerant quantum computation is possible in the regime where the swap-out probability (the likelihood that any given bosonic qubit source failed and the input was swapped with a squeezed state) is smaller than ≈ 23.6%. For an experimentally accessible squeezing value of 15 dB, our numerical results suggest that the maximum tolerable swap-out probability is ≈ 13.3%. This level of squeezing has been attained in free-space implementations [48], while the current state-of-the-art level of squeezing demonstrated in integrated sources is 8 dB [165]; it remains an open experimental problem to match the level of squeezing in integrated sources to the free-space record. In the remainder of this section, we discuss some technological aspects of the architecture.
Our architecture provides several important technological advantages over competing architectures on photonic and other platforms. These include its modular nature, minimal cryogenic requirements, and fast clock speeds. The first point deals with modularity: the various aspects of computation (namely state preparation, multiplexing, cluster state generation, and measurement) impose different hardware requirements. The distinctions between these requirements allow for a modular design in which different chips are given different tasks. For instance, encoded bosonic states can be generated using non-reconfigurable circuits with low-loss interconnects leading to PNR detectors, or perhaps even low-loss connections to PNR detectors that are integrated on the same chip. The stitching of the cluster can also be performed on a non-reconfigurable chip. The measurements on the generated cluster require reconfigurable homodyne detection fed-forward from measurements on other homodyne detectors.
Our architecture also poses minimal cryogenic requirements. The state generation chips require low-loss non-reconfigurable circuits, which motivates on-chip PNR detectors with entire chips placed in a cryogenic environment until room-temperature PNR technology is available [166]. For this purpose, a static integrated platform will suffice. The entire remaining architecture can operate at room temperature. Specifically, the switching network, required in the state-generation part to boost the production rate of GKP states, can be maintained at room temperature, thus exploiting any delays introduced in extracting the light out of the cryostat. The cluster manipulation requires reconfigurable homodyne detection and delay lines to enable feed-forward, all at room temperature. The generation and manipulation of the cluster can thus be performed in a scalable and integrated manner.
The cryogenic requirements of our architecture are modular: we do not need the extensive connectivity seen in other photonic architectures and in other platforms entirely. While other platforms require custom-built 'jumbo' cryostats [167], our architecture imposes no such requirements, as it can make use of small, commercially available cryogenic technology. This can provide a significant advantage in the cost and reliability of our quantum computing architecture.
The final technological advantage of the architecture is that it allows for the fastest clock speeds among existing quantum computing architectures, which could enable very low-loss delay lines on the chip. Unless a slower process is present in the final generation procedure, the timescale of the cluster generation and manipulation is ultimately set by the timescales of homodyne detection. This is a positive feature, as homodyne detection can be much faster than the PNR detectors used in the multiplexing procedure or the threshold detectors used in other photonic encodings. Faster time scales mean that the cluster-generation delay lines are shorter and thus incur lower losses. We expect that these massive technological advantages will make photonics the leading platform for building a fault-tolerant quantum computer.
A Noise Model for a Hybrid RHG Lattice Operating as a Memory
In this Appendix, we provide the full details of the error model summarized in Section 4.1, which we in turn used to justify our choice of inner decoder.

A.1.1 Additive Gaussian Noise Channel
The cluster state that the hardware generates will be populated by two kinds of states, as mentioned in the main text: the $|+_{\rm gkp}\rangle$ state and the momentum-squeezed state. Since the computer is operating in memory mode, we do not need to consider magic states as one of the possible states prepared. The position wavefunction of the ideal GKP state is [20]

$$|+_{\rm gkp}\rangle = \sum_{n=-\infty}^{\infty} |n\sqrt{\pi}\rangle_q ,$$

where $|\cdot\rangle_q$ corresponds to a position eigenstate. To model the state initialization error, we apply the single-mode additive Gaussian noise channel given by [98]

$$\mathcal{N}_Y(\hat\rho) = \int \frac{d^2\xi}{2\pi\sqrt{\det Y}}\, \exp\!\left[-\tfrac{1}{2}\,\xi^T Y^{-1} \xi\right] \hat D(\xi)\,\hat\rho\,\hat D(\xi)^\dagger , \qquad (14)$$

where $Y \geq 0$ is the noise matrix, applied independently on each mode depending on the state that populates it. The Weyl-Heisenberg displacement operator is defined as $\hat D(\xi) = \exp[i \xi^T \Omega \hat r]$, where $\xi = (\xi_q, \xi_p)^T \in \mathbb{R}^2$ for a single mode, $\Omega$ is the anti-symmetric symplectic metric, and $\hat r = (\hat q, \hat p)^T$. For the GKP states and the momentum states, the corresponding noise matrices are chosen as

$$Y_{\rm gkp} = \frac{\delta}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad Y_p = \frac{1}{2}\begin{pmatrix} \delta^{-1} & 0 \\ 0 & \delta \end{pmatrix}. \qquad (15)$$

In other words, we start with ideal $|+_{\rm gkp}\rangle$ states and apply either the noise channel $\mathcal{N}_{Y_{\rm gkp}}$ or $\mathcal{N}_{Y_p}$ with probabilities $1 - p_0$ and $p_0$, respectively. Note that we know what the state is at each site, so we know which noise channel was applied. For GKP states, although this noise model does not capture the damping of peaks due to finite energy seen in Eq. (2), it captures the broadening of peaks that results from finite-energy effects [1,7]. Similarly, this method approximates the realistic momentum state well in the position basis but gives it a periodic structure in momentum space. These points can be viewed transparently through the Wigner picture. In both cases, the application of a noise channel renders the output states mixed.

A.1.2 The Wigner Picture
The Wigner function for ideal GKP states consists of a linear combination of two-dimensional δ-functions in phase space [20]:

$$W_{|+_{\rm gkp}\rangle}(q, p) \propto \sum_{s,t=-\infty}^{\infty} (-1)^{st}\, \delta\!\left(q - \frac{s\sqrt{\pi}}{2}\right) \delta\!\left(p - t\sqrt{\pi}\right). \qquad (16)$$

Note that the lattice spacing of the Dirac-delta peaks in the momentum direction is twice that of the position direction in the Wigner picture. For a clear diagram of the phase space unit cell distribution, see Fig. 1a of [99]. Treating each δ-function as a Gaussian of infinitely small width in phase space, we see that the effect of the noise channel is to replace the δ-functions with Gaussian distributions with covariance $Y_{\rm gkp}$, by using Eq. (14). Thus, the linear combination of δ-functions is mapped to a linear combination of Gaussian functions centred at the same points in phase space and with the same weights in the linear combination of Eq. (16).
With regard to the momentum states in the RHG lattice, consider the same noise model of Eq. (14), now instead with covariance

$$Y_p(\epsilon, \delta) = \frac{1}{2}\begin{pmatrix} \epsilon^{-1} & 0 \\ 0 & \delta \end{pmatrix}.$$

Returning again to the Wigner function picture for the $|+_{\rm gkp}\rangle$ state, this noise replaces the δ-functions with Gaussians of covariance $Y_p(\epsilon, \delta)$. In the limit that $\epsilon \to 0$, we see that for odd values of $t$, these Gaussians in phase space cancel each other out, while for even values of $t$, the Gaussians add together. This is due to the phase factor $(-1)^{st}$ in Eq. (16). The resulting phase space distribution is a periodic mixture of p-quadrature eigenstates, each separated by $2\sqrt{\pi}$, and each with a Gaussian noise of variance $\delta/2$ applied in the p-quadrature. That is, the noise channel $\mathcal{N}_{Y_p(\epsilon, \delta)}$ turns $|+_{\rm gkp}\rangle$ into a classical mixture of noisy p-squeezed states. Since we are examining the regime where $\delta$ is small, we set $\delta = \epsilon$, which leads to $\epsilon^{-1}$ being large as needed for the states to approach p-squeezed states. While it is the case that this noise model for momentum-squeezed states returns a Wigner function which still, in principle, has a periodic structure, we do not expect it to positively bias the decoding procedure. The periodic structure in position space is essentially washed away by the broadness of the envelope of order $\delta^{-1}$, while the discrete $2\sqrt{\pi}$ translational symmetry in momentum introduces a mixture of momentum eigenstates, and hence more noise than a pure momentum-squeezed state.
The initialization step for all $N$ nodes can therefore be written compactly in one equation as

$$\mathcal{N}_{\Sigma_0}(\hat\rho) = \int \frac{d^{2N}\xi}{(2\pi)^{N}\sqrt{\det \Sigma_0}}\, \exp\!\left[-\tfrac{1}{2}\,\xi^T \Sigma_0^{-1} \xi\right] \hat D(\xi)\,\hat\rho\,\hat D(\xi)^\dagger ,$$

where $\xi = (\xi_q, \xi_p)^T = (\xi_{q_1}, \cdots, \xi_{q_N}, \xi_{p_1}, \cdots, \xi_{p_N})^T \in \mathbb{R}^{2N}$, and $\hat r = (\hat q_1, \ldots, \hat q_N, \hat p_1, \ldots, \hat p_N)^T$, corresponding to $N$ modes. Here, $\Sigma_0$ is a direct sum of matrices, where the $i$-th matrix in the direct sum is either of the form $Y_{\rm gkp}$ or $Y_p$ depending on whether the $i$-th mode is a GKP or a p-squeezed state. In other words, in the ordering of $\hat r$ above,

$$\Sigma_0 = \begin{pmatrix} \Sigma_x & 0 \\ 0 & \frac{\delta}{2}\mathbb{1} \end{pmatrix},$$

where $\Sigma_x$ is a diagonal matrix with elements $\frac{1}{2\delta}$ or $\frac{\delta}{2}$ depending on whether the mode is p-squeezed or GKP, respectively.
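The cancellation for odd $t$ can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes that, at fixed $t$, the broadened Wigner function has a position profile proportional to $\sum_s (-1)^{st} G(q - s\sqrt{\pi}/2)$, where $G$ is a normalized Gaussian of variance $1/(2\epsilon)$. The function name `q_envelope_sum` and the truncation parameter `s_max` are hypothetical.

```python
import math


def q_envelope_sum(q, t, epsilon, s_max=400):
    """Sum over s of (-1)^(s*t) times a normalized Gaussian centred at s*sqrt(pi)/2.

    The Gaussian variance 1/(2*epsilon) is the assumed q-variance of the
    noise matrix; the infinite sum is truncated at |s| <= s_max.
    """
    spacing = math.sqrt(math.pi) / 2.0
    variance = 1.0 / (2.0 * epsilon)
    norm = 1.0 / math.sqrt(2.0 * math.pi * variance)
    total = 0.0
    for s in range(-s_max, s_max + 1):
        # Phase factor (-1)^(s*t): alternates in s only when t is odd.
        sign = -1.0 if (s * t) % 2 else 1.0
        total += sign * norm * math.exp(-((q - s * spacing) ** 2) / (2.0 * variance))
    return total
```

For small $\epsilon$ (for instance `epsilon=0.05`), the sum for odd `t` is numerically indistinguishable from zero, while for even `t` it approaches the constant $2/\sqrt{\pi}$, independent of `q`. This is the flat position envelope described above.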
There are several reasons to model the state preparation error with the noise channel described in Eqs. (14) and (15). For one, there are many physical gates that use a measurement-based squeezing operation [168][169][170] that naturally leads to imperfections modeled as the classical noise channel. Furthermore, this type of noise is closely related to pure loss: following a pure-loss channel with an amplifier of the inverse strength leads to a classical noise channel [171][172][173][174]. In settings where loss can be treated this way, such as in measurement imperfections, this relationship would play an important role.
The classical noise channel is easily described in the Heisenberg picture, so we use this representation in our simulations. Let us consider the quadrature operators $\hat r$ of the $N$ modes. The noise channel on each mode can be described as

$$\hat r \to \hat r + \xi ,$$

where $\xi$ is a vector of random variables drawn from the zero-mean normal distribution with covariance $\Sigma_0$ associated with the state initialization errors. A final note is that we assume that the CZ gates and the measurement procedure are noiseless. Imperfections in both these modules are likely to reduce the error threshold. Inefficiencies in the measurement outcomes can be modeled as a lossy channel and can be converted into a classical noise channel by virtually applying an amplifier that would affect the measurement readout. Similarly, classical noise channels in the CZ gates can also be tracked due to the Gaussian nature of the noise. However, for simplicity of the presentation, we leave the analysis of this case and possibly more complicated noise models to future work.

A.2 Propagation of Noise in the Cluster State Preparation
For each node, with probability $p_0$ we prepare a momentum-squeezed state, and with probability $1 - p_0$ we prepare a $|+_{\rm gkp}\rangle$ state. Next in our model, we apply CZ gates perfectly according to the structure of the cluster state, i.e., apply CZ gates to each pair of qubits connected by an edge. We invert some of the CZ gates to match the CV toric code convention [175]. Since we are operating in memory mode, no further gates are applied before the p-homodyne measurements.
The symplectic transformation for a CZ gate defined as $\exp[i \hat q_1 \hat q_2]$ in the $(q_1, q_2, p_1, p_2)$ basis ordering is given by

$$S_{\rm CZ} = \begin{pmatrix} \mathbb{1} & 0 \\ A & \mathbb{1} \end{pmatrix}, \qquad A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

where $A$ is a 2×2 adjacency matrix. This motivates the symplectic matrix that links all $N$ optical modes into an RHG lattice as

$$S_{\rm RL} = \begin{pmatrix} \mathbb{1} & 0 \\ A_{\rm RL} & \mathbb{1} \end{pmatrix}, \qquad (22)$$

where $A_{\rm RL}$ is the $N \times N$ matrix with 1 at position $(i, j)$ if two nodes are entangled with a CZ gate and 0 otherwise, with a suitable parity function dictated by the toric code convention. $A_{\rm RL}$ corresponds to the links depicted in Figs. 2 and 5. It is also instructive to look at the effect of the cluster state preparation on the noise matrix. Writing the initial noise matrix in block form, with position block $\Sigma_x$ and momentum block $\frac{\delta}{2}\mathbb{1}$, the full noise matrix evolves under the symplectic transformation of all the CZ gates to

$$\Sigma = S_{\rm RL}\, \Sigma_0\, S_{\rm RL}^T = \begin{pmatrix} \Sigma_x & \Sigma_x A_{\rm RL}^T \\ A_{\rm RL} \Sigma_x & \frac{\delta}{2}\mathbb{1} + A_{\rm RL} \Sigma_x A_{\rm RL}^T \end{pmatrix}.$$

Since we are mainly concerned with the momentum homodyne measurement values, the momentum component of the covariance matrix is useful to write down for subsequent sections. To achieve this, we trace out the position degrees of freedom of the covariance matrix of the noise channel to obtain

$$\Sigma_p = \frac{\delta}{2}\mathbb{1} + A_{\rm RL} \Sigma_x A_{\rm RL}^T . \qquad (25)$$
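The propagation of the noise matrix through the CZ gates can be illustrated on a toy graph. The sketch below is not the simulation code behind the reported results. It assumes the block forms $\Sigma_0 = \Sigma_x \oplus \frac{\delta}{2}\mathbb{1}$ and $S = \begin{pmatrix} \mathbb{1} & 0 \\ A & \mathbb{1} \end{pmatrix}$ in the $(q_1, \ldots, q_N, p_1, \ldots, p_N)$ ordering, uses plain Python lists to stay dependency-free, and the helper names (`matmul`, `transpose`, `momentum_noise_matrix`) are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def momentum_noise_matrix(adjacency, is_p_squeezed, delta):
    """Return the momentum block of S * Sigma_0 * S^T for a given graph.

    adjacency: N x N list of lists with entries 0 or +/-1 (CZ links).
    is_p_squeezed: list of N booleans (True for a p-squeezed node).
    delta: noise parameter shared by all modes (assumption: delta = epsilon).
    """
    n = len(adjacency)
    # Sigma_0: q-variance 1/(2*delta) for p-squeezed nodes, delta/2 for GKP
    # nodes; p-variance delta/2 for every node.
    sigma0 = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        sigma0[i][i] = 1.0 / (2.0 * delta) if is_p_squeezed[i] else delta / 2.0
        sigma0[n + i][n + i] = delta / 2.0
    # Symplectic matrix S = [[1, 0], [A, 1]] of the CZ network.
    s = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(2 * n):
        s[i][i] = 1.0
    for i in range(n):
        for j in range(n):
            s[n + i][j] = float(adjacency[i][j])
    sigma = matmul(matmul(s, sigma0), transpose(s))
    # Tracing out the positions keeps only the lower-right (momentum) block.
    return [row[n:] for row in sigma[n:]]
```

As a usage example, a three-node path whose middle node is p-squeezed, `momentum_noise_matrix([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [False, True, False], 0.1)`, yields large and strongly correlated momentum noise on the two outer nodes, while the middle node only picks up the small position noise of its GKP neighbors.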

A.3 Probability Distribution in Momentum Space
So far we have only focused on the noise model and the correlated noise matrix that one obtains once all the CZ gates have been applied to the initial states in each mode. We now detail the connection between the noise matrix obtained in Eq. (14) and the homodyne distribution. Let us define the unitary corresponding to the symplectic transformation in Eq. (22) as $\hat U_{\rm RL}$. Since the preparation of the RHG cluster state and the noise channel on the initial states are both Gaussian, we can conjugate $\hat U_{\rm RL}$ through the noise channel to obtain

$$\hat U_{\rm RL}\, \mathcal{N}_{\Sigma_0}(\hat\rho)\, \hat U_{\rm RL}^\dagger = \mathcal{N}_{\Sigma}\!\left(\hat U_{\rm RL}\, \hat\rho\, \hat U_{\rm RL}^\dagger\right), \qquad \Sigma = S_{\rm RL}\, \Sigma_0\, S_{\rm RL}^T .$$

This corresponds to taking the ideal state of the RHG lattice, had all the GKP states been initialized perfectly without noise, and then applying a correlated multimode Gaussian noise channel with covariance $\Sigma$.
To understand the probability distribution produced by conjugating the unitary through the Gaussian noise channel, we first show that the probability distribution in p-quadrature of a state under a Gaussian noise channel is given by the convolution of the noiseless probability distribution with the marginal Gaussian distribution of the noise channel along the p-quadrature.
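Consider a Gaussian random displacement channel applied to a state:

$$\mathcal{N}_{\Sigma}(\hat\rho) = \int \frac{d^{2N}\xi}{(2\pi)^{N}\sqrt{\det \Sigma}}\, \exp\!\left[-\tfrac{1}{2}\,\xi^T \Sigma^{-1} \xi\right] \hat D(\xi)\,\hat\rho\,\hat D(\xi)^\dagger . \qquad (27)$$

Next we note we can break the displacement into displacements $\hat X(\cdot)$ and $\hat Z(\cdot)$ along q and p in phase space, respectively, along with a phase factor:

$$\hat D(\xi)\, |p\rangle = e^{i \xi_x \cdot \xi_p / 2}\, \hat X(\xi_x)\, \hat Z(\xi_p)\, |p\rangle = e^{-i \xi_x \cdot \xi_p / 2}\, e^{-i \xi_x \cdot p}\, |p + \xi_p\rangle .$$

Thus:

$$\langle p |\, \hat D(\xi)\, \hat\rho\, \hat D(\xi)^\dagger\, | p \rangle = \langle p - \xi_p |\, \hat\rho\, | p - \xi_p \rangle .$$

Putting these equations together, we find:

$$\langle p |\, \mathcal{N}_{\Sigma}(\hat\rho)\, | p \rangle = \int \frac{d^{2N}\xi}{(2\pi)^{N}\sqrt{\det \Sigma}}\, e^{-\frac{1}{2} \xi^T \Sigma^{-1} \xi}\, \langle p - \xi_p |\, \hat\rho\, | p - \xi_p \rangle = \int \frac{d^{N}\xi_p}{(2\pi)^{N/2}\sqrt{\det \Sigma_p}}\, e^{-\frac{1}{2} \xi_p^T \Sigma_p^{-1} \xi_p}\, \langle p - \xi_p |\, \hat\rho\, | p - \xi_p \rangle ,$$

where in the last step we performed the Gaussian integral over $\xi_x$, and $\Sigma_p$ denotes the momentum block of $\Sigma$.
Therefore, returning to the probability distribution in p-space of our hybrid lattice, we find

$$P(p) = \int \frac{d^{N}\xi_p}{(2\pi)^{N/2}\sqrt{\det \Sigma_p}}\, \exp\!\left[-\tfrac{1}{2}\,\xi_p^T \Sigma_p^{-1} \xi_p\right] \left|\psi_{\rm RL}(p - \xi_p)\right|^2 , \qquad (32)$$

where $\psi_{\rm RL}(p)$ is the wavefunction in p-space of the ideal RHG cluster state and $\Sigma_p$ was defined in Eq. (25). We know that $|\psi_{\rm RL}(p)|^2$ consists of a lattice in p-space, where each point of the lattice is located at $n\sqrt{\pi}$, where $n$ is an integer-valued $N$-component vector chosen from a set dictated by the ideal qubit state of the RHG lattice. The addition of the Gaussian noise channel broadens each of these lattice points into Gaussian functions with covariance $\Sigma_p$. Therefore, we can interpret the homodyne momentum outcomes as ideal lattice points displaced by noise sampled from the Gaussian distribution with covariance $\Sigma_p$, as in Eq. (32). Given the measurement outcomes, we then apply a classical decoder to these values to obtain the net recovery operation.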
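The sampling interpretation can be sketched in a few lines. The code below is an illustration under stated assumptions, not the simulation used for the reported thresholds: initialization noise is drawn independently per mode (q-variance $1/(2\delta)$ for p-squeezed nodes and $\delta/2$ for GKP nodes, p-variance $\delta/2$ for all nodes), the CZ gates add the neighbors' position noise to each momentum, and plain GKP binning stands in for the inner decoder. The function names and the `ideal_n` input are hypothetical.

```python
import math
import random

SQRT_PI = math.sqrt(math.pi)


def sample_p_homodyne(adjacency, is_p_squeezed, ideal_n, delta, rng):
    """Sample one round of p-homodyne outcomes for a small cluster.

    adjacency: N x N list of lists with entries 0 or +/-1 (CZ links).
    ideal_n: integers n_i, so that the ideal outcome of mode i is n_i*sqrt(pi).
    rng: a random.Random instance, passed in for reproducibility.
    """
    n_modes = len(adjacency)
    # Initialization noise in the Heisenberg picture: r -> r + xi.
    xi_q = [rng.gauss(0.0, math.sqrt(1.0 / (2.0 * delta) if is_p_squeezed[i]
                                     else delta / 2.0))
            for i in range(n_modes)]
    xi_p = [rng.gauss(0.0, math.sqrt(delta / 2.0)) for _ in range(n_modes)]
    outcomes = []
    for i in range(n_modes):
        # CZ gates map p_i -> p_i + sum_j A_ij q_j, so q-noise leaks into p.
        propagated = sum(adjacency[i][j] * xi_q[j] for j in range(n_modes))
        outcomes.append(ideal_n[i] * SQRT_PI + xi_p[i] + propagated)
    return outcomes


def bin_outcomes(outcomes):
    """Standard GKP binning: parity of the nearest multiple of sqrt(pi)."""
    return [int(math.floor(p / SQRT_PI + 0.5)) % 2 for p in outcomes]
```

Sampling the initial noise and propagating it through the adjacency matrix is equivalent to drawing the momentum noise directly from a Gaussian with covariance $\Sigma_p$, but avoids a matrix factorization. For an all-GKP cluster with small $\delta$, the binned bits reproduce the parities of `ideal_n` with overwhelming probability.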

B Optical Components for GKP Qubit Operations
A primary advantage of the GKP qubit encoding is the fact that Clifford gates and measurements correspond to CV Gaussian gates and measurements. In Fig. 16, we provide optical circuits for the application of GKP Clifford gates and measurements in an optical setting. These circuits present how the gates would be implemented in a circuit-based setting. In contrast, the actual gates on our physical qubits are implemented in a measurement-based manner, and hence their implementation would only involve performing homodyne measurements on the computational resource state. Thus, this section is included for completeness of the background material and to demonstrate the accessibility of Gaussian resources in optics, rather than as the actual implementation of how the gates would be performed in our architecture.
For the non-Clifford T gate, non-Gaussian CV gates are required. In Fig. 17, we provide an optical circuit for T gate application via gate teleportation using a GKP magic state as a resource. In [30], it was observed that magic states and Pauli basis states are comparably resource-intensive to produce using GBS state preparation.
Figure 16: A review of optical implementations of the gates and measurements required for Clifford operations in the GKP encoding, including limits required to achieve ideal, perfect CV gate application. (a) A general displacement module [176]. Displacement by √π in q (p) corresponds to a GKP qubit Pauli X (Z) gate. (b) Rotation module as performed by e.g. an optical thermoelectric heating element. φ = π/2 corresponds to the CV Fourier transform as well as the GKP-qubit Hadamard gate. (c) Homodyne measurement module. Changing the rotation φ changes the axis in phase space along which the measurement is performed. φ = 0 (π/2) corresponds to q (p) homodyne measurement, which is the GKP qubit Pauli Z (X) measurement. (d) Measurement-based squeezing module [168]. On-demand, in-line squeezing is in general required for implementing CV quadratic phase and Controlled-X/phase gates, and a measurement-based approach allows for offline preparation of squeezed resource state. (e) Quadratic phase gate module [177]. s = ±1 corresponds to the GKP qubit phase gate. (f) CV CZ gate module. s = ±1 corresponds to the GKP qubit CZ gate. Application of π/2 rotations on the second mode before and after the CZ gate implements a CV CX gate [177] with Target state 1 becoming the control and Target state 2 becoming the target, and thus a GKP qubit CNOT gate.
Figure 17: Optical implementation of the GKP qubit T gate up to global phase, following the method from [20]. Here, in the ideal limit, $|M\rangle = e^{-i\pi/8} |+_{\rm gkp}\rangle + e^{i\pi/8} |-_{\rm gkp}\rangle$, and the feedforward phase gate is applied if the ancillary mode detects $|-_{\rm gkp}\rangle$ via a qubit X measurement (CV p homodyne).

C Heuristic Weights for the Outer Decoder
In the outer decoder algorithm detailed in Section 4.3 and applied in Section 6.1, we have the opportunity to assign different weights in the outer decoder for the RHG lattice depending on the expected error at a site.
In the most naïve approach, one could simply use the marginal probability that a node undergoes a phase flip to determine the weight; however, this does not take into account the correlated phase flips that we expect to see from replacing some nodes with p-squeezed states, as we have discussed previously. For instance, neither applying a ring of four Z gates nor applying the identity results in any error in the decoding, but if we simply looked at the marginal probabilities of the sites in the ring, we might assume that each site has an independent probability of incurring a phase flip, which would result in a pessimistic weight assignment. While a full analysis of correlated errors and weight assignments for general configurations of p-squeezed states and GKP states is left to future work, the heuristic choice for weight assignments given by Eq. (12) can be found by considering a simple configuration. Amazingly, this choice of weights already provides a significant improvement over the use of the marginal probability of error at each site. Here, we detail the motivation for this choice of heuristic weights.
Consider a single node $e_0$ in the RHG lattice surrounded by four neighbors $e_1, e_2, e_3, e_4$, which can be either GKP or p-squeezed, so that $e_0$ can have anywhere from 0 to 4 p-squeezed states as neighbors. For simplicity, we will assume that the next layers of neighbors of $e_1, e_2, e_3, e_4$ are GKP states. See Figs. 2 and 5 for a visualization of the lattice configuration. Whether $e_0$ is GKP or p-squeezed does not impact the following argument. Additionally, we will assume the limit of infinite squeezing ($\delta \to 0$) for all the sites. Note that these assumptions are used to select a choice of weights, not to run the actual simulation, so discrepancies between the assumptions for choosing weights and the actual parameters of the simulation will nonetheless result in a perfectly usable, albeit suboptimal, set of weights.
In this scenario, each node that is a momentum eigenstate imparts a random displacement (sampled from the uniform distribution of its q-quadrature, since we assumed $\delta \to 0$) onto the p-quadrature of its neighbors via the action of the CV CZ gates, while each node that is a perfect GKP state does not impart any random displacement onto its neighbors. Let the displacements from $e_1, e_2, e_3, e_4$ be $d_1, d_2, d_3, d_4$, where we specifically mean the excess displacement beyond $n\sqrt{\pi}$ [7]. Thus, the displacement on $e_0$ is given by $d_1 + d_2 + d_3 + d_4$, assuming the CZ gates are all $+1$; changing the sign of the CZ gates does not change the argument since $d_1, d_2, d_3, d_4$ are sampled from symmetric distributions. Moreover, let $b_i$, $i = 1, 2, 3, 4$, be the binary value returned by performing standard GKP binning on the homodyne output of the neighbors of node $e_i$ other than $e_0$; note that the only shared neighbor of $e_1, e_2, e_3, e_4$ is $e_0$, and since we assumed the nodes beyond $e_1, e_2, e_3, e_4$ were GKP, we know we will get the same single outcome $b_i$ on all neighbors of $e_i$ other than $e_0$. Let $b_0$ be the binary value returned by performing standard GKP binning on the homodyne output of $e_0$.
Consider now the scenarios that would cause an error at the level of the qubit decoder. We know that a closed ring of Z gates commutes with the stabilizers of the RHG lattice. If only one of $e_1, e_2, e_3, e_4$ is a momentum eigenstate, then we have already shown that the resulting effect on the binary outcomes $b_i$ is a ring of four Z gates around $e_i$, which we know causes no problem. If two or more of $e_1, e_2, e_3, e_4$ are momentum eigenstates, then we now have potential for error when using standard binning. In particular, for the readout to correspond to a closed ring of Z gates, we require that $(b_0 + b_1 + b_2 + b_3 + b_4) \bmod 2 = 0$. This condition will always be true if only one of $d_1, d_2, d_3, d_4$ is sampled from the uniform distribution (since we assumed $\delta \to 0$) while the others are zero, since $d_1 + d_2 + d_3 + d_4$ will then be equal to the only non-zero displacement. We find that if two, three, or four of $d_1, d_2, d_3, d_4$ are nonzero, then the condition $(b_0 + b_1 + b_2 + b_3 + b_4) \bmod 2 = 0$ is violated with probabilities of approximately 0.25, 0.33, and 0.40, respectively. This means that even with multiple nodes replaced by p-squeezed states, the probability of the resulting readout indicating a series of gates that do not commute with the stabilizer is less than 50%, which would be the marginal probability of a phase flip at each site. Finally, we use these probabilities of error to assign heuristic, relative weighting in the outer decoder.
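The violation probabilities quoted above can be checked with a short Monte Carlo sketch. It is an illustration under the assumptions of this appendix rather than the code used to produce the results: each p-squeezed neighbor contributes an excess displacement drawn uniformly from one $2\sqrt{\pi}$ period, GKP neighbors contribute no displacement (and hence a bit value of 0), all CZ signs are $+1$, and the names `gkp_bin` and `violation_probability` are hypothetical.

```python
import math
import random

SQRT_PI = math.sqrt(math.pi)


def gkp_bin(value):
    """Standard GKP binning: parity of the nearest multiple of sqrt(pi)."""
    return int(math.floor(value / SQRT_PI + 0.5)) % 2


def violation_probability(num_squeezed, samples=100_000, seed=0):
    """Estimate how often (b_0 + b_1 + ... + b_4) mod 2 = 0 is violated.

    num_squeezed: number of neighbors of e_0 that are momentum eigenstates.
    """
    rng = random.Random(seed)
    violations = 0
    for _ in range(samples):
        # Excess displacements d_i from the p-squeezed neighbors only.
        d = [rng.uniform(0.0, 2.0 * SQRT_PI) for _ in range(num_squeezed)]
        # b_i: bit read on the other neighbors of e_i, which see only d_i.
        b_neighbors = sum(gkp_bin(d_i) for d_i in d)
        # b_0: bit read on e_0, which sees the sum of all displacements.
        b_center = gkp_bin(sum(d))
        if (b_center + b_neighbors) % 2 != 0:
            violations += 1
    return violations / samples
```

Under these assumptions, `violation_probability(1)` is exactly zero, since $b_0$ and $b_1$ are then computed from the same displacement, and the estimates for two, three, and four p-squeezed neighbors land close to 0.25, 0.33, and 0.40.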