Optimal local unitary encoding circuits for the surface code

The surface code is a leading candidate quantum error correcting code, owing to its high threshold, and compatibility with existing experimental architectures. Bravyi et al. (2006) showed that encoding a state in the surface code using local unitary operations requires time at least linear in the lattice size $L$, however the most efficient known method for encoding an unknown state, introduced by Dennis et al. (2002), has $O(L^2)$ time complexity. Here, we present an optimal local unitary encoding circuit for the planar surface code that uses exactly $2L$ time steps to encode an unknown state in a distance $L$ planar code. We further show how an $O(L)$ complexity local unitary encoder for the toric code can be found by enforcing locality in the $O(\log L)$-depth non-local renormalisation encoder. We relate these techniques by providing an $O(L)$ local unitary circuit to convert between a toric code and a planar code, and also provide optimal encoders for the rectangular, rotated and 3D surface codes. Furthermore, we show how our encoding circuit for the planar code can be used to prepare fermionic states in the compact mapping, a recently introduced fermion to qubit mapping that has a stabiliser structure similar to that of the surface code and is particularly efficient for simulating the Fermi-Hubbard model.


Introduction
One of the most promising error correcting codes for achieving fault-tolerant quantum computing is the surface code, owing to its high threshold and low-weight check operators that are local in two dimensions [18,29]. The stabilisers of the surface code are defined on the faces and sites of an $L \times L$ square lattice embedded on either a torus (the toric code) or a plane (the planar code). The toric code encodes two logical qubits, while the planar code encodes a single logical qubit.
An important component of any quantum error correction (QEC) code is its encoding circuit, which maps an initial product state of $k$ qubits in arbitrary unknown states (along with $n - k$ ancillas) to the same state on $k$ logical qubits encoded in a quantum code with $n$ physical qubits. The encoding of logical states has been realised experimentally for the demonstration of small-scale QEC protocols using various codes [14,16,22,28,35,37,39,40,43,45,49,51,52]. However, one of the challenges of realising larger-scale experimental demonstrations of QEC protocols is the increasing complexity of the encoding circuits with larger system sizes, which has motivated the recent development of compiling techniques that reduce the number of noisy gates in unitary encoding circuits [56].
Oscar Higgott: oscar.higgott.18@ucl.ac.uk
Encoding circuits can also be useful for implementing fermion-to-qubit mappings [46], an important component of quantum simulation algorithms, since some mappings introduce stabilisers in order to mitigate errors [27] or enforce locality in the transformed fermionic operators [9,25,48,50]. Local unitary encoding circuits provide a method to initialise and switch between mappings without the need for ancilla-based stabiliser measurements and feedback.
The best known local unitary circuits for encoding an unknown state in the surface code are far from optimal. Bravyi et al. [7] showed that any local unitary encoding circuit for the surface code must take time that is at least linear in the distance $L$. However, the most efficient known local unitary circuit for encoding an unknown state in the surface code was introduced by Dennis et al. [18], and requires $\Omega(L^2)$ time to encode an unknown state in a distance $L$ planar code. Aguado and Vidal [1] introduced a Renormalisation Group (RG) unitary encoding circuit for preparing an unknown state in the toric code with $O(\log L)$ circuit depth, however their method requires non-local gates. More recently, Aharonov and Touati provided an $\Omega(\log L)$ lower bound on the circuit depth of preparing toric code states with non-local gates, demonstrating that the RG encoder is optimal in this setting [2], and an alternative approach for preparing a specific state in the toric code with non-local gates and depth $O(\log L)$ was recently introduced in Ref. [34]. Dropping the requirement of unitarity, encoders have been found that use stabiliser measurements [26,33,36] or local dissipative evolution [30], and it has been shown that local dissipative evolution cannot be used to beat the $\Omega(L)$ lower bound for local unitary encoders [31]. If only the logical $|0\rangle$ state is to be prepared, then stabiliser measurements [18] can be used, as well as optimal local unitaries that either use adiabatic evolution [24] or a mapping from a cluster state [12]. However, encoding circuits by definition should be capable of encoding an arbitrary unknown input state.
In this work, we present local unitary encoding circuits for both the planar and toric code that take time linear in the lattice size to encode an unknown state, achieving the Ω(L) lower bound given by Bravyi et al. [7]. Furthermore, we provide encoding circuits for rectangular, rotated and 3D surface codes, as well as a circuit that encodes a toric code from a planar code. Our circuits also imply optimal encoders for the 2D color code [32], some 2D subsystem codes [6,8] and any 2D translationally invariant topological code [6]. On many Noisy Intermediate-Scale Quantum (NISQ) [41] devices, which are often restricted to local unitary operations, our techniques therefore provide an optimal method for experimentally realising topological quantum order. Another advantage of using a unitary encoding circuit is that it does not require the use of ancillas to measure stabilisers, therefore providing a more qubit efficient method of preparing topologically ordered states (2× fewer qubits are required to prepare a surface code state of a given lattice size). Finally, we show how our unitary encoding circuits for the planar code can be used to construct O(L) depth circuits to encode a Slater determinant state in the compact mapping [19], which can be used for the simulation of fermionic systems on quantum computers.

Stabiliser codes
An $n$-qubit Pauli operator $P = \alpha P_n$, where $P_n \in \{I, X, Y, Z\}^{\otimes n}$, is an $n$-fold tensor product of single-qubit Pauli operators with the coefficient $\alpha \in \{\pm 1, \pm i\}$. The set of all $n$-qubit Pauli operators forms the $n$-qubit Pauli group $\mathcal{P}_n$. The weight $\mathrm{wt}(P)$ of a Pauli operator $P \in \mathcal{P}_n$ is the number of qubits on which it acts non-trivially. Any two Pauli operators commute if an even number of their tensor factors anti-commute, and anti-commute otherwise.
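The commutation rule and the weight can be checked with a few lines of Python. This is a minimal sketch in which a Pauli operator is written as a string over `I`, `X`, `Y`, `Z` and the coefficient $\alpha$ is ignored; the function names `paulis_commute` and `weight` are illustrative choices, not notation from the text.

```python
def paulis_commute(p: str, q: str) -> bool:
    """Return True if two n-qubit Pauli strings commute (coefficients ignored)."""
    if len(p) != len(q):
        raise ValueError("Pauli strings must act on the same number of qubits")
    # Two single-qubit Paulis anti-commute iff both are non-identity and different.
    anticommuting = sum(
        1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b
    )
    # The full operators commute iff an even number of factors anti-commute.
    return anticommuting % 2 == 0


def weight(p: str) -> int:
    """Number of qubits on which the Pauli string acts non-trivially."""
    return sum(1 for a in p if a != "I")


print(paulis_commute("XX", "ZZ"))  # two anti-commuting factors
print(paulis_commute("XI", "ZI"))  # one anti-commuting factor
print(weight("XIZY"))
```

Here `XX` and `ZZ` commute because both tensor factors anti-commute, whereas `XI` and `ZI` anti-commute because only one factor does.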
Stabiliser codes [23] are defined in terms of a stabiliser group $\mathcal{S}$, which is an abelian subgroup of $\mathcal{P}_n$ that does not contain the element $-I$. Elements of a stabiliser group are called stabilisers. Since every stabiliser group is abelian and Pauli operators have the eigenvalues $\pm 1$, there is a joint $+1$-eigenspace of every stabiliser group, which defines the stabiliser code.
The check operators of a stabiliser code are a set of generators of $\mathcal{S}$ and hence all measure $+1$ if the state is uncorrupted. Any check operator $M$ that anti-commutes with an error $E$ will measure $-1$ (since $ME|\psi\rangle = -EM|\psi\rangle = -E|\psi\rangle$). The centraliser $C(\mathcal{S})$ of $\mathcal{S}$ in $\mathcal{P}_n$ is the set of Pauli operators which commute with every stabiliser. If an error $E \in C(\mathcal{S})$ occurs, it will be undetectable. If $E \in \mathcal{S}$, then it acts trivially on the codespace, and no correction is required. However if $E \in C(\mathcal{S}) \setminus \mathcal{S}$, then an undetectable logical error has occurred. The distance $d$ of a stabiliser code is the smallest weight of any logical operator.
A stabiliser code is a Calderbank-Shor-Steane (CSS) code if there exists a generating set for the stabiliser group such that every generator is in $\{I, X\}^{\otimes n} \cup \{I, Z\}^{\otimes n}$.

The Surface Code
The surface code is a CSS code introduced by Kitaev [18,29], which has check operators defined on a square lattice embedded in a two-dimensional surface. Each site check operator is a Pauli operator in $\{I, X\}^{\otimes n}$ which only acts non-trivially on the edges adjacent to a vertex of the lattice. Each plaquette check operator is a Pauli operator in $\{I, Z\}^{\otimes n}$ which only acts non-trivially on the edges adjacent to a face of the lattice. In the toric code, the square lattice is embedded in a torus, whereas in the planar code the lattice is embedded in a plane, without periodic boundary conditions (see Figure 1). These site and plaquette operators together generate the stabiliser group of the code. While the toric code encodes two logical qubits, the planar code encodes a single logical qubit.

Encoding an unknown state
We are interested in finding a unitary encoding circuit that maps a product state $|\phi_0\rangle \otimes \dots \otimes |\phi_{k-1}\rangle \otimes |0\rangle^{\otimes (n-k)}$ of $k$ physical qubits in unknown states (along with ancillas) to the state of $k$ logical qubits encoded in a stabiliser code with $n$ physical qubits. Labelling the ancillas in the initial state $k, k+1, \dots, n-1$, we note that the initial product state is a $+1$-eigenstate of the stabilisers $Z_k, Z_{k+1}, \dots, Z_{n-1}$. Thus, we wish to find a unitary encoding circuit that maps the stabilisers $Z_k, Z_{k+1}, \dots, Z_{n-1}$ of the product state to a generating set for the stabiliser group $\mathcal{S}$ of the code. The circuit must also map the logical operators $Z_0, Z_1, \dots, Z_{k-1}$ and $X_0, X_1, \dots, X_{k-1}$ of the physical qubits to the corresponding logical operators $\bar{Z}_0, \bar{Z}_1, \dots, \bar{Z}_{k-1}$ and $\bar{X}_0, \bar{X}_1, \dots, \bar{X}_{k-1}$ of the encoded qubits (up to stabilisers). Applying a unitary $U$ to an eigenstate $|\psi\rangle$ of an operator $S$ (with eigenvalue $s$) gives $US|\psi\rangle = sU|\psi\rangle = USU^\dagger U|\psi\rangle$: an eigenstate of $S$ becomes an eigenstate of $USU^\dagger$. Therefore, we wish to find a unitary encoding circuit that, acting under conjugation, transforms the stabilisers and logicals of the initial product state into the stabilisers and logicals of the encoded state.
The CNOT gate, acting by conjugation, transforms Pauli $X$ and $Z$ operators as follows:
$$XI \leftrightarrow XX, \qquad IZ \leftrightarrow ZZ, \qquad (1)$$
and leaves $ZI$ and $IX$ invariant. Here $\sigma\sigma'$ for $\sigma, \sigma' \in \{I, Z, X\}$ denotes $\sigma_C \otimes \sigma'_T$ with $C$ and $T$ the control and target qubit of the CNOT respectively. Since $Z = HXH$ and $X = HZH$, a Hadamard gate $H$ transforms an eigenstate of $Z$ into an eigenstate of $X$ and vice versa. We will show how these relations can be used to generate unitary encoding circuits for the surface code using only CNOT and Hadamard gates.
As an example, consider the problem of generating the encoding circuit for the repetition code, which has stabilisers $Z_0 Z_1$ and $Z_1 Z_2$. We start in the product state $|\phi\rangle|0\rangle|0\rangle$, which has stabilisers $Z_1$ and $Z_2$. We first apply $\mathrm{CNOT}_{01}$, which transforms the stabiliser $Z_1 \to Z_0 Z_1$ and leaves $Z_2$ invariant. Then applying $\mathrm{CNOT}_{12}$ transforms $Z_2 \to Z_1 Z_2$ and leaves $Z_0 Z_1$ invariant. We can also verify that the logical $X$ undergoes the required transformation $X_0 \to \bar{X}_0 := X_0 X_1 X_2$.
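The repetition code example can be reproduced by tracking Pauli operators through the circuit in the binary (x, z) representation. The following is a minimal sketch that ignores signs (none arise for this circuit); the helper names `cnot`, `propagate` and `REPETITION_ENCODER` are illustrative and not taken from the text.

```python
def from_string(label):
    """Convert a Pauli string such as 'IZI' into (x_bits, z_bits)."""
    return ([int(c in "XY") for c in label], [int(c in "ZY") for c in label])


def to_string(pauli):
    x, z = pauli
    return "".join("IXZY"[xi + 2 * zi] for xi, zi in zip(x, z))


def cnot(pauli, control, target):
    """Conjugate a Pauli by CNOT(control -> target), ignoring signs.

    X on the control spreads to the target (XI -> XX) and Z on the target
    spreads to the control (IZ -> ZZ); ZI and IX are left unchanged.
    """
    x, z = list(pauli[0]), list(pauli[1])
    x[target] ^= x[control]
    z[control] ^= z[target]
    return (x, z)


def propagate(label, circuit):
    """Apply a list of (control, target) CNOTs, in order, to a Pauli string."""
    pauli = from_string(label)
    for control, target in circuit:
        pauli = cnot(pauli, control, target)
    return to_string(pauli)


# CNOT_01 followed by CNOT_12, as in the repetition code example.
REPETITION_ENCODER = [(0, 1), (1, 2)]

for name, label in [("Z1", "IZI"), ("Z2", "IIZ"), ("X0", "XII"), ("Z0", "ZII")]:
    print(name, "->", propagate(label, REPETITION_ENCODER))
```

The two ancilla stabilisers are mapped to $Z_0 Z_1$ and $Z_1 Z_2$, the logical $X$ becomes $X_0 X_1 X_2$, and $Z_0$ is left unchanged.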

General Encoding Methods for Stabiliser Codes
There exists a general method for generating an encoding circuit for any stabiliser code [15,23], which we review in Appendix A. The specific structure of the output of this method means it can immediately be rearranged to depth $O(n)$. Using general routing procedures presented in [4,11,13] the output circuit could be adapted to a surface architecture with overhead $O(\sqrt{n})$, giving a circuit with depth $O(n\sqrt{n})$. This matches the scaling $O(\min(2n^2, 4nD\Delta))$ in depth for stabiliser circuits achieved in [55], where $D$ and $\Delta$ are the diameter and degree respectively of the underlying architecture graph. Any stabiliser circuit has an equivalent skeleton circuit [38], and so can be implemented on a surface architecture with depth $O(n) = O(L^2)$, matching the previously best known scaling [18] for encoding the planar code. $O(n)$ is an optimal bound on the depth of the set of all stabiliser circuits [38], so we look beyond general methods and work with the specifics of the planar encoding circuit to improve on [18].
Figure 2: Circuit to encode a distance 6 planar code from a distance 4 planar code. Each edge corresponds to a qubit. Each arrow denotes a CNOT gate, pointing from control to target. Filled black circles (centred on edges) denote Hadamard gates, which are applied at the beginning of the circuit. The colour of each CNOT gate (arrow) denotes the time step in which it is applied. The first, second, third and fourth time steps correspond to the blue, green, red and black CNOT gates respectively. Solid edges correspond to qubits originally encoded in the L=4 planar code, whereas dotted edges correspond to additional qubits that are encoded in the L=6 planar code.

Optimal encoder for the planar code
Dennis et al. [18] showed how the methods outlined in section 4 can be used to generate an encoding circuit for the planar surface code. The inductive step in their method requires $\Omega(L)$ time steps and encodes a distance $L + 1$ planar code from a distance $L$ code by turning smooth edges into rough edges and vice versa. As a result, encoding a distance $L$ planar code from an unencoded qubit requires $\Omega(L^2)$ time steps, which is quadratically slower than the lower bound given by Bravyi et al. [7].
However, here we present a local unitary encoding circuit for the planar code that requires only $2L$ time steps to encode a distance $L$ planar code. The inductive step in our method, shown in Figure 2 for $L = 4$, encodes a distance $L + 2$ planar code from a distance $L$ planar code using 4 time steps, and does not rotate the code. This inductive step can then be used recursively to encode an unencoded qubit into a distance $L$ planar code using $2L$ time steps. If $L$ is odd, the base case used is the distance 3 planar code, which can be encoded in 6 time steps. If $L$ is even, a distance 4 planar code is used as a base case, which can be encoded in 8 time steps. Encoding circuits for the distance 3 and 4 planar codes are given in Appendix B. Our encoding circuit therefore matches the $\Omega(L)$ lower bound provided by Bravyi et al. [7].
Figure 3: Site and plaquette stabiliser generators before and after the circuit in Figure 2 is applied. Top: the four main types of site stabilisers acted on non-trivially by the encoding circuit (labelled a-d) are shown in red before (left) and after (right) the encoding circuit is applied. On the left we assume that the ancillas have already been initialised in the $|+\rangle$ state (H applied). Bottom: the four main types of plaquette stabilisers (also labelled a-d) are shown in blue before (left) and after (right) the encoding circuit is applied. Plaquette c has two connected components after the circuit is applied (right), and is enclosed by a green dashed line for clarity.
Since the circuit for the inductive step in Figure 2 uses only CNOT and H gates, we can verify its correctness by checking that stabiliser generators and logicals of the distance $L$ surface code are mapped to stabiliser generators and logicals of the distance $L + 2$ surface code using the conjugation rules explained in Section 4. We show how each type of site and plaquette stabiliser generator is mapped by the inductive step of the encoding circuit in Figure 3. Note that the site stabiliser generator labelled c (red) is mapped to a weight 7 stabiliser in the $L = 6$ planar code: this is still a valid generator of the stabiliser group, and the standard weight four generator can be obtained by multiplication with a site of type b. Similarly, the plaquette stabiliser generator labelled c becomes weight 7, but a weight four generator is recovered from multiplication by a plaquette of type a. Therefore, the stabiliser group of the $L = 4$ planar code is mapped correctly to that of the $L = 6$ planar code, even though minimum-weight generators are not mapped explicitly to minimum-weight generators. Using Equation (1) it is straightforward to verify that the $X$ and $Z$ logical operators of the $L = 4$ planar code are also mapped to the $X$ and $Z$ logicals of the $L = 6$ planar code by the inductive step.
We can also encode rectangular planar codes with height $H$ and width $W$ by first encoding a distance $\min(H, W)$ square planar code and then using a subset of the gates in Figure 2 (given explicitly in Appendix B) to either increase the width or the height as required. Increasing either the width or height by two requires three time steps, therefore encoding an $H \times W$ rectangular planar code from an unencoded qubit requires $2\min(H, W) + 3|H - W|/2$ time steps. In Appendix B.2 we also provide an optimal encoder for the rotated surface code, which uses fewer physical qubits for a given distance $L$ [5]. Our encoding circuit also uses an inductive step that increases the distance by two using four time steps, and therefore uses $2L + O(1)$ time steps to encode a distance $L$ rotated surface code.
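The time step counts quoted for the square and rectangular planar codes follow from simple bookkeeping, sketched below. The sketch only counts time steps and does not construct the circuits; the function names are illustrative, and the rectangular case assumes that $H$ and $W$ have the same parity so that $|H - W|/2$ is an integer.

```python
def square_planar_time_steps(L: int) -> int:
    """Time steps to encode a distance-L planar code via the inductive step."""
    if L < 3:
        raise ValueError("the base cases in the text are L = 3 and L = 4")
    # Base case: distance 3 (6 time steps) for odd L, distance 4 (8) for even L.
    distance, steps = (3, 6) if L % 2 else (4, 8)
    while distance < L:
        distance += 2  # one inductive step grows the distance by two ...
        steps += 4     # ... and uses four time steps
    return steps


def rectangular_planar_time_steps(H: int, W: int) -> int:
    """Time steps to encode an H x W planar code from an unencoded qubit."""
    if (H - W) % 2:
        raise ValueError("this sketch assumes H and W have the same parity")
    # Encode the min(H, W) square code, then grow one side by two at a time,
    # with each growth step costing three time steps.
    return square_planar_time_steps(min(H, W)) + 3 * (abs(H - W) // 2)


for L in (3, 4, 5, 6, 11):
    print(L, square_planar_time_steps(L))
print(rectangular_planar_time_steps(5, 9))
```

For every $L \geq 3$ the recursion gives exactly $2L$ time steps, and the rectangular count agrees with $2\min(H, W) + 3|H - W|/2$.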

Local Renormalisation Encoder for the Toric Code
In this section we will describe an $O(L)$ encoder for the toric code based on the multi-scale entanglement renormalisation ansatz (MERA). The core of this method is to enforce locality in the Renormalisation Group (RG) encoder given by Aguado and Vidal [1]. The RG encoder starts from an $L = 2$ toric code and then uses an $O(1)$ depth inductive step which enlarges a distance $2^k$ code to a distance $2^{k+1}$ code, as shown in Figure 4 for the first step ($k = 1$) (and reviewed in more detail in Appendix C). The $L = 2$ base case toric code can be encoded using the method given by Gottesman in Ref. [23], as shown in Appendix C.1. While the RG encoder takes $O(\log L)$ time, it is non-local in its original form.
In order to enforce locality in the RG encoder, we wish to find an equivalent circuit that implements an identical operation on the same input state, using quantum gates that act locally on the physical architecture corresponding to the final distance $L$ toric code (here a gate is local if it acts only on qubits that belong to either the same site or plaquette). One approach to enforce locality in a quantum circuit is to insert SWAP gates into the circuit to move qubits adjacent to each other where necessary. Any time step of a quantum circuit can be made local on an $L \times L$ 2D nearest-neighbour (2DNN) grid architecture using at most $O(L)$ time steps, leading to at most a multiplicative $O(L)$ overhead from enforcing locality [4,11,13]. Placing an ancilla in the centre of each site and plaquette, we see that the connectivity graph of our physical architecture has a 2DNN grid as a subgraph. Therefore, using SWAP gates to enforce locality in each of the $O(\log L)$ time steps of the RG encoder gives a circuit with $O(L \log L)$ time complexity. However, we can achieve $O(L)$ complexity by first noticing that all 'quantum circuit' qubits which are acted on non-trivially in the first $k$ steps of the RG encoder can be mapped to physical qubits in a $2^{k+1} \times 2^{k+1}$ square region of the physical architecture. Therefore, the required operations in iteration $k$ can all be applied within a $2^{k+1} \times 2^{k+1}$ region that also encloses the regions used in the previous steps. In Appendix C.2 we use this property to provide circuits for routing quantum information using SWAP gates (and no ancillas) that enforce locality in each of the $O(1)$ time steps in iteration $k$ using $O(2^{k+1})$ time steps. This leads to a total complexity of
$$\sum_{k=1}^{\log_2 L - 1} O(2^{k+1}) = O(L)$$
for encoding a distance $L$ code, also achieving the lower bound given by Bravyi et al. [7]. In Appendix C.2 we provide a more detailed analysis to show that the total time complexity is $15L/2 - 6\log_2 L + 7 \sim O(L)$. Unlike the other encoders in this paper (which work for all $L$), the RG encoder clearly can only be applied when $L$ is a power of 2.
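The linear scaling comes from a geometric series: if iteration $k$ costs a number of time steps proportional to $2^{k+1}$, the total is dominated by the final iteration. The sketch below evaluates this sum with the constant of proportionality set to one (an illustrative assumption) for iterations $k = 1, \dots, \log_2 L - 1$, alongside the exact count $15L/2 - 6\log_2 L + 7$ quoted from Appendix C.2. The function names are illustrative.

```python
def _log2_power_of_two(L: int) -> int:
    """Return log2(L), requiring L to be a power of two with L >= 2."""
    if L < 2 or L & (L - 1):
        raise ValueError("the RG encoder requires L to be a power of 2")
    return L.bit_length() - 1


def rg_routing_sum(L: int) -> int:
    """Sum of 2^(k+1) over RG iterations k = 1, ..., log2(L) - 1."""
    m = _log2_power_of_two(L)
    return sum(2 ** (k + 1) for k in range(1, m))


def rg_local_time_steps(L: int) -> int:
    """Total time steps quoted in the text: 15L/2 - 6 log2(L) + 7."""
    m = _log2_power_of_two(L)
    return 15 * L // 2 - 6 * m + 7


for L in (2, 4, 8, 16, 32, 64):
    print(L, rg_routing_sum(L), rg_local_time_steps(L))
```

The unit-cost sum evaluates to $2L - 4$, which is linear in $L$ even though the number of iterations grows as $\log_2 L$.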
Figure 5: Circuit to encode a distance 5 toric code from a distance 5 planar code. Solid edges correspond to qubits in the original planar code and dotted edges correspond to qubits added for the toric code. Opposite edges are identified. Arrows denote CNOT gates, and filled black circles denote Hadamard gates applied at the beginning of the circuit. Blue and green CNOT gates correspond to those applied in the first and second time step respectively. Red CNOTs are applied in the time step that they are numbered with. The hollow circles denote the unencoded qubit that is to be encoded into the toric code.

Encoding a toric code from a planar code
While the method in section 6 is only suitable for encoding planar codes, we will now show how we can encode a distance $L$ toric code from a distance $L$ planar code using only local unitary operations. Starting with a distance $L$ planar code, $2(L - 1)$ ancillas each in a $|0\rangle$ state, and an additional unencoded logical qubit, the circuit in Figure 5 encodes a distance $L$ toric code using $L + 2$ time steps. The correctness of this step can be verified using Equation (1): each ancilla initialised as $|0\rangle$ (stabilised by $Z$) is mapped to a plaquette present in the toric code but not the planar code. Likewise, each ancilla initialised in $|+\rangle$ using an H gate (stabilised by $X$) is mapped to a site generator in the toric code but not the planar code. The weight-three site and plaquette stabilisers on the boundary of the planar code are also mapped to weight four stabilisers in the toric code. Finally, we see that $X$ and $Z$ operators for the unencoded qubit (the hollow circle in Figure 5) are mapped to the second pair of $X$ and $Z$ logicals in the toric code by the circuit, leaving the other pair of $X$ and $Z$ logicals already present from the planar code unaffected.
Therefore, encoding two unencoded qubits in a toric code can be achieved using $3L + 2$ time steps using the circuits given in this section and in section 6. Similarly, we can encode a planar code using the local RG encoder for the toric code, before applying the inverse of the circuit in Figure 5.
Figure 6: (a) Circuit to encode a 4 × 2 planar code from a four qubit repetition code (where adjacent qubits in the repetition code are stabilised by XX). Applied to a column of qubits corresponding to a surface code $\bar{Z}$, this encodes a layer in the yz-plane of a 3D surface code. (b) Circuit to encode the xz-plane of a 3D surface code once the yz-plane layers and a layer in the xy-plane have been encoded. Arrows denote CNOT gates pointing from control to target, and blue, green, red and black CNOT gates correspond to the first, second, third and fourth time steps respectively. Solid and dotted edges correspond to qubits that are initially entangled and in a product state respectively.

Encoding a 3D Surface Code
We will now show how the techniques developed to encode a 2D planar code can be used to encode a distance $L$ 3D surface code using $O(L)$ time steps. We first encode a distance $L$ planar code using the method given in section 6. This planar code now forms a single layer in the xy-plane of a 3D surface code (where the y-axis is defined to be aligned with a $Z$-logical in the original planar code). Using the circuit given in Figure 6(a), we encode each column of qubits corresponding to a $Z$ logical in the planar code into a layer of the 3D surface code in the yz-plane (which has the same stabiliser structure as a planar code if the rest of the x-axis is excluded). Since each layer in the yz-plane can be encoded in parallel, this stage can also be done in $O(L)$ time steps. If we encode each layer in the yz-plane such that the original planar code intersects the middle of each layer in the yz-plane, then each layer in the xz-plane now has the stabiliser structure shown in Figure 6(b). Using the circuit in Figure 6(b), the layers in the xz-plane can then be encoded.

Encoding circuit for the compact mapping
Fermion-to-qubit mappings are essential for simulating fermionic systems using quantum computers, and an encoding circuit for such a mapping is an important subroutine in many quantum simulation algorithms. We now show how we can use our encoding circuits for the surface code to construct encoding circuits that prepare fermionic states in the compact mapping [19], a fermion-to-qubit mapping that is especially efficient for simulating the Fermi-Hubbard model. A fermion-to-qubit mapping defines a representation of fermionic states in qubits, as well as a representation of each fermionic operator in terms of Pauli operators. Using such a mapping, we can represent a fermionic Hamiltonian as a linear combination $H = \sum_i \alpha_i P_i$ of tensor products of Pauli operators $P_i$, where $\alpha_i$ are real coefficients. We can then simulate time evolution $e^{-iHt}$ of $H$ (e.g. using a Trotter decomposition), which can be used in the quantum phase estimation algorithm to determine the eigenvalues of $H$. The mapped Hamiltonian $H$ can also be used in the variational quantum eigensolver algorithm (VQE), where we can estimate the energy $\langle\psi|H|\psi\rangle$ of a trial state $|\psi\rangle$ by measuring each Pauli term $\langle\psi|P_i|\psi\rangle$ individually.
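The term-by-term energy estimate used in VQE can be illustrated numerically. The sketch below computes $\langle\psi|H|\psi\rangle = \sum_i \alpha_i \langle\psi|P_i|\psi\rangle$ exactly from a state vector rather than from measurement samples; the two-qubit Hamiltonian coefficients and the Bell-state trial state are hypothetical values chosen only for illustration.

```python
import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def pauli_matrix(label: str) -> np.ndarray:
    """Dense matrix of a tensor product of single-qubit Paulis, e.g. 'XX'."""
    out = np.array([[1.0 + 0j]])
    for c in label:
        out = np.kron(out, PAULI[c])
    return out


def energy(terms: dict, psi: np.ndarray) -> float:
    """Sum of alpha_i * <psi|P_i|psi> over the Pauli terms of H."""
    return float(
        sum(
            alpha * np.vdot(psi, pauli_matrix(label) @ psi).real
            for label, alpha in terms.items()
        )
    )


# Hypothetical Hamiltonian H = 0.5 ZI + 0.5 IZ + 0.25 XX (illustration only).
TERMS = {"ZI": 0.5, "IZ": 0.5, "XX": 0.25}
# Trial state: the Bell state (|00> + |11>) / sqrt(2).
PSI = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

print(energy(TERMS, PSI))
```

On hardware each expectation value would be estimated from repeated measurements, so lower-weight Pauli terms directly reduce the measurement and circuit overhead.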
The Jordan-Wigner (JW) transformation maps fermionic creation ($a_i^\dagger$) and annihilation ($a_i$) operators to qubit operators in such a way that the canonical fermionic anti-commutation relations are satisfied by the encoded qubit operators. The qubit operators used to represent $a_i^\dagger$ and $a_i$ are
$$a_i^\dagger \to Z_1 \otimes \dots \otimes Z_{i-1} \otimes \sigma_i^+, \qquad a_i \to Z_1 \otimes \dots \otimes Z_{i-1} \otimes \sigma_i^-,$$
where $\sigma_i^+ := (X_i - iY_i)/2$ and $\sigma_i^- := (X_i + iY_i)/2$. Each electronic basis state (with $m$ modes) in the JW transformation is represented by $m$ qubits simply as a computational basis state $|\omega_1, \omega_2, \dots, \omega_m\rangle$, where $\omega_i = 1$ or $\omega_i = 0$ indicates that mode $i$ is occupied or unoccupied by a fermion, respectively. A drawback of the Jordan-Wigner transformation is that, even if a fermionic operator acts on $O(1)$ modes, the corresponding JW-mapped qubit operator can still act on up to $O(m)$ qubits. When mapped qubit operators have larger weight, the depth and number of gates required to simulate time evolution of a mapped Hamiltonian also tend to increase, motivating the design of fermion-to-qubit mappings that map fermionic operators to qubit operators that are both low weight and geometrically local.
Several methods have been proposed for mapping geometrically local fermionic operators to geometrically local qubit operators [9,19,27,47,48,50,54], all of which introduce auxiliary qubits and encode fermionic Fock space into a subspace of the full $n$-qubit system, defined as the $+1$-eigenspace of elements of a stabiliser group $\mathcal{S}$. Mappings that have this property are referred to as local.
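The anti-commutation relations of the JW-mapped operators can be checked numerically for a small number of modes. This is a minimal sketch using dense matrices, with modes indexed from 0 rather than 1 and with the standard string of $Z$ operators on all lower-indexed qubits; the function names are illustrative.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
SIGMA_PLUS = (X - 1j * Y) / 2   # maps |0> to |1> (creates a fermion)
SIGMA_MINUS = (X + 1j * Y) / 2  # maps |1> to |0> (removes a fermion)


def creation(i: int, m: int) -> np.ndarray:
    """JW image of the creation operator on mode i (0-indexed) of m modes."""
    return reduce(np.kron, [Z] * i + [SIGMA_PLUS] + [I2] * (m - i - 1))


def annihilation(i: int, m: int) -> np.ndarray:
    """JW image of the annihilation operator on mode i (0-indexed) of m modes."""
    return reduce(np.kron, [Z] * i + [SIGMA_MINUS] + [I2] * (m - i - 1))


def anticommutator(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    return a @ b + b @ a


m = 3
for i in range(m):
    for j in range(m):
        expected = np.eye(2 ** m) * (i == j)
        # {a_i, a_j^dagger} = delta_ij and {a_i, a_j} = 0.
        assert np.allclose(anticommutator(annihilation(i, m), creation(j, m)), expected)
        assert np.allclose(anticommutator(annihilation(i, m), annihilation(j, m)), 0)
print("canonical anti-commutation relations hold for m =", m)
```

The string of $Z$ operators is what enforces anti-commutation between different modes, and it is also the reason a hopping term between distant modes can have weight $O(m)$.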
We will now focus our attention on a specific local mapping, the compact mapping [19], since its stabiliser group is very similar to that of the surface code.
As we will see, this close connection to the surface code allows us to use the encoding circuits we have constructed for the surface code to encode fermionic states in the compact mapping. The compact mapping maps nearest-neighbour hopping ($a_i^\dagger a_j + a_j^\dagger a_i$) and Coulomb ($a_i^\dagger a_i a_j^\dagger a_j$) terms to Pauli operators with weight at most 3 and 2, respectively, and requires 1.5 qubits for each fermionic mode [19]. Rather than mapping individual fermionic creation and annihilation operators, the compact mapping instead defines a representation of the fermionic edge ($E_{jk}$) and vertex ($V_j$) operators, defined as
$$E_{jk} := -i\gamma_j\gamma_k, \qquad V_j := -i\gamma_j\bar{\gamma}_j,$$
where $\gamma_j := a_j + a_j^\dagger$ and $\bar{\gamma}_j := (a_j - a_j^\dagger)/i$ are Majorana operators. The vertex and edge operators must satisfy the relations
$$[E_{ij}, V_l] = 0, \qquad [V_i, V_j] = 0, \qquad [E_{ij}, E_{ln}] = 0$$
for all $i \neq j \neq l \neq n$, and
$$\{E_{ij}, V_j\} = 0, \qquad \{E_{ij}, E_{jk}\} = 0.$$
In the compact mapping, there is a "primary" qubit associated with each of the $m$ fermionic modes, and there are also $m/2$ "auxiliary" qubits. Each vertex operator $V_j$ is mapped to the Pauli operator $Z_j$ on the corresponding primary qubit. We denote the mapped vertex and edge operators by $\tilde{V}_j$ and $\tilde{E}_{ij}$, respectively, and so we have $\tilde{V}_j := Z_j$. Each edge operator $E_{ij}$ is mapped (up to a phase factor) to a three-qubit Pauli operator of the form $XYX$ or $XYY$, with support on two vertex qubits and a neighbouring "face" qubit. The precise definition of the edge operators is not important for our purposes, and we refer the reader to Ref. [19] for details. The vertex and edge operators define a graph (in which they correspond to vertices and edges, respectively), and an additional relation that must be satisfied in the mapping is that the product of any loop of edge operators must equal the identity:
$$i^{|p|-1} \prod_{i=1}^{|p|-1} E_{p_i p_{i+1}} = 1, \qquad (8)$$
where here $p = \{p_1, p_2, \dots\}$ is a sequence of vertices along any cycle in the graph (starting and ending on the same vertex).
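The algebra of the edge and vertex operators can be checked numerically at the fermionic level, before any local mapping is applied. The sketch below assumes the definitions $E_{jk} = -i\gamma_j\gamma_k$ and $V_j = -i\gamma_j\bar{\gamma}_j$ of Ref. [19] and, purely as a convenient matrix representation, builds the Majorana operators from the Jordan-Wigner transformation on three modes (indexed from 0). It does not construct the compact mapping itself, and all function names are illustrative.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
M = 3  # number of fermionic modes in this toy check


def a(j: int) -> np.ndarray:
    """Jordan-Wigner annihilation operator on mode j (0-indexed)."""
    return reduce(np.kron, [Z] * j + [(X + 1j * Y) / 2] + [I2] * (M - j - 1))


def gamma(j: int) -> np.ndarray:
    return a(j) + a(j).conj().T


def gamma_bar(j: int) -> np.ndarray:
    return (a(j) - a(j).conj().T) / 1j


def edge(j: int, k: int) -> np.ndarray:
    return -1j * gamma(j) @ gamma(k)


def vertex(j: int) -> np.ndarray:
    return -1j * gamma(j) @ gamma_bar(j)


def comm(p, q):
    return p @ q - q @ p


def anti(p, q):
    return p @ q + q @ p


identity = np.eye(2 ** M)
# Occupation number: a_j^dagger a_j = (I - V_j) / 2.
assert np.allclose(a(0).conj().T @ a(0), (identity - vertex(0)) / 2)
# Operators sharing exactly one index anti-commute, otherwise they commute.
assert np.allclose(anti(edge(0, 1), vertex(1)), 0)
assert np.allclose(anti(edge(0, 1), edge(1, 2)), 0)
assert np.allclose(comm(edge(0, 1), vertex(2)), 0)
assert np.allclose(comm(vertex(0), vertex(1)), 0)
# Loop relation for the cycle p = (0, 1, 2, 0), which has |p| - 1 = 3 edges.
assert np.allclose(1j ** 3 * edge(0, 1) @ edge(1, 2) @ edge(2, 0), identity)
print("edge and vertex operator relations verified")
```

In this fermionic representation the loop relation holds automatically; in a local qubit mapping it has to be imposed through the stabiliser group.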
The relation of Equation (8) can be satisfied by ensuring that the qubit operator corresponding to any mapped loop of edge operators is a stabiliser, if it is not already trivial, thereby ensuring that the relations are satisfied within the $+1$-eigenspace of the stabilisers. The stabiliser group $\mathcal{S}$ of the compact mapping is therefore defined by Equation (8) and the definition of each $\tilde{E}_{ij}$. The $+1$-eigenspace of $\mathcal{S}$ has dimension $2^{m+\Delta}$, where $m$ is the number of modes and $\Delta \in \{-1, 0, 1\}$ is the disparity, which depends on the boundary conditions chosen for the square lattice geometry. We will only consider the case where $\Delta = 1$, since this choice results in a stabiliser structure most similar to the surface code. In this $\Delta = 1$ case the full Fock space is encoded, along with a topologically protected logical qubit. The stabilisers of the compact mapping (for the case $\Delta = 1$) are shown in Figure 7, from which it is clear that the stabiliser group is very similar to that of the planar surface code, a connection which was first discussed in Ref. [19]. Indeed, if we consider the support of the stabilisers on only the auxiliary qubits (associated with the edges of the surface code lattice shown in Figure 7), we recover the stabiliser group of the planar surface code up to single-qubit Clifford gates acting on each qubit. Using this insight, we can use our surface code encoding circuit to construct a local unitary encoding circuit that prepares a Slater determinant state in the compact mapping, which is often required for its use in quantum simulation algorithms. Note that we can write each fermionic occupation operator $a_j^\dagger a_j$ for mode $j$ in terms of the corresponding vertex operator $V_j$ as $a_j^\dagger a_j = (I - V_j)/2$, where $I$ is the identity operator. A Slater determinant state $|\phi_{\mathrm{det}}\rangle$ is then a joint eigenstate of the stabilisers and vertex operators:
$$s|\phi_{\mathrm{det}}\rangle = |\phi_{\mathrm{det}}\rangle \quad \forall s \in \mathcal{S}, \qquad (9)$$
$$\tilde{V}_j|\phi_{\mathrm{det}}\rangle = v_j|\phi_{\mathrm{det}}\rangle \quad \forall \tilde{V}_j \in \tilde{V}, \qquad (10)$$
where $\mathcal{S}$ is the stabiliser group of the mapping, $\tilde{V}$ is the set of mapped vertex operators, and $v_j \in \{+1, -1\}$ indicates whether mode $j$ is occupied ($-1$) or unoccupied ($+1$) [27]. Let us denote the set of generators of $\mathcal{S}$ defined by the sites and plaquettes in Figure 7 by $\{s_1, s_2, \dots, s_r\}$ (i.e. $\mathcal{S} = \langle s_1, s_2, \dots, s_r \rangle$). For any Pauli operator $c$, we denote its component acting only on the primary qubits as $c^p$, and its component acting only on auxiliary qubits is denoted $c^a$. With this notation we can decompose each stabiliser generator as $s_i = s_i^p \otimes s_i^a$, where $|s_i^p| = |s_i^a| = 4$ in the bulk of the lattice.
Figure 7: The stabilisers of the compact mapping. A primary qubit is associated with each black circle, and an auxiliary qubit is associated with each edge of the surface code lattice. There is a plaquette stabiliser (blue) associated with each face of the surface code lattice, acting as $YXXY$ on the edges adjacent to the face, and as $Z$ on each of the four closest primary qubits. There is also a site stabiliser (red) associated with each vertex of the surface code lattice, also acting as $YXXY$ on the edges adjacent to the vertex, and as $Z$ on each of the four closest primary qubits.
For the compact mapping, where $\tilde{V}_j := Z_j$, from Equation (10) we see that the primary qubits are in a product state for all Slater determinant states, and so we can write the state of the system on all qubits as $|\phi\rangle = |\phi\rangle_p \otimes |\phi\rangle_a$, where $|\phi\rangle_p$ is the state of the primary qubits and $|\phi\rangle_a$ is the state of the auxiliary qubits.
Our circuit to prepare a Slater determinant state in the compact mapping then proceeds in three steps. In step 1, we prepare each primary qubit in state $|0\rangle$ or $|1\rangle$ if the corresponding fermionic mode is unoccupied or occupied, respectively. This ensures that the state satisfies Equation (10) as required, and we denote the resultant state on the primary qubits by $|\phi_{\mathrm{det}}\rangle_p$. It now remains to show how we can prepare the state on the auxiliary qubits such that Equation (9) is also satisfied.
In step 2, we prepare a state $|\phi_{\mathrm{surf}}\rangle_a$ on the auxiliary qubits that is in the $+1$-eigenspace of each stabiliser generator restricted to its support only on the auxiliary qubits. In other words we prepare the state $|\phi_{\mathrm{surf}}\rangle_a$ satisfying
$$s|\phi_{\mathrm{surf}}\rangle_a = |\phi_{\mathrm{surf}}\rangle_a \quad \forall s \in \mathcal{S}',$$
where $\mathcal{S}' := \langle s_1^a, s_2^a, \dots, s_r^a \rangle$. The generators of $\mathcal{S}'$ are the same as those of the planar surface code up to local Clifford gates, and so we can prepare $|\phi_{\mathrm{surf}}\rangle_a$ by encoding the planar surface code on the auxiliary qubits using the circuit from Section 6 and applying $U_V$ ($U_H$) to each vertical (horizontal) edge of the lattice in Figure 7, where $U_V$ and $U_H$ are single-qubit Clifford gates. This step can be verified by noticing that, under conjugation, $U_V$ maps $X \to Z$ and $Y \to X$, and $U_H$ maps $Y \to Z$ and $X \to X$, and so the generators of the surface code (Figure 1) are mapped to generators of $\mathcal{S}'$. Note that after step 2, the combined state of the primary and auxiliary qubits satisfies
$$s_i \left( |\phi_{\mathrm{det}}\rangle_p \otimes |\phi_{\mathrm{surf}}\rangle_a \right) = b_i\, |\phi_{\mathrm{det}}\rangle_p \otimes |\phi_{\mathrm{surf}}\rangle_a$$
for each generator $s_i = s_i^p \otimes s_i^a$ of $\mathcal{S}$, where the eigenvalue $b_i \in \{-1, 1\}$ is the parity of the primary qubits acted on non-trivially by $s_i^p$, satisfying $s_i^p |\phi_{\mathrm{det}}\rangle_p = b_i |\phi_{\mathrm{det}}\rangle_p$. We say that $b_i$ is the syndrome of generator $s_i$.
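Single-qubit Clifford gates with the stated conjugation action are easy to exhibit and check numerically. The matrices below are one possible choice, assumed here for illustration and not necessarily the definitions used by the authors (any such gate is fixed only up to a global phase): `U_V` is taken to be the phase gate applied after a Hadamard, and `U_H` a rotation by $\pi/2$ about the $X$ axis. "Maps $P \to Q$ under conjugation" is read as $UPU^\dagger = Q$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
HADAMARD = (X + Z) / np.sqrt(2)
PHASE = np.diag([1, 1j]).astype(complex)

# Assumed candidate gates (illustrative, not taken from the source):
U_V = PHASE @ HADAMARD             # Hadamard first, then the phase gate
U_H = (I2 - 1j * X) / np.sqrt(2)   # exp(-i * pi * X / 4)


def conjugate(u: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Return u p u^dagger."""
    return u @ p @ u.conj().T


# U_V maps X -> Z and Y -> X; U_H maps Y -> Z and leaves X invariant.
assert np.allclose(conjugate(U_V, X), Z)
assert np.allclose(conjugate(U_V, Y), X)
assert np.allclose(conjugate(U_H, Y), Z)
assert np.allclose(conjugate(U_H, X), X)
print("candidate U_V and U_H have the required conjugation action")
```

Because each gate acts on a single qubit, applying them to every auxiliary qubit adds only $O(1)$ depth to the planar code encoder.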
In step 3, we apply a circuit that instead ensures that we are in the $+1$-eigenspace of elements of $S$. This can be done by applying a Pauli operator $R$, with support only on the auxiliary qubits, that commutes with each generator $s_i$ if its syndrome $b_i$ is $1$ and anti-commutes otherwise. Such a Pauli operator can always be found for any assignment of each $b_i \in \{1, -1\}$, as shown in Figure 8: for each stabiliser generator $s_i$, we can find a Pauli operator that we denote $V(s_i)$ which, acting only on the auxiliary qubits, anti-commutes with $s_i$ while commuting with all other generators (note that the choice of $V(s_i)$ is not unique). Taking the product of operators $V(s_i)$ for all $s_i$ with syndrome $b_i = -1$, we obtain a single Pauli operator that returns the state of our combined system to the $+1$-eigenspace of elements of $S$, such that it satisfies Equation (9). Furthermore, since steps 2 and 3 have acted trivially on the primary qubits, Equation (10) is still satisfied from step 1. Therefore, a Slater determinant in the compact mapping can be encoded using the $O(L)$ depth unitary encoding circuit for the planar code as well as $O(1)$ layers of single-qubit Clifford gates.

Note that the topologically protected logical qubit in the compact mapping is not used to store quantum information. As a result, we can prepare any state in the codespace of the surface code in step 2, and it does not matter if the Pauli correction $R$ in step 3 acts non-trivially on the logical qubit. The problem of finding a suitable correction $R$ in step 3 given the syndrome of each generator is essentially the same problem as decoding the XZZX surface code [3,53] under the quantum erasure channel (and where every qubit is erased). Therefore, any other suitable decoder could be used instead of using Equation (15), such as the variant of minimum-weight perfect matching used in Ref. [3], or an adaptation of the peeling decoder [17].
The encoding step for the surface code could instead be done using stabiliser measurements. However, since it is not otherwise necessary to measure the stabilisers of the mapping, the additional complexity of using ancillas, mid-circuit measurements and real-time classical logic might make such a measurement-based approach more challenging to implement on either NISQ or fault-tolerant hardware than the simple $O(L)$ depth local unitary encoding circuit we present. Furthermore, the $O(L)$ complexity of our encoding circuit is likely negligible compared to the overall complexity of most quantum simulation algorithms within which it could be used. Our encoding circuits for the surface code may also be useful for preparing states encoded in other fermion-to-qubit mappings. As an example, it has previously been observed that the Verstraete-Cirac transform also has a similar stabiliser structure to the surface code [48,50].
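The construction of the correction $R$ in step 3 can be illustrated on a toy example. The sketch below is not the compact mapping itself: it assumes a hypothetical three-qubit stabiliser group with generators ZZI and IZZ, and hand-picked operators $V(s_i)$ that each anti-commute with exactly one generator. Pauli operators are represented as strings and phases are ignored, which is enough to check commutation relations.

```python
# Toy illustration of step 3: build a Pauli correction R from syndromes.
# ASSUMPTION: the generators and V(s_i) below are a hypothetical 3-qubit
# example chosen for readability; they are not those of the compact mapping.

TO_BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
FROM_BITS = {bits: label for label, bits in TO_BITS.items()}

def commutes(p, q):
    """True if Pauli strings p and q commute."""
    clashes = sum(1 for a, b in zip(p, q)
                  if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

def multiply(p, q):
    """Product of two Pauli strings, ignoring the overall phase."""
    out = []
    for a, b in zip(p, q):
        (ax, az), (bx, bz) = TO_BITS[a], TO_BITS[b]
        out.append(FROM_BITS[(ax ^ bx, az ^ bz)])
    return "".join(out)

def correction(v_ops, syndromes):
    """Product of V(s_i) over all generators with syndrome -1."""
    r = "I" * len(v_ops[0])
    for v, b in zip(v_ops, syndromes):
        if b == -1:
            r = multiply(r, v)
    return r

generators = ["ZZI", "IZZ"]
v_ops = ["XII", "IIX"]   # V(s_1), V(s_2): each anti-commutes with one generator
syndromes = [-1, -1]

R = correction(v_ops, syndromes)
print(R)
for s, b in zip(generators, syndromes):
    print(s, "commutes with R" if commutes(R, s) else "anti-commutes with R")
```

In a realistic setting the operators $V(s_i)$ would be read off from the lattice (or replaced by the output of a decoder), but the final product over generators with syndrome $-1$ has the same form.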

Discussion
We have presented local unitary circuits for encoding an unknown state in the surface code that take time linear in the lattice size $L$. Our results demonstrate that the $\Omega(L)$ lower bound given by Bravyi et al. [7] for this problem is tight, and reduce the resource requirements for experimentally realising topological quantum order and implementing some QEC protocols, especially using NISQ systems restricted to local unitary operations. We have provided a new technique to encode the planar code in $O(L)$ time, as well as showing how an $O(L)$ local unitary encoding circuit for the toric code can be found by enforcing locality in the non-local RG encoder. We unify these two approaches by demonstrating how local $O(L)$-depth circuits can be used to convert between the planar and toric code, and generalise our method to rectangular, rotated and 3D surface codes.
We also show that our unitary encoding circuit for the planar code can be used to encode a Slater determinant state in the compact mapping [19], which has a similar stabiliser structure to the surface code. This encoding circuit is therefore a useful subroutine for the simulation of fermionic systems on quantum computers, and it may be that similar techniques can be used to encode fermionic states in the Verstraete-Cirac transform, which has a similar stabiliser structure [50].
Using known local unitary mappings from one or more copies of the surface code, our results also imply the existence of optimal encoders for any 2D translationally invariant topological code, some 2D subsystem codes [6,57], as well as the 2D color code with and without boundaries [32]. As an explicit example, the subsystem surface code with three-qubit check operators can be encoded from the toric code using the four time step quantum circuit given in Ref. [8].
The circuits we have provided in this work are not fault-tolerant for use in error correction: a single qubit fault at the beginning of the circuit can lead to a logical error on the encoded qubit. Nevertheless, since our circuits have a lower depth than local unitary circuits given in prior work, we expect our circuits also to be more resilient to circuit noise (for example, our circuits have fewer locations for an idle qubit error to occur). Fault-tolerance of the encoding circuit itself is also not required when using it to prepare fermionic states or to study topological quantum order: for these applications, our circuits could be implemented using either physical qubits (on a NISQ device) or logical qubits on a fault-tolerant quantum computer. It would be interesting to investigate if our circuits could be adapted to be made fault-tolerant, perhaps for the preparation of a known state (e.g. logical $|0\rangle$ or $|+\rangle$). Further work could also investigate optimal local unitary encoding circuits for surface codes based on different lattice geometries (such as the hexagonal lattice [21]), or for punctured [20,42] or hyperbolic surface codes [10].

A.1 Review of the General Method
In this section we review the general method for constructing an encoding circuit for arbitrary stabiliser codes given in [15,23], and show how it can be used to find an encoding circuit for an L = 2 toric code as an example. We present the method here for completeness, giving the procedure in full and in the simplified case for which the code is CSS.
From a set of check operators one can produce a corresponding bimatrix

$$M := (L \mid R).$$

Rows and columns represent check operators and qubits respectively. $L_{ij} = 1$ indicates that check operator $i$ applies $X$ to qubit $j$ as opposed to the identity; similarly, for the right-hand side, $R_{ij} = 1$ implies check operator $i$ applies $Z$ to qubit $j$. If both $L_{ij} = 1$ and $R_{ij} = 1$, then check operator $i$ applies $Y$ on qubit $j$.
A CSS code has check operators $P_n \in \{I, X\}^{\otimes n} \cup \{I, Z\}^{\otimes n}$, and its corresponding bimatrix takes the form

$$\left(\begin{array}{c|c} A & 0 \\ 0 & B \end{array}\right).$$

$A$ and $B$ have full row rank since they each represent an independent subset of the check operators. Labelling the rank of $A$ as $r$, the rank of $B$ is $n - k - r$.
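As a concrete illustration of the bimatrix convention, the following minimal sketch converts check operators written as Pauli strings into the pair $(L, R)$ and tests whether every check is purely $X$-type or purely $Z$-type. The function names and the example codes are assumptions made for illustration only.

```python
# Minimal sketch of the bimatrix M = (L | R) built from Pauli-string checks.
# ASSUMPTION: function names and example checks are illustrative only.

def bimatrix(checks):
    """Return (left, right) binary matrices for a list of Pauli strings.

    left[i][j] = 1 if check i acts with X or Y on qubit j,
    right[i][j] = 1 if check i acts with Z or Y on qubit j.
    """
    left = [[1 if p in ("X", "Y") else 0 for p in check] for check in checks]
    right = [[1 if p in ("Z", "Y") else 0 for p in check] for check in checks]
    return left, right

def is_css(checks):
    """True if every check is purely X-type or purely Z-type."""
    left, right = bimatrix(checks)
    return all(not (any(l_row) and any(r_row))
               for l_row, r_row in zip(left, right))

# A single check with X, Y, Z and identity on four qubits.
left, right = bimatrix(["XYZI"])
print(left, right)

# Two small examples: an X-type/Z-type pair, and a mixed check.
print(is_css(["XXXX", "ZZZZ"]))
print(is_css(["XZZXI"]))
```

For a CSS code the rows can be ordered so that the $X$-type checks come first, which gives the block form with $A$ in the left matrix and $B$ in the right matrix.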
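Via row addition, row swaps and column swaps, the left and right matrices of this simplified form can be taken to standard form [23] without changing the stabiliser group of the code. In standard form the bimatrix has two block rows: the left matrix is $(I \;\; A_1 \;\; A_2)$ in the first block row and zero in the second, while the right matrix has second block row $(D \;\; I \;\; E)$. Here $I, A_1, A_2$ and $D, I, E$ have $r$, $n - k - r$ and $k$ columns respectively. We may also represent the set of logical $X$ operators as a bimatrix $\bar{X}$, with each row representing the logical $X$ for a particular encoded qubit, and with the left matrix written in blocks $(U_1 \;\; U_2 \;\; U_3)$ of the same widths. It is shown in [23] that $\bar{X}$ can be taken to a form in which $U_1 = 0$ and the block acting on the last $k$ qubits is the identity. In the CSS case the check operator bimatrix reduces to

$$\left(\begin{array}{ccc|ccc} I & A_1 & A_2 & 0 & 0 & 0 \\ 0 & 0 & 0 & D & I & E \end{array}\right)$$

and the logical $X$ bimatrix to

$$\bar{X} = \left(\begin{array}{ccc|ccc} 0 & E^T & I & 0 & 0 & 0 \end{array}\right),$$

where the block $E^T$ follows from requiring each logical $X$ to commute with the $Z$-type checks $(D \;\; I \;\; E)$.

To produce a circuit which can encode the state $|c_1 \dots c_k\rangle$ for any values of the $c_i$, one should find a circuit which applies logical operators $\bar{X}_1^{c_1} \cdots \bar{X}_k^{c_k}$ to the encoded zero state $|\bar{0}\rangle \equiv \sum_{S \in \mathcal{S}} S\,|0 \dots 0\rangle$. Let $F_c$ be the operator corresponding to row $c$ of bimatrix $F$. We denote by $F_{c(m)}$ the operator corresponding to $F_c$, with the operator on the $m$th qubit replaced with identity, and then controlled by the $m$th qubit. Since the logical operators commute with every stabiliser,

$$\bar{X}_1^{c_1} \cdots \bar{X}_k^{c_k} \sum_{S \in \mathcal{S}} S\,|0 \dots 0\rangle = \sum_{S \in \mathcal{S}} S\,\bar{X}_1^{c_1} \cdots \bar{X}_k^{c_k}\,|0 \dots 0\rangle,$$

the application of the $\bar{X}$ gates can be considered before applying the sum of stabiliser operations. Due to the $I$ in the form of $\bar{X}$, we see that independently of $|c_1 \dots c_k\rangle$ we can implement $\bar{X}_1^{c_1} \cdots \bar{X}_k^{c_k}$ using the controlled operators $\bar{X}_{i(j)}$, each controlled by the input qubit $j$ that holds $c_i$; since $U_1 = 0$, $\bar{X}_{i(j)}$ acts trivially on the first $r$ qubits. Next to consider is

$$\sum_{S \in \mathcal{S}} S = (I + M_{n-k}) \cdots (I + M_r) \cdots (I + M_1).$$

We denote the right matrix of bimatrix $M$ as $R$. In standard form $M_i$ (for $i \le r$) always performs $X$ on qubit $i$, and it also performs $Z$ on qubit $i$ when $R_{ii} = 1$, so each factor $(I + M_i)$ with $i \le r$ can be implemented by a Hadamard gate on qubit $i$ followed by the controlled operator $M_{i(i)}$, together with a $Z$ on qubit $i$ when $R_{ii} = 1$. The remaining products can be ignored since they consist only of $\sigma_z$ operations and may be commuted to the front to act on $|0\rangle$ states. Given initially some $k$ qubits we wish to encode, and some additional $n - k$ auxiliary qubits initialised in $|0\rangle$, a choice of generators for the stabiliser group is a single $Z$ operator on each auxiliary qubit. The general circuit which transforms this initial generator set to the standard form bimatrix is composed of the Hadamard gates, the controlled $\bar{X}$ gates and the controlled operators $M_{i(i)}$ described above. In the simplified CSS case all gates are either initial $H$ gates or CNOTs. We may write the circuit in two stages, performing first the $H$ gates and controlled $\bar{X}$ gates.

[Circuit diagram: stage 1 and stage 2 of the encoding circuit in the CSS case]

In the general case stage 1 is identical but stage 2 takes the form

[Circuit diagram: stage 2 of the encoding circuit in the general case]

where $\Omega_z$ consists of $Z$ operations on some of the first $r$ qubits and each $M_{i(i)}$ consists of controlled $Z$ gates on some of the first $r$ qubits and controlled Pauli gates on some of the following $n - r$ qubits. In the case of the $L = 2$ toric code, with qubits labelled left to right and top to bottom, the bimatrix is

[Bimatrix of the L = 2 toric code]

The circuit which encodes the above stabiliser set is

[Circuit diagram: encoding circuit for the L = 2 toric code]

It is important to have kept track of which column represents which qubit, since column swaps are performed in bringing the matrix to standard form. Taking this into account gives the $L = 2$ circuit on the toric architecture.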

A.2 Depth of the General Method
Any stabiliser circuit has an equivalent skeleton circuit [38] (a circuit containing only generic two-qubit gates, with single-qubit gates ignored) which after routing on a surface architecture will have at worst $O(n)$ depth. The output of the general method for encoding a stabiliser code in fact already splits into layers of skeleton circuits. Stage 2 of the method applied to a CSS code has at worst $r(n - r)$ controlled Pauli gates $CP_{ij}$ with $i \in \{1, \dots, r\}$ and $j \in \{r + 1, \dots, n\}$, where $CP_{ij}$ is implemented before $CP_{i'j'}$ so long as $i < i'$. Stage 2 then takes the form of a skeleton circuit and as such the number of time steps needed is $O(n)$ for surface or linear nearest neighbour architectures. Stage 1 has at most $k(n - k - r)$ gates and also takes the form of a skeleton circuit. In the worst case scenario stage 2 includes, in addition to the $CP$ gates, controlled $Z$ gates $CZ$ with targets on the first $r$ qubits. As noted in errata for [23], $i > j$ for any of the additional $CZ_{ij}$ in stage 2. All $CZ_{ij}$ can then be commuted to time steps following all $CP$ gates, since each $CP$ in a time step following $CZ_{ij}$ takes the form $CP_{mn}$ with $n > m > i > j$. The circuit then splits into a layer of $CP$ gates and a layer of $CZ$ gates, each of which is a skeleton circuit, and so can be implemented in $O(n)$ time steps on surface and linear nearest neighbour architectures.

Figure 9: Encoding circuits for the L=2, L=3 and L=4 planar codes. Each edge corresponds to a qubit, each arrow denotes a CNOT gate pointing from control to target, and each filled black circle denotes a Hadamard gate applied at the beginning of the circuit. The colour of each CNOT gate corresponds to the time step it is implemented in, with blue, green, red, black, cyan and yellow CNOT gates corresponding to the first, second, third, fourth, fifth and sixth time steps respectively. The hollow circle in each of (a) and (b) denotes the initial unencoded qubit. The circuit in (c) encodes an L=4 planar code from an L=2 planar code, with solid edges denoting qubits initially encoded in the L=2 code.

B Additional planar encoding circuits
B.1 Planar base cases and rectangular code
In Figure 9 we provide encoding circuits for the L = 2, L = 3 and L = 4 planar codes, requiring 4, 6 and 8 time steps respectively. These encoding circuits are used as base cases for the planar encoding circuits described in Section 6. In Figure 10 we provide encoding circuits that either increase the width or height of a planar code by two, using three time steps.

B.2 Rotated Surface Code
In Figure 11 we demonstrate a circuit that encodes an L = 7 rotated surface code from a distance L = 5 rotated code. For a given distance L, the rotated surface code uses fewer physical qubits than the standard surface code to encode a logical qubit [5]. Considering a standard square lattice with qubits along the edges, a rotated code can be produced by removing qubits along the corners of the lattice boundary, leaving a diamond of qubits from the centre of the original lattice. The diagram in Figure 11 shows the resultant code, rotated 45° compared to the original planar code, and with each qubit now denoted by a vertex rather than an edge.

Figure 11: ... or $|0\rangle$ (green) state. The yellow squares denote a Z stabiliser on the four corner qubits, and the brown squares represent an X stabiliser on the four corner qubits. The rotated code has additional stabilisers acting on pairs of qubits along the edges. In the L = 5 code these are shown as a red arch (with Z and X stabilisers on the vertical and horizontal edges respectively), and the yellow and brown arches on the edge of the L = 7 code are Z and X stabilisers between the two edge qubits.

For a distance $L$ code the rotated surface code requires $L^2$ qubits compared to $L^2 + (L - 1)^2$ for the planar code. The encoding circuit in Figure 11 takes 4 steps to grow a rotated code from a distance L = 5 to L = 7. This is a fixed cost for any distance $L$ to $L + 2$. To produce a distance $L = 2m$ code this circuit would be applied repeatedly $m + O(1)$ times to an L = 2 or L = 3 base case, requiring a circuit of total depth $2L + O(1)$. The circuit in Figure 11 can be verified by using Equation (1) to see that a set of generators for the L = 5 rotated code (along with the single-qubit Z and X stabilisers of the ancillas) is mapped to a set of generators of the L = 7 rotated code, as well as seeing that the X and Z logicals of the L = 5 code map to the X and Z logicals of the L = 7 rotated code.
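The qubit counts and the growth cost quoted above are easy to tabulate. The sketch below is a small bookkeeping aid under the stated figures (4 time steps per increase of the distance by two); the function names are hypothetical, and the depth of the base case is deliberately left out since it only contributes the $O(1)$ term.

```python
# Bookkeeping sketch for the rotated-code figures quoted in the text.
# ASSUMPTION: function names are illustrative; the base-case depth is omitted.

def planar_qubits(distance):
    """Physical qubits in a distance-L planar code: L^2 + (L - 1)^2."""
    return distance ** 2 + (distance - 1) ** 2

def rotated_qubits(distance):
    """Physical qubits in a distance-L rotated code: L^2."""
    return distance ** 2

def growth_steps(start, target, steps_per_growth=4):
    """Time steps to grow a rotated code from distance start to target,
    increasing the distance by two per application of the circuit."""
    if target < start or (target - start) % 2 != 0:
        raise ValueError("target must exceed start by an even amount")
    return steps_per_growth * ((target - start) // 2)

for d in (3, 5, 7):
    print(d, rotated_qubits(d), planar_qubits(d))
print(growth_steps(5, 7))
print(growth_steps(3, 11))
```

Growing from a constant-size base case to distance $L$ takes roughly $L/2$ applications of four time steps each, which is where the $2L + O(1)$ total depth comes from.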

C Renormalisation Group encoder
C.1 Toric Code Encoder
Applying the Gottesman encoder to the toric code, as shown in Appendix A, and then enforcing locality using SWAP gates, gives the following encoding circuit for the L = 2 toric code that requires 10 time steps:

[Circuit diagram: local encoding circuit for the L = 2 toric code]

where the qubits are numbered 0 ... 7 from top to bottom. This circuit encodes the initial unknown qubit states $|\psi_0\rangle$ and $|\psi_1\rangle$ into logical states $|\bar{\psi}_0\rangle$ and $|\bar{\psi}_1\rangle$ of an L = 2 toric code with stabiliser group generators $X_0 X_1 X_2 X_6$, $X_0 X_1 X_3 X_7$, ...

Equipped with an L = 2 base code emulated as the central core of a 4 × 4 planar grid, where the surrounding qubits are initially decoupled $+1$ $Z$ eigenstates, one can apply the local routing methods of Appendix C.2 to obtain the initial configuration as depicted in Figure 12. The ancilla qubits are then initialised as $|0\rangle$ or $|+\rangle$ eigenstates as depicted in Figure 4(a) by means of Hadamard operations where necessary, before the circuit is implemented through the sequence of CNOT gates as depicted in Figure 4(a)-(c).
By recursive application of Equation (1), it is seen that the circuit forms the stabiliser structure of an L = 4 toric code on the planar architecture. Proceeding inductively, one can exploit the symmetry of a distance $L = 2^k$ toric code to embed it in the centre of a $2L \times 2L$ planar grid, "spread out" the core qubits in time linear in the distance, and ultimately perform the L = 2 → L = 4 circuit on each 4 × 4 sub-grid of the square tessellation.

C.2 Routing circuits for enforcing locality
To enforce locality in the Renormalisation Group encoder, which encodes a distance $L$ toric code, one can use SWAP gates to "spread out" the qubits between iteration $k$ and $k + 1$, such that all of the $O(1)$ time steps in iteration $k + 1$ are almost local on a $2^{k+1} \times 2^{k+1}$ region of the $L \times L$ torus. By almost local, we mean that the time step would be local if the $2^{k+1} \times 2^{k+1}$ region had periodic boundary conditions. Since at each iteration (until the final one) we use a region that is a subset of the torus, we in fact have a planar architecture (no periodic boundaries), and so it is not possible to simultaneously enforce locality in all of the $O(1)$ time steps in an iteration $k < \log L - 2$ of the RG encoder, which are collectively local on a toric architecture.

Figure 12: Initial outwards spreading of qubits in a distance 4 toric code to prepare for the encoding of a distance 8 toric code. Solid black and unfilled nodes represent the routed qubits of the distance 4 code, and the ancillae respectively. One then executes the subroutine of Figure 4 in each of the four 4x4 quadrants. This procedure generalises inductively for any targeted distance $2^k$ toric code.

Thus it is necessary to emulate a toric architecture on a planar one. In a time step in iteration $k$, this can be achieved by using $3(2^k - 1)$ time steps to move the top and bottom boundaries together (using SWAP gates) before applying any necessary gates which are now local (where the factor of three comes from the decomposition of a SWAP gate into 3 CNOT gates). Then $3(2^k - 1)$ time steps are required to move the boundaries back to their original positions. The identical procedure can be applied simultaneously to the left and right boundaries. Thus there is an overhead of $3(2^{k+1} - 2)$ to emulate a toric architecture with a planar architecture. Starting from L = 2 and ending on a size $L$ code gives an overall overhead to emulating the torus of $6 \sum_{i=1}^{\log_2(L) - 2} (2^{i+1} - 2) = 6L - 12 \log_2 L$, since from Figure 4 it can be seen that opposite edges need be made adjacent two times per iteration to enforce locality in it. Additionally, the time steps within each iteration must be implemented. Noticing that the red CNOT gates in Figure 4(b) can be applied simultaneously with the gates in Figure 4(a), this can be done in 6 time steps, leading to an additional $6 \log_2(L) - 6$ time steps in total in the RG encoder.
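The closed form for the torus-emulation overhead can be checked directly against the sum it is derived from. The sketch below assumes $L$ is a power of two with $L \geq 4$ and uses hypothetical function names; it also adds the $6\log_2(L) - 6$ time steps for the gates within the iterations, without including the routing that spreads the qubits out or the cost of the $L = 2$ base case.

```python
# Check of the overhead arithmetic for emulating a torus on a planar grid.
# ASSUMPTION: L is a power of two with L >= 4; function names are illustrative.

def log2_exact(L):
    """Base-2 logarithm of a power of two, computed with integers."""
    if L < 1 or L & (L - 1) != 0:
        raise ValueError("L must be a power of two")
    return L.bit_length() - 1

def emulation_overhead(L):
    """Sum of 6 * (2^(i+1) - 2) over iterations i = 1 .. log2(L) - 2."""
    return 6 * sum(2 ** (i + 1) - 2 for i in range(1, log2_exact(L) - 1))

def emulation_overhead_closed_form(L):
    """Closed form 6L - 12 log2(L) quoted in the text."""
    return 6 * L - 12 * log2_exact(L)

def iteration_gate_steps(L):
    """Time steps for the gates within the iterations: 6 log2(L) - 6."""
    return 6 * log2_exact(L) - 6

for L in (4, 8, 16, 32, 64):
    total = emulation_overhead(L) + iteration_gate_steps(L)
    print(L, emulation_overhead(L), emulation_overhead_closed_form(L), total)
```

Both contributions together give $6L - 6\log_2 L - 6$ time steps, so the leading cost remains linear in $L$.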
It is key to our routine to be able to "spread out" the qubits between each MERA step. We now show that this can be achieved in linear time by routing qubits through the planar grid. We first consider a single step of moving from a $2^k$ to a $2^{k+1}$ sized grid.
Our first observation is that while the qubits lie on the edges of our $2^k \times 2^k$ grid, one can subdivide this grid into one of dimensions $(2^{k+1} + 1) \times (2^{k+1} + 1)$, such that the qubits lie on corners of this new grid, labelled by their positions $(i, j)$ with the centre of the grid identified with $(0, 0)$. Under the taxicab metric we can measure the distance of qubits from the centre as $M_{i,j} := |(i, j)| = |i| + |j|$, and one can check that qubits only ever lie at odd values of this metric, essentially forming a series of concentric circles with $M_{i,j} = 2n + 1$, $n \in \mathbb{N}$. See Figure 13.
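The parity claim can be checked by enumeration. The sketch below builds the midpoints of all edges of a grid with $2^k$ cells per side, expressed in the coordinates of the subdivided $(2^{k+1} + 1) \times (2^{k+1} + 1)$ grid with the centre at $(0, 0)$, and confirms that every qubit sits at an odd taxicab distance. Treating the grid with open boundaries (so that all boundary edges are included) is an assumption made for simplicity, and the function names are hypothetical.

```python
# Enumeration check: edge qubits lie at odd taxicab distance from the centre.
# ASSUMPTION: open (planar) boundaries with every boundary edge included.

def edge_qubit_positions(k):
    """Midpoints of all edges of a grid with 2^k cells per side, in the
    coordinates of the subdivided grid (range -2^k .. 2^k on each axis)."""
    n = 2 ** k
    positions = []
    for a in range(n + 1):
        for b in range(n + 1):
            x, y = 2 * a - n, 2 * b - n     # original vertex
            if a < n:
                positions.append((x + 1, y))  # edge to the next vertex along x
            if b < n:
                positions.append((x, y + 1))  # edge to the next vertex along y
    return positions

def taxicab(position):
    """Taxicab distance M = |i| + |j| from the centre (0, 0)."""
    i, j = position
    return abs(i) + abs(j)

def shells(k):
    """Sorted list of distinct taxicab distances occupied by qubits."""
    return sorted({taxicab(p) for p in edge_qubit_positions(k)})

for k in (1, 2, 3):
    distances = [taxicab(p) for p in edge_qubit_positions(k)]
    print(k, len(distances), all(d % 2 == 1 for d in distances), shells(k))
```

Grouping the qubits by the value of $M_{i,j}$ gives the concentric shells along which the routing can be organised.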