Code-routing: a new attack on position verification

The cryptographic task of position verification attempts to verify one party's location in spacetime by exploiting constraints on quantum information and relativistic causality. A popular verification scheme known as $f$-routing involves requiring the prover to redirect a quantum system based on the value of a Boolean function $f$. Cheating strategies for the $f$-routing scheme require the prover to use pre-shared entanglement, and security of the scheme rests on assumptions about how much entanglement a prover can manipulate. Here, we give a new cheating strategy in which the quantum system is encoded into a secret-sharing scheme, and the authorization structure of the secret-sharing scheme is exploited to direct the system appropriately. This strategy completes the $f$-routing task using $O(SP_p(f))$ EPR pairs, where $SP_p(f)$ is the minimal size of a span program over the field $\mathbb{Z}_p$ computing $f$. This shows we can efficiently attack $f$-routing schemes whenever $f$ is in the complexity class $\text{Mod}_p\text{L}$, after allowing for local pre-processing. The best earlier construction achieved the class L, which is believed to be strictly inside of $\text{Mod}_p\text{L}$. We also show that the size of a quantum secret sharing scheme with indicator function $f_I$ upper bounds the entanglement cost of $f$-routing on the function $f_I$.


Background
In the cryptographic task of position verification [1,2], a prover (Alice) and verifier (Bob) interact to establish the spatial location of the prover. To do this, Bob issues Alice a challenge, which Bob believes can only be accomplished if Alice applies quantum or classical operations within the spacetime region of interest. The challenge is a relativistic quantum task [3], with quantum and classical systems input at one set of spacetime locations and another set of input and output systems returned at a second, later set of spacetime points.
We illustrate the typical position verification set-up in fig. 1a. At spacetime locations c 0 and c 1 , which are spatially separated but occur at the same time, inputs A 0 and A 1 are transmitted by Bob and sent towards the grey shaded region. Then, Alice should process those inputs in some way and return the output systems B 0 and B 1 to spacetime locations r 0 and r 1 . To complete the task, Alice can either act honestly or dishonestly. If behaving honestly, Alice enters the shaded spacetime region, receives both the inputs and locally acts on them, as shown in fig. 1b. If behaving dishonestly, Alice sends agents to either side of the grey region, intercepts both transmissions, and then acts in the non-local form shown in fig. 1c. This involves local actions on each side of the region, possibly making use of pre-shared randomness or entanglement, and a single, simultaneous round of communication; a computation performed in this form we call a non-local (quantum) computation. For a given choice of input state and transformation expected to be performed by Alice, acting in this non-local form may be sufficiently challenging so as to rule out this possibility. If so, then Bob has successfully verified that Alice acts within the specified region.
Suppose that the input and output systems are all classical. For concreteness, label the input string at c 0 by x, and the input string at c 1 by y. Then the outputs at r 0 and r 1 are some functions f 0 (x, y) and f 1 (x, y) of the input strings. It is straightforward to see [1] that in this fully classical case it is always possible for Alice to cheat by completing the relativistic task in the form shown in fig. 1c. To do so, the strategy is to copy the inputs x, y, then send one copy and keep the other so that x and y are both held at both output locations. Then, f 0 (x, y) is computed at r 0 and f 1 (x, y) at r 1 , completing the task.

Figure 1: (a) A relativistic quantum task. Time proceeds upwards in the diagram, and the horizontal direction is a spatial dimension. Light rays follow lines with slope ±1. Input systems A 0 and A 1 are received at spacetime locations c 0 and c 1 , respectively, and B 0 and B 1 should be returned at r 0 and r 1 , respectively. The inputs and outputs should be related by some designated channel N A0A1→B0B1 . Bob, who issues the challenge, wishes to choose the channel such that Alice is forced to do computations within the gray spacetime region. (b) Completing the task in a local form. The yellow circle represents a channel acting on input systems A 0 and A 1 , and producing output systems B 0 and B 1 . Alice acts within the gray region, corresponding to an honest strategy. (c) A computation happening in the non-local form. A 0 is interacted with the L system, and A 1 with the R system, where Ψ LR is entangled. Then, a round of communication is exchanged, and a second round of operations on each side is performed. All operations happen outside of the spacetime region, corresponding to a cheating strategy.

Unlike classical information, quantum information cannot be copied [4]. Inspired by this, [5,6] suggested using position verification schemes with quantum input and output systems. It was realized however that even in the quantum case all relativistic quantum tasks can be completed in the non-local, cheating form shown in fig. 1c, see [2,7,8]. This establishes that position verification cannot be made unconditionally secure, at least within the context of quantum mechanics in a fixed spacetime background and without placing assumptions on the entanglement available to an attacker.
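The classical copy attack above is simple enough to spell out in code. A minimal sketch (the helper names are ours, not the paper's; any Boolean functions f0, f1 will do):

```python
# Classical cheating strategy for a fully classical relativistic task:
# each agent copies its input, forwards the copy in the single
# simultaneous communication round, and then each output location can
# evaluate its required function locally, outside the shaded region.

def cheat(x, y, f0, f1):
    # Agent 0 holds x, agent 1 holds y. Each keeps the original and
    # sends a copy to the other side (classical data can be copied).
    agent0 = {"x": x, "y": y}  # agent 0 after receiving the copy of y
    agent1 = {"x": x, "y": y}  # agent 1 after receiving the copy of x
    # Each output location now evaluates its required function.
    b0 = f0(agent0["x"], agent0["y"])  # produced at r0
    b1 = f1(agent1["x"], agent1["y"])  # produced at r1
    return b0, b1

# Example: f0 is the AND of the first bits, f1 the XOR of the last bits.
f0 = lambda x, y: x[0] & y[0]
f1 = lambda x, y: x[-1] ^ y[-1]
print(cheat((1, 0), (1, 1), f0, f1))  # prints (1, 1)
```

This is exactly the step that fails quantumly: the copy in the first line of `cheat` has no analogue for an unknown quantum state.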
In the absence of unconditional security, we can look for assumptions under which the scheme may be considered secure. For some relativistic quantum tasks, it can be shown that all cheating strategies require large amounts of entanglement. Given this, one can introduce a security model that assumes a bounded amount of entanglement is shared, and then prove security of a position verification scheme by establishing that entanglement in excess of this bound is required to complete a given quantum task.
Ideally, the relativistic quantum task used in the context of position verification is easy to complete in the honest strategy, and as hard as possible to complete in the dishonest form. One well studied proposal is f -routing, which takes the following form. At c 0 , a quantum system Q of dimension d is given, along with a classical string x of length n. At c 1 , a classical string y of length n is given. As an output, Alice is required to return system Q at $r_{f(x,y)}$, where f is some fixed function mapping strings of length 2n to bits. Notice that to complete the f -routing task honestly Alice can bring Q, x and y into the spacetime region, compute f , then redirect Q based on the output. Thus the quantum part of the strategy is almost trivial.
Recently Bluhm, Christandl, and Speelman [9] proved the following statement. Pick a random function f . Then with high probability, any cheating strategy to complete the corresponding f -routing task requires a shared resource system with a dimension that grows with n. Thus by increasing n, the honest strategy involves a larger classical computation, but the dishonest strategy involves manipulating larger quantum systems. Assuming classical computations are "easier" in some appropriate sense than storing quantum systems, we can establish security of the scheme.
Entanglement cost in the f -routing task exhibits an interesting relationship to classical complexity theory. One interesting attack on f -routing is the "garden-hose" protocol [10,11,12]. In that protocol, the number of EPR pairs needed to perform f -routing non-locally, call it GH(f ), is related to the memory cost of computing f on a Turing machine.

$\mathrm{GH}(f) \leq 2^{O(\mathrm{SPACE}^{(2)}(f))}, \qquad \mathrm{SPACE}^{(2)}(f) \equiv \min_{\alpha,\beta,M}\{\, \mathrm{SPACE}(M) : M(\alpha(x), \beta(y)) = f(x,y) \,\},$ where $\mathrm{SPACE}(M)$ is the memory used by the Turing machine $M$, and the minimum is taken over Turing machines $M$ and local functions $\alpha$, $\beta$ satisfying SPACE(M ).
We note here that α and β are arbitrary functions; they appear because Alice may locally manipulate her input strings before beginning a protocol. We refer to application of these functions as pre-processing.
This connection between the garden-hose model and complexity theory is also constructive: an algorithm for computing f can be turned into a non-local computation using $2^{O(\mathrm{SPACE}^{(2)}(f))}$ entanglement, and a non-local computation in the garden-hose model can be turned into an algorithm for computing f , with memory cost given by log GH(f ). This connection also suggests proving strong lower bounds on entanglement in f -routing should be challenging, as we would obtain lower bounds on space complexity as a consequence.
The class of functions that can be implemented efficiently using the garden-hose protocol is related to L, those functions that can be computed in log-space. However, the appearance of pre-processing means the efficiently computable functions are instead given by the class $L^{(2)}$, defined as follows,

$L^{(2)} = \{ f : f(x,y) = g(\alpha(x), \beta(y)) \text{ for some functions } \alpha, \beta \text{ and some } g \in L \}.$

Note that here L denotes the class of functions computable in space logarithmic in n, the length of the strings x and y (not the length of α(x) and β(y)). This is the class of functions for which we can complete the f -routing task non-locally using polynomial entanglement within the garden-hose protocol. We can analogously define the class $P^{(2)}$, polynomial time when allowing pre-processing, where the P inside the definition refers to functions with runtime polynomial in n, the length of x and y.

One consequence of the garden-hose protocol's connection to complexity theory is that certain explicit entanglement lower bounds are expected to be hard to prove. For example, given a function $f \in P^{(2)}$, if one showed f requires super-polynomial entanglement, then we would learn that $L^{(2)} \subsetneq P^{(2)}$. Since from the definitions above L = P implies $L^{(2)} = P^{(2)}$, we have that $L^{(2)} \subsetneq P^{(2)}$ implies L ⊊ P. Proving that L ⊊ P, however, is a longstanding and difficult problem in computer science.

Recently, a relationship between position-based cryptography and quantum gravity has been highlighted [13,14]. As we discuss further in [15], in that context there is a tentative expectation coming from the quantum gravity side that entanglement cost in non-local computation should be related to the complexity of the corresponding local computation. From this perspective, the complexity-entanglement relationship exhibited in the garden-hose protocol is especially interesting, and we were motivated to further study f -routing and its relationship to complexity due to that connection.
The possible relationship between complexity and entanglement in non-local computation is also of practical interest in the context of position verification. For instance, consider the security setting in which we assume an attacker has bounded entanglement, but do not otherwise restrict their resources. In this setting we are interested in functions which require large entanglement to implement non-locally. At the same time, the geometry of a position-verification scenario requires the computation be implementable quickly when performed locally. If the function f has exponential complexity, the honest party may not be able to compute it within the needed amount of time. Because it uses a randomly chosen (and hence high complexity) function, the Bluhm, Christandl, and Speelman result [9] faces this obstruction to realizing a practical and secure position verification setting. For this reason, it is important to understand the entanglement cost for implementing low-complexity functions.

Summary of results
In this paper we give a new strategy for completing the f -routing task non-locally, which we call "code-routing". The basic strategy of the protocol is to encode the input system Q into a quantum secret sharing scheme whose access structure is related to the function f . The shares of the scheme are then routed on simple functions of single input bits. Compared to the existing garden-hose protocol, code-routing uses no more entanglement, and probably less. To understand why, we make use of a connection between the code-routing strategy and complexity theory. We also use the code-routing strategy to establish a new relationship between entanglement cost in f -routing and the size of quantum secret sharing schemes. Throughout the work, we work with p-dimensional quantum systems, which we call 'qupits', with p any prime. Calling the minimal entanglement required to f -route E(f ), we show

$E(f) = O(SP_p^{(2)}(f)), \qquad SP_p^{(2)}(f) \equiv \min_{\alpha,\beta,M}\{\, SP_p(M) : M(\alpha(x), \beta(y)) = f(x,y) \,\},$

where $SP_p(M)$ is the minimal size of a span program over the field $\mathbb{Z}_p$ that computes M. The complexity class of functions that can be computed with polynomial-sized span programs is $\text{Mod}_p\text{L}$ (see section 3.1 for a definition), so that here the class of functions for which we can perform f -routing using polynomial entanglement is $\text{Mod}_p\text{L}^{(2)}$, where again the added superscript accounts for performing local pre-processing of the inputs.

To understand the relationship between entanglement cost in the garden-hose protocol and code-routing, we note first that $L \subseteq \text{Mod}_p\text{L}$, and consequently $L^{(2)} \subseteq \text{Mod}_p\text{L}^{(2)}$. Thus, we can perform f -routing efficiently for at least those functions that can be efficiently performed in the garden-hose protocol. Further, it is believed that $L \subsetneq \text{Mod}_p\text{L}$. We recall the evidence for this in section 3.1. Consequently, in considering the classes $L^{(2)}$ and $\text{Mod}_p\text{L}^{(2)}$, a strictly larger class of functions can be used to compute the non-local part of f . We believe that as a consequence $L^{(2)} \subsetneq \text{Mod}_p\text{L}^{(2)}$. We explain our intuition for this but cannot show it.
A further consequence of our protocol is a relationship between the size of quantum secret sharing schemes and entanglement requirements in f -routing. In particular, a quantum secret sharing scheme records a secret, Q, into a set of shares {v 1 , ..., v n } such that some subsets recover Q and others reveal nothing about it. The size of a secret sharing scheme is the sum of the log dimension of all the shares. The structure of the scheme is captured by the indicator function, which is defined as a map from subsets of shares to bits, and is 0 when the subset reveals nothing about the secret and 1 when the subset reveals the secret. Ideally, one constructs a secret sharing scheme with as small a size as possible for a given indicator function.
When considering f -routing tasks where f can be realized as an indicator function, we build a code-routing scheme that shows the entanglement requirement E(f ) is upper bounded by the size of any secret sharing scheme with f as its indicator function. This can also be understood as a constraint on the size of secret sharing schemes.
It is also interesting to ask if $\text{Mod}_p\text{L}^{(2)}$ is the largest class of functions that can be completed using code-routing protocols with polynomial entanglement. Our protocol that achieves this is a special case of the most general possible code-routing construction, in particular it restricts to a class of secret sharing schemes constructed by Smith [16]. Assuming only those codes are used, and under further constraints on the protocol, we give some partial converse results. For code-routing protocols where Smith codes are used, we can show their complexity is within $P^{(2)}$. When restricting to protocols that concatenate Smith codes to only O(1) depth, we show their complexity is within $\text{Mod}_p\text{L}^{(2)}$. For code-routing protocols using arbitrary codes with O(1) shares, we show their complexity is within $L^{(2)}$. Throughout, we have to assume that a certain measure of the size of the protocol is related polynomially to the entanglement used. These results eliminate some directions in which one can try to use a code-routing protocol to perform f -routing on functions of larger complexity, and highlight the remaining possibilities.

To describe the f -routing task, it will be helpful to consider Alice, who carries out the protocol, to be an agency with several agents. Alice's agents co-operate with one another to complete the task. Similarly, Bob is an agency with several agents, who may move through spacetime along different trajectories. For convenience, we will say for example that Bob gives Alice system A at spacetime location c 0 . Somewhat more precisely, this means that an agent of Bob's, who is located at c 0 , gives an agent of Alice's the system A.
The routing task is defined as follows.
Definition 1 An f -routing task is defined by a Boolean function f : {0, 1} 2n → {0, 1}. The task is carried out by two agencies, Alice and Bob. At spacetime location c 0 Bob gives Alice a quantum system Q and a classical string x of length n. At spacetime location c 1 Bob gives Alice a string y. Strings x and y are drawn from the uniform distribution, while Q is in a maximally entangled state $\Psi^+_{Q\hat{Q}}$ with reference system $\hat{Q}$ held by Bob. Alice returns a quantum system B 0 at location r 0 and B 1 at r 1 . Bob measures $\hat{Q}B_{f(x,y)}$ to test if it is in the state $\Psi^+$, and Alice completes the task successfully if the test succeeds.

When convenient, we will refer to Alice's agent at c 0 as Alice 0 , and Alice's agent at c 1 as Alice 1 . As well, it is sometimes convenient to refer to c 0 and r 0 together collectively as 'the left' and c 1 and r 1 together as 'the right'.
To complete a routing task, the simplest strategy is to bring x, y and Q together, compute the function f , and then direct Q based on the result of the computation. To use an f -routing task to verify if Alice performs non-trivial operations within a spacetime region R, the points c 0 , c 1 , r 0 , r 1 should be arranged such that performing this local strategy requires entering R. In particular, we define the region

$J_{01\to 01} \equiv J^+(c_0) \cap J^+(c_1) \cap J^-(r_0) \cap J^-(r_1).$

Here $J^+(p)$ is the future light cone of p, meaning the set of all points q such that information can travel from p to q without moving faster than light, and $J^-(p)$ is the past light cone of p, meaning the set of all points q such that one can travel from q to p without travelling faster than the speed of light. This is the region in which the inputs to the local computation of f are available, and the outputs from the computation can still reach the output points. Consequently, we choose c 0 , c 1 , r 0 , r 1 such that $J_{01\to 01} \subseteq R$ when we wish to verify Alice can perform computations within R.

To perform the routing task non-locally, the best known strategy is the garden-hose protocol [10]. It involves sharing EPR pairs between c 0 and c 1 , then doing a set of Bell measurements on pairs of entangled particles. Which measurements are performed depends on the values of the strings x and y. The measurement outcomes are then communicated to both of the output locations. If the mappings from strings x, y to a set of measurements on both sides are chosen correctly, it will be possible to recover the system Q at $r_{f(x,y)}$. We give simple examples of computing a NOT and AND function in fig. 3. As discussed in the introduction, the entanglement cost of completing the f -routing task using the garden-hose protocol is controlled by the space complexity of f .
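The garden-hose protocol can be pictured with literal pipes: shared EPR pairs are pipes running between the two sides, Bell measurements connect pipe ends, and the side where the "water" spills out of an open pipe is the side that ends up with Q. The sketch below traces this picture classically; the three-pipe AND strategy shown is an illustrative choice of ours, not necessarily the one in fig. 3.

```python
def garden_hose(tap_pipe, alice_links, bob_links):
    """Trace the water through the pipes (EPR pairs).

    alice_links / bob_links are symmetric matchings on pipe labels,
    representing which pipe ends each side connects via Bell
    measurements. Returns the side ("alice" or "bob") where the water
    spills out, i.e. the side that ends up holding the routed system Q.
    """
    side, pipe = "alice", tap_pipe
    while True:
        # Water flows through the current pipe to the opposite side.
        side = "bob" if side == "alice" else "alice"
        links = bob_links if side == "bob" else alice_links
        if pipe not in links:
            return side          # open pipe end: the water spills out here
        pipe = links[pipe]       # connected to another pipe, flow onward

def and_route(x, y):
    """A 3-pipe garden-hose strategy for f(x, y) = AND(x, y)."""
    tap = 1 if x else 2                              # Alice's tap placement
    alice = {}                                       # Alice connects nothing else
    bob = {1: 2, 2: 1} if y == 0 else {2: 3, 3: 2}   # Bob's connections
    return 1 if garden_hose(tap, alice, bob) == "bob" else 0

for x in (0, 1):
    for y in (0, 1):
        assert and_route(x, y) == (x & y)
```

The number of pipes used (here three) is the garden-hose cost GH(f), and equals the number of EPR pairs consumed by the non-local protocol.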
Another possible attack is given in [8]. This attack also works for arbitrary quantum tasks. Applied to f -routing, it has exponential in n entanglement cost for any choice of function f .

Code-routing protocols
Error-correcting codes are a standard tool appearing throughout quantum information theory; here we consider their use in performing the f -routing task. Because only two parties (an agent on the left and one on the right) are involved in a non-local computation, it is unclear a priori why error-correcting codes should be related to non-local computation. However, we are motivated to consider them because of a recent connection [13] between non-local quantum computation and the AdS/CFT correspondence [17,18]. Error-correction plays an important role in the AdS/CFT correspondence, suggesting a connection between non-local computation and error-correction. We study a family of f -routing protocols that exploit error correction, which we call code-routing protocols. After giving the general form of any such protocol, we discuss a particular class of codes that expands the set of computations performable using polynomial entanglement to $\text{Mod}_p\text{L}^{(2)}$, a complexity class which is known to be at least as large as $L^{(2)}$, and is probably larger.
The basic structure of a code-routing protocol involves recording Q into an error-correcting code, then sending the shares of that code to the left or right based on the input variables. We can also carry out garden-hose type strategies on individual shares, or record those shares into subsequent codes, including choosing which encoding to use based on the input variables.
The simplest example of a code-routing protocol, which we will use as a subroutine in subsequent constructions, is 'unit-routing'. The functionality of the unit-routing protocol is to send a share v i to the side labelled by a bit z j . We explain how to perform the unit-routing protocol in fig. 4.
We describe the most general form of a code-routing protocol below.

Definition 2 Code-routing protocol: A code-routing protocol is defined by two maps C 0 [x] and C 1 [y], each mapping from input strings of length n to a tuple,

$C_0[x] = (S_1, \ldots, S_{k_0}), \qquad C_1[y] = (S_1', \ldots, S_{k_1}').$

The combined outputs we refer to as the protocol tape. Each $S_i$ corresponds to one encoding, teleportation, or 'unit-routing' of a local share. We denote it as a tuple $S_i = (v_i, \{w_i^j\}_j, T_i)$, with $v_i$ a label for an input share, $\{w_i^j\}_j$ a set of output shares, and $T_i$ a description of an encoding, teleportation, or 'unit-routing'. Define $n_i = |\{w_i^j\}_j|$ to be the number of output shares associated with $S_i$. Then:

• When $n_i = 0$, $T_i$ describes a unit-routing or keep/send instruction. For a unit-routing, $T_i$ will be the label of a single bit of a(x) or b(y), or its negation. For a keep/send instruction, $T_i$ will be a 0 or 1 indicating that the share should be brought to r 0 or r 1 .
• When $n_i = 1$, $T_i$ will be empty, and the tuple $(v_i, w_i^0, \emptyset)$ describes a teleportation from the $v_i$ system onto the $w_i^0$ system.

• When $n_i > 1$, $T_i$ describes an encoding into an error-correcting code, with the $w_i^j$ systems the output systems of the encoding procedure.

Figure 4: Illustration of the unit-routing protocol. The effect of the protocol is to bring the share v to the side labelled by the input bit. a) For an input bit z = x i held by Alice 0 , who holds share v, the share is sent to c 1 during the communication round iff z = 1. b) With v at c 0 but input bit z = y j held at c 1 , the share is first measured in the Bell basis with one end of an EPR pair that has been shared between c 0 and c 1 . After the measurement, the system at c 1 holds the information in v up to a Pauli correction. Call this system v′. During the communication round, Alice 1 sends v′ to c 0 if z = 0, and keeps it if z = 1. Simultaneously, Alice 0 maintains a copy of her measurement outcome and sends a copy to the right. On whichever side v′ has been brought to, the local agent can undo the Pauli correction and recover v. Notice that the bit z could also be the NOT of one of the input bits received by Alice 0 or Alice 1 . Similar protocols are used when v is held by Alice 1 .
Alice 0 and Alice 1 carry out the code-routing protocol by computing C 0 [x] and C 1 [y], then encoding, teleporting, or unit-routing each share according to the pattern described by the protocol tape.
Code-routing includes the garden-hose protocol as a special case: if no systems are put into codes, the remaining protocol amounts to a set of choices about which pairs of entangled systems should be measured in the Bell basis, as in the garden-hose protocol. This shows code-routing uses at most as much entanglement as the garden-hose. More generally, including non-trivial encodings allows a larger class of strategies.
To understand code-routing, it will be helpful to begin with simple examples and build up to more elaborate constructions. Some basic examples of code-routing protocols are shown in fig. 5. There, we f -route on the AND and OR functions using an erasure code on 3 shares that corrects one erasure error. The protocols for AND and OR given here can be compared to the garden-hose strategies for computing the same functions in fig. 3.
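A classical stand-in for the quantum erasure code makes these examples concrete: a 2-of-3 Shamir scheme over $\mathbb{Z}_5$ has the same access structure (any two shares recover the secret, any single share reveals nothing). The routing pattern below, with v1 routed on x, v2 on y, v3 kept left for AND and always sent right for OR, is our reading of the fig. 5 protocols, not a transcription of them.

```python
import random

P = 5  # share arithmetic is over Z_5

def share(secret):
    """2-of-3 Shamir sharing: shares are points on a random line over Z_P."""
    a = random.randrange(P)
    return {i: (secret + a * i) % P for i in (1, 2, 3)}

def reconstruct(shares):
    """Lagrange interpolation at 0 from any two shares (i, s_i)."""
    (i, si), (j, sj) = list(shares.items())[:2]
    li = (j * pow(j - i, -1, P)) % P   # Lagrange coefficient at x = 0
    lj = (i * pow(i - j, -1, P)) % P
    return (si * li + sj * lj) % P

def route(secret, x, y, func):
    """Route shares left/right on single input bits; func is AND or OR."""
    shares = share(secret)
    right = {}
    if x: right[1] = shares.pop(1)             # v1 unit-routed on x
    if y: right[2] = shares.pop(2)             # v2 unit-routed on y
    if func == "OR": right[3] = shares.pop(3)  # v3 always sent right
    left = shares                              # everything else stays left
    winner = right if len(right) >= 2 else left  # authorized = two shares
    return ("right" if winner is right else "left"), reconstruct(winner)

for x in (0, 1):
    for y in (0, 1):
        side, s = route(4, x, y, "AND")
        assert s == 4 and side == ("right" if x & y else "left")
        side, s = route(4, x, y, "OR")
        assert s == 4 and side == ("right" if x | y else "left")
```

In the quantum protocol the same bookkeeping applies, with recovery on the unauthorized side guaranteed by decoupling rather than by interpolation.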
One convenient property of the code-routing strategy is that composition of functions is implemented in a simple way. To see this, consider a simple example, which is easy to generalize. Consider the function f (x, y) = AND(NOT(x), OR(x, y)). To execute this in a code-routing protocol, one can use the code shown in fig. 5c. Notice that we concatenate codes according to the pattern given by the Boolean formula for the function f (x, y). This generalizes to any Boolean formula, although we must use De Morgan's laws to move the NOT gates to the input layer. This shows that the entanglement cost for code-routing on a function f is bounded above by the formula size of f , where by formula size we mean the number of inputs to the formula, counted with repetition.

Building on the AND and OR examples, we can replace the simple threshold code with other, more structured examples. An interesting class of examples is constructed from quantum secret sharing schemes, which we review briefly before describing the protocol.
A quantum secret sharing scheme is a quantum error-correcting code with the additional feature that collections of subsystems are either authorized, meaning they can be used to recover the encoded state, or unauthorized, meaning they reveal no information about the state. The set of authorized sets for a given secret sharing scheme is known as its access structure.

Figure 5: Some simple code-routing protocols. The map E takes in the Q system and records it into a 3 share secret sharing scheme where any 2 shares recover the secret. In the protocol, Alice 0 , who initially holds Q, performs the encoding map E. The lower boxes indicate the unit-routing protocol should be implemented on the attached shares. a) Code-routing protocol for computing AND(x, y). The protocol tape describing this protocol consists of the tuples […]. b) Code-routing protocol for computing OR(x, y). c) Code-routing protocol for computing AND(NOT(x), OR(x, y)). This method of concatenating codes generalizes to arbitrary Boolean formulas. The entanglement cost is bounded above by the formula size.

Call the shares produced by the secret sharing scheme $\{v_i\}_i$. Then the scheme's access structure defines a corresponding indicator function $f_I$ according to

$f_I(z) = 1$ if $\{v_i : z_i = 1\}$ is authorized, and $f_I(z) = 0$ if it is unauthorized.

All valid indicator functions satisfy two constraints. First, the no-cloning theorem implies no two disjoint subsets can recover the state. At the level of the indicator function, this is expressed as $f_I(z) = 1 \implies f_I(\bar{z}) = 0$, where $\bar{z}$ denotes the bitwise negation of z. Second, adding additional shares to a set never prevents recovery, which implies $f_I(z)$ is monotone. In [19], it was shown that whenever an indicator function satisfies the no-cloning constraint and is monotone, it is possible to construct a corresponding quantum secret sharing scheme. Finally, define the size of a quantum secret sharing scheme to be the sum of the log dimension of all the shares. For shares built from qubits, this is the total number of qubits the secret is encoded into. Using a code-routing protocol based on a single encoding of Q into a quantum secret sharing scheme, we can prove the following theorem.
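As an aside, the two validity constraints on indicator functions are easy to check mechanically for small n. A sketch, using threshold access structures as the example:

```python
from itertools import product

def is_valid_indicator(f, n):
    """Check the no-cloning and monotonicity constraints on f over n shares."""
    for z in product((0, 1), repeat=n):
        if f(z):
            # No-cloning: a set and its complement can never both recover Q.
            zbar = tuple(1 - b for b in z)
            if f(zbar):
                return False
            # Monotonicity: adding a share must never lose authorization.
            for i in range(n):
                if z[i] == 0 and not f(z[:i] + (1,) + z[i + 1:]):
                    return False
    return True

def threshold(k):
    """k-out-of-n threshold access structure: authorized iff >= k shares."""
    return lambda z: int(sum(z) >= k)

# 2-of-3 is valid: two disjoint authorized sets would need 4 of 3 shares.
assert is_valid_indicator(threshold(2), 3)
# 1-of-3 violates no-cloning: {v1} and {v2, v3} would both recover Q.
assert not is_valid_indicator(threshold(1), 3)
```

For threshold schemes the no-cloning constraint is exactly the familiar statement that quantum threshold codes need k > n/2.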
Theorem 3 Consider an f -routing task where f is a valid indicator function. Then the entanglement cost of completing the routing task for f is upper bounded by the size of any quantum secret sharing scheme that has f as its indicator function.
Proof. Construct an f -routing protocol as follows. On the left, record Q into a quantum secret sharing scheme with shares {v 1 , ..., v 2n } and indicator function f (x, y). In particular, use the isometric extension of the encoding map, and have Alice 0 hold the purifying system R. Then, for 1 ≤ i ≤ n carry out the unit-routing protocol on each share v i with x i as input. For n + 1 ≤ i ≤ 2n unit-route share v i on y i . We will show this procedure correctly completes the f -routing task, and has entanglement cost upper bounded by the size of any secret sharing scheme with indicator f .
For correctness, notice that by construction Alice 1 obtains the set of shares $K(x, y) \equiv \{v_i : z_i = 1\}$, where z = xy denotes the concatenation of the input strings, and Alice 0 holds the purification of K(x, y) (consisting of the remaining shares plus R). If f (x, y) = 1, by construction we have that K(x, y) is authorized, so Alice 1 recovers Q, which is correct. If f (x, y) = 0, Alice 1 receives an unauthorized set of shares. This ensures all systems held by Alice 1 reveal nothing about Q. Since Alice 0 holds the purifying system, by decoupling [20,21] we have that Alice 0 can recover Q. Again this is correct.
To understand the entanglement cost of this protocol, notice that the unit-routing of share $v_i$ for i > n requires $\log_2 d_{v_i}$ EPR pairs, where $d_{v_i}$ is the dimension of share $v_i$. Unit-routing on shares $v_i$ for i ≤ n has no entanglement cost, since the needed bits $x_i$ are held locally. The total entanglement cost is just the entanglement cost of all the unit-routings, giving

$E(f) \leq \sum_{i=n+1}^{2n} \log_2 d_{v_i} \leq \sum_{i=1}^{2n} \log_2 d_{v_i}.$

The right hand side is just the size of the secret sharing scheme used, so we are done.

For a given indicator function, the most efficient known quantum secret sharing scheme is the one due to Smith [16]. In particular, Smith's scheme has size $O(mSP_p(f))$, where $mSP_p(f)$ is the size of a monotone span program over $\mathbb{Z}_p$ that computes f . We define span programs in appendix A. This shows that for indicator functions the entanglement cost of f -routing is upper bounded by monotone span program size.
Next, we continue to progress towards more elaborate code-routing protocols, which will allow us to do code-routing for arbitrary functions, not just indicator functions. In particular, we will introduce unit-routings that direct a share based on the negation of one of the input bits, rather than an input bit directly, which will allow us to route on non-monotone functions. As well, we will route on functions which violate the no-cloning property by realizing them as restrictions of functions which do have the no-cloning property. Combining these tools we prove the following theorem.
Theorem 4 Using a code-routing protocol, the routing task can be completed for any function f using a resource state consisting of $O(SP_p^{(2)}(f))$ maximally entangled pairs of qupits, where $SP_p^{(2)}(f) \equiv \min_{\alpha,\beta,h}\{\, SP_p(h) : h(\alpha(x), \beta(y)) = f(x,y) \,\}$ and $SP_p(h)$ is the size of the smallest span program over the field $\mathbb{Z}_p$ computing h.
In the next section we show that span program size is no larger than the entanglement cost in the garden-hose protocol, and given some complexity theoretic assumptions is smaller.
Towards proving this theorem, we build a routing protocol in the following way. We show that f can be expressed as $f(x, y) = h_I(g(z, 1))$ with $z = (\alpha(x), \beta(y))$, where $g(z, b) = (z_1, \neg z_1, \ldots, z_m, \neg z_m, b)$ and $h_I$ is an indicator function. We state this in the next lemma.

Lemma 5 For every function $h : \{0,1\}^m \to \{0,1\}$ there exist functions $h_I$ and g such that $h(z) = h_I(g(z, 1))$, where:

• $h_I$ is a valid indicator function

• g acts on the first m bits of its input by copying each bit $z_i$ and negating one copy

• $mSP_p(h_I) \leq SP_p(h) + 1$, where $SP_p(h)$ denotes the minimal size of a span program over $\mathbb{Z}_p$ computing h, and $mSP_p(h)$ the size of a monotone span program computing h.
We prove this lemma in appendix B.
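The map g is just a dual-rail encoding: each input bit is copied, one copy is negated, and a constant bit is appended. A sketch, assuming this reading of the lemma:

```python
def g(z, b=1):
    """Dual-rail map: (z_1, ..., z_m, b) -> (z_1, !z_1, ..., z_m, !z_m, b)."""
    out = []
    for bit in z:
        out.extend([bit, 1 - bit])  # copy each bit and negate one copy
    return tuple(out) + (b,)

w = g((1, 0, 1))
assert w == (1, 0, 0, 1, 1, 0, 1)
# Exactly one rail of each pair is on, plus the constant bit, so exactly
# m + 1 of the 2m + 1 shares go right, whatever the input z is.
assert sum(w) == len(w) // 2 + 1
```

Routing share i on bit $g(z)_i$ then means a share goes right exactly when its associated literal ($z_i$ or $\neg z_i$) is satisfied, which is what lets a monotone indicator function simulate a non-monotone h.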
Using this lemma, we are ready to prove theorem 4.
Proof. (Of theorem 4) Let the function we will perform the routing task on be f (x, y). We can first allow Alice 0 and Alice 1 to apply local functions to their strings x and y, producing new strings α(x) and β(y). These are chosen, along with a function h, such that f (x, y) = h(α(x), β(y)). Using lemma 5, we write $h(z) = h_I(g(z, 1))$, with $h_I$ a valid indicator function, and g mapping m + 1 bits to 2m + 1 bits. For the indicator function $h_I$, use the construction in Ref. [16] to find an encoding map $E_{Q \to V}$ which prepares a secret sharing scheme with access structure corresponding to $h_I$.
The protocol is as follows. After receiving Q, Alice 0 applies the isometric extension of the encoding channel, call it $V^E_{Q \to VE}$. This produces output systems $v_i$, 1 ≤ i ≤ 2m + 1, and E. The environment system E is retained by Alice 0 . Then, Alice 0 and Alice 1 carry out the unit-routing protocol (see fig. 4) to bring share $v_i$ to Alice$_{g(z)_i}$, where by $g(z)_i$ we mean the ith bit of g(z). Note that we always take the final bit b = 1 and g to always act trivially on this bit, so that share $v_{2m+1}$ is always sent to Alice 1 .
Next we verify that this protocol works correctly, in that Q will be recovered on Alice$_{f(z)}$'s side. Consider that Alice 1 holds all those shares $v_i$ such that $g(z)_i = 1$. If this is an authorized set, she will be able to recover Q. By design, this occurs exactly when $h_I(g(z, 1)) = 1$, and by construction $h_I(g(z, 1)) = h(z)$, so this is correct. Alternatively, if the set of shares $v_i$ such that $g(z)_i = 1$ is unauthorized, then Alice 1 's systems reveal nothing about the encoded state. Because Alice 0 performed the encoding procedure isometrically and retained the environment, decoupling ensures that Alice 0 can now recover the state. This occurs exactly when $h_I(g(z, 1)) = 0$, so h(z) = 0, and again this is correct.
Finally we determine the entanglement cost of performing this protocol. All the entanglement use occurs in teleporting shares v i , 2|α| < i ≤ 2m, from Alice 0 to Alice 1 , which occurs as part of the unit-routing protocol. The required entanglement depends on the size of the shares v i , which in turn depends on the details of the secret-sharing scheme construction. Specifically, the protocol can be performed using no more than the total share size of the scheme in maximally entangled pairs of qupits; for the construction of Ref. [16], this is at most 2 mSP p (h I ) + 1. From lemma 5 we have also that mSP p (h I ) ≤ SP p (h) + 1, completing the proof.
3 Entanglement and complexity in code-routing

Lower bounds on efficiently achievable complexity
In the last section we saw that the code-based protocol can carry out a routing task using at most O(SP p, (2) (f )) maximally entangled pairs of qupits, where SP p, (2) (f ) is the minimal size of a span program over Z p (with p prime) that computes the non-local part of f . To capture the set of functions that can be performed using reasonable amounts of entanglement with this strategy, we define the following complexity classes.
Definition 6 For prime p, PSP p is the set of families of functions f n : {0, 1} n → {0, 1} that can be computed using span programs over the field Z p of size polynomial in n.
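To make Definition 6 concrete, the sketch below evaluates a span program over Z p by Gaussian elimination: each row of a matrix M is labelled by an input literal, and the program accepts exactly when the target vector lies in the Z p -span of the rows made available by the input. The particular matrix, labels, and data layout are illustrative assumptions, not notation from this paper.

```python
from itertools import product

def rank_mod_p(mat, p):
    """Row rank over the field Z_p, via Gaussian elimination."""
    mat = [[v % p for v in row] for row in mat]
    rank = 0
    ncols = len(mat[0]) if mat else 0
    for col in range(ncols):
        piv = next((r for r in range(rank, len(mat)) if mat[r][col]), None)
        if piv is None:
            continue
        mat[rank], mat[piv] = mat[piv], mat[rank]
        inv = pow(mat[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        mat[rank] = [(v * inv) % p for v in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][col]:
                c = mat[r][col]
                mat[r] = [(a - c * b) % p for a, b in zip(mat[r], mat[rank])]
        rank += 1
    return rank

def in_span_mod_p(rows, target, p):
    """Does `target` lie in the Z_p-span of `rows`?"""
    rows = [list(r) for r in rows]
    if not rows:
        return all(v % p == 0 for v in target)
    return rank_mod_p(rows, p) == rank_mod_p(rows + [list(target)], p)

def eval_span_program(M, labels, target, x, p):
    """labels[i] = (j, eps): row i of M is available exactly when x[j] == eps."""
    avail = [row for row, (j, eps) in zip(M, labels) if x[j] == eps]
    return 1 if in_span_mod_p(avail, target, p) else 0

# Hypothetical example: AND of two bits as a span program over Z_2.
M = [(1, 0), (0, 1)]          # one row per input literal
labels = [(0, 1), (1, 1)]     # row 0 needs x_0 = 1, row 1 needs x_1 = 1
for x in product([0, 1], repeat=2):
    assert eval_span_program(M, labels, (1, 1), x, 2) == (x[0] & x[1])
```

The size of this program is its number of rows, matching the size measure used in Definition 6.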
Theorem 4 establishes that the routing task can be completed with polynomially many EPR pairs for a function family {f n } at least when it is in the class PSP p, (2) , for any prime p. This gives that the class of functions efficiently implementable in the code-routing strategy is at least ∪ prime p PSP p, (2) . We are interested in the relationship between this class and L (2) , which is the class of functions that can be computed non-locally in the garden-hose model (the most efficient previously known protocol) with polynomial entanglement. In the next two sections we give evidence that L (2) ⊊ PSP p, (2) , so that code-routing improves on the garden-hose model. 10

L and PSP p
We will start by considering the classes without local pre-processing of the inputs, L and PSP p . It is believed that L ⊊ PSP p . To understand why, we first need to introduce a few related complexity classes, NL, UL, and Mod p L.
To understand these classes, recall the notion of a non-deterministic Turing machine. Such a machine may, at each step, choose to follow one or more computational paths. For a "yes" instance, we just require that at least one of these paths be accepted. This contrasts with a deterministic machine, which follows exactly one path. For example, consider the directed graph connectivity problem:

DAG
• Input: A directed acyclic graph G, and a designation of two nodes in the graph, called s and t.
• Output: 1 if there exists at least one path from s to t in G, 0 otherwise.

Starting at node s, a non-deterministic machine can solve DAG by following every outward edge from s, and every outward edge from each subsequent node, etc. The machine accepts if any of these computational branches reaches t. We can restrict the computational power of the machine by requiring each branch, separately, run in a restricted amount of time or use a restricted amount of memory.
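A deterministic machine can of course decide DAG by explicitly exploring branches; the sketch below performs exactly the search described above, though it stores a visited set and so uses linear rather than logarithmic memory.

```python
from collections import defaultdict

def dag_st_connectivity(edges, s, t):
    """Return 1 if the directed graph has a path from s to t, else 0.
    A deterministic depth-first search: it explores the same branches a
    non-deterministic machine would guess among, but records visited
    nodes, so it is not a log-space algorithm."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    stack, seen = [s], {s}
    while stack:
        u = stack.pop()
        if u == t:
            return 1
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return 0

edges = [("s", "a"), ("a", "b"), ("b", "t"), ("s", "c")]
assert dag_st_connectivity(edges, "s", "t") == 1   # path s -> a -> b -> t
assert dag_st_connectivity(edges, "c", "t") == 0   # no path out of c
```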
NL is the class of decision problems solvable on a non-deterministic Turing machine with O(log n) memory, where n is the length of the input. UL is the class of decision problems solvable on a non-deterministic Turing machine with logarithmic memory, but requiring that exactly one branch accept on "yes" instances, and zero branches accept on "no" instances. Finally, recall that L is the class of decision problems that can be decided in O(log n) space on a deterministic Turing machine. It is clear that L ⊆ UL, because a deterministic machine is a special case of a non-deterministic one, and the deterministic machine has just one computation path, and so in particular one accepting path.
It's also immediate that UL ⊆ NL, because machines with one accepting path are special cases of the general non-deterministic one.
Finally, we consider Mod p L, for p prime. This has an unusual definition, but turns out to capture the complexity of a number of natural problems. Mod p L is the class of decision problems which can be solved by running a non-deterministic Turing machine and outputting "yes" when the number of accepting paths in that machine is non-zero mod p, and outputting "no" otherwise. Ref. [22] proved that Mod p L includes many natural linear algebra questions over the field Z p , including inverting and powering matrices, calculating the rank of a matrix, and others. To relate this to our earlier classes, note that a UL machine on "yes" instances has one accepting path, so in particular 1 mod p accepting paths, so any problem in UL can be decided in Mod p L, giving UL ⊆ Mod p L. Together with L ⊆ UL, this also implies that L ⊆ Mod p L, as mentioned in the introduction. In Ref. [23], it was pointed out that running a span program of polynomial size is in Mod p L, and in fact every problem in Mod p L can be reduced in an efficient way to running a span program. Consequently, we have PSP p = Mod p L.
As a consequence of this, it is also true that a span program with d rows can be computed by running a Turing machine with O(log d) memory, and outputting 0 iff the number of accepting paths is non-zero mod p.
Using this, we can relate the classes L and PSP p according to

L ⊆ UL ⊆ Mod p L = PSP p .

It is also believed that L ⊊ NL, and that UL = NL. Assuming both these statements, we would have that L ⊊ PSP p . We motivate these beliefs below. First consider the claim L ⊊ NL. This is widely believed, similar to the belief that P ⊊ NP. It amounts to the statement that allowing a log space Turing machine to follow many computational paths at once adds power. One line of evidence for L ⊊ NL is the theory of NL-completeness. Many problems [24] are known to be NL-complete, meaning any problem in NL can be mapped to them using a log space mapping. If L is equal to NL, then all of these problems have a log space solution, but no such solution is known for any of them. Concretely, the DAG problem described above is NL-complete. This means the claim that L ⊊ NL amounts to the statement that we cannot solve this problem in log space without non-determinism.
The second claim is that UL = NL. As mentioned above, it is immediate that UL ⊆ NL, so it remains to understand the evidence for NL ⊆ UL. This was discussed in Refs. [25,26], where they pose the question in terms of the DAG problem. We summarize their argument briefly. First notice that since DAG is NL-complete, if we can show it is in UL we are done. The problem then is, given a directed graph G, to define a non-deterministic Turing machine M that has exactly one accepting computational path when there are any number P ≥ 1 of paths in G from s to t, and no accepting computational paths otherwise. It is not known how to solve this problem in this form. However, consider rather than a UL machine, a UL machine which additionally has access to an advice string, which here will be a list of randomized weightings assigned to the edges of G. Then, one uses the fact that after assigning random weightings to the edges, with high probability there is a unique minimal weight path in G from s to t. We build the machine M to accept only on this minimal weight path, which gives it a single accepting computational path.
We can modify this construction to ensure it works with probability one. In particular, there exists a log-space computable function which maps from the advice string and the graph G to a set of n 2 graphs G i , each of which is a weighted version of G, such that for any graph G at least one of the G i has a unique minimal weight path. By exploiting the uniqueness of this path, one can solve DAG in UL. The reader should refer to Ref. [25] for more details.
It remains to remove the need for the UL machine to access the advice string. In Ref. [26], it was shown that this can be done if suitable pseudo-random functions exist. A pseudo-random function is one whose outputs are hard, in a suitable sense, to distinguish from completely random outputs. In particular it is thought that there are pseudo-random functions that are much easier to compute than they are to distinguish from randomness. In the construction above, we used an advice string assigning random weights to the edges in G. We consider replacing this with an assignment by a pseudo-random function p(x) which is computable in log space. This assignment can be made by our UL machine. Then either there is a p(x) which will create a graph G i with a unique minimal weight path, or distinguishing p(x) from a truly random one is no harder than checking that all the G i have non-unique minimal weighted paths. Given what is believed about pseudorandom functions, checking if the G i have unique minimal weight paths would too easily distinguish p(x) from random, so we expect there is a log-space computable function that assigns suitable weightings. From this we conclude that NL = UL.
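The random-weighting step underlying this argument is easy to check numerically: drawing edge weights from a range much larger than the graph makes the minimum-weight s-t path unique with high probability (the isolation lemma). A small sketch on a hypothetical graph:

```python
import random

def all_st_paths(edges, s, t, path=None):
    """Enumerate all simple s-t paths (fine for tiny graphs)."""
    path = path or [s]
    if s == t:
        yield list(path)
        return
    for (u, v) in edges:
        if u == s and v not in path:
            path.append(v)
            yield from all_st_paths(edges, v, t, path)
            path.pop()

random.seed(0)
# Hypothetical graph with three distinct s-t paths.
edges = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
trials, isolated = 200, 0
for _ in range(trials):
    w = {e: random.randint(1, 1000) for e in edges}  # random weighting
    weights = sorted(sum(w[(u, v)] for u, v in zip(p, p[1:]))
                     for p in all_st_paths(edges, "s", "t"))
    if len(weights) == 1 or weights[0] != weights[1]:
        isolated += 1  # unique minimum-weight path
# With weights from a range much larger than the number of edges,
# the minimum-weight path is almost always unique.
assert isolated > 0.9 * trials
```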
L (2) and PSP p, (2)

In the last section we gave evidence, based on the existence of suitable pseudorandom functions, that L ⊊ PSP p . Unfortunately, we cannot offer similar evidence separating L (2) and PSP p, (2) , although we believe this is the case. More generally, for any classes A, B such that A ⊊ B it is unclear when A (2) ⊊ B (2) . We offer only some comments on this problem.
To understand this separation problem better, first consider some cases where A and B do collapse under local pre-processing. Trivial examples occur whenever one of two conditions is met. If there is a promise that the inputs are of the form (x, x), so that both local pre-processors see the full input, then the pre-processed classes A (2) and B (2) both become equal to the set of all functions, since we can have α or β carry out the entire computation. Another collapse occurs when the class B is defined by taking A and allowing for an advice string. In that case having α(x) = (x, a), for a the advice string, and β(y) = y collapses the classes. For example 11 , L ⊊ L/poly but this reasoning shows L (2) = L/poly (2) . Our example of A = L and B = Mod p L does not have either of these features, so at the very least it cannot be obviously collapsed in either of these ways.
Another observation is that, when allowing arbitrary pre-processing, all functions are contained in PSPACE (2) . To see why, take α(x) = (x, f (x, y 1 ), ..., f (x, y 2^n )) and β(y) = y. Then, the local processor need only look up the yth element of the string f (x, y 1 ), ..., f (x, y 2^n ) and output the corresponding bit, and this can be done in PSPACE. This means, for example, that while PSPACE ⊆ EXP is believed strict, PSPACE (2) = EXP (2) . Because our classes Mod p L and L are so much weaker than PSPACE, we do not believe a collapse by any similar mechanism is plausible in our case.
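The lookup-table trick in this argument can be sketched directly, with a hypothetical function f standing in for the hard-to-compute one:

```python
from itertools import product

def f(x, y):
    # Hypothetical function of both inputs, standing in for a hard one.
    return (sum(x) * sum(y) + x[0] * y[-1]) % 2

n = 3

def alpha(x):
    # Exponential-size pre-processing: tabulate f(x, y) for every y.
    return (x, [f(x, y) for y in product([0, 1], repeat=n)])

def lookup(alpha_x, y):
    # The side holding y only indexes into the published table.
    _, table = alpha_x
    index = int("".join(map(str, y)), 2)  # product() enumerates in this order
    return table[index]

for x in product([0, 1], repeat=n):
    for y in product([0, 1], repeat=n):
        assert lookup(alpha(x), y) == f(x, y)
```

The pre-processing here is exponentially large, which is why the argument only collapses classes at least as big as PSPACE.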
To argue that a maintained separation under pre-processing is at least possible for some classes A and B, we prove such a separation in other cases. Such separations are easy to prove for some low-lying complexity classes using tools from communication complexity. To define communication complexity, consider the following scenario. Alice is given a string x, and Bob a string y. Alice and Bob communicate by sending classical bits to one another with the goal of determining the output of some Boolean function f (x, y). Unlike in a non-local computation scenario, they can communicate over many rounds: Alice sends Bob a message, then, conditioned on the message he receives, Bob sends Alice a message, and so on. The communication complexity D(f ) is then the total number of bits exchanged between Alice and Bob, minimized over protocols that compute f . See Ref. [27] for an introduction to communication complexity.
To understand why communication complexity can be used to separate classes with pre-processing, we first need to define the notion of a decision tree. A decision tree defines a simple type of program for computing a Boolean function on n bits. It consists of a directed tree 12 in which, except for the leaves and one vertex specified as the root, every vertex has one edge in and two edges out; a set of queries Q consisting of functions of O(1) input bits; a query q v ∈ Q for each non-leaf vertex v in the graph; and a label for each leaf as either 0 or 1. Starting at the root, at each vertex v the program evaluates the corresponding query q v on the inputs. Based on whether that query is true or false, it moves to the left or right branch from the current vertex. Eventually the program reaches a leaf of the tree, and outputs the label of that leaf.
Decision tree size is related to communication complexity via the bound [28]

depth Q (f (x, y)) ≥ D(f (x, y))/c Q , (12)

where D(f (x, y)) is the communication complexity of the function f (x, y), and depth Q (f (x, y)) is the minimal depth of a decision tree computing f using the set of queries Q. The constant c Q is defined by c Q = max q∈Q D(q), the worst-case communication complexity of an individual query. Briefly, this bound holds because a decision tree can be converted into a communication protocol: starting at the root, Alice and Bob communicate to evaluate the first query. This has communication cost at most c Q . Given the output from this query, they follow the decision tree to the next node, and carry out another communication protocol to evaluate the next query. The total communication cost is at most c Q times the depth of the tree, depth Q (f (x, y)), and this bounds the cost of the best possible protocol D(f (x, y)) from above.
Define the complexity class DT Q (F (n)), consisting of problems solvable using decision trees with depth O(F (n)), and using queries q drawn from some set Q. We claim that DT Q (√n) ⊊ DT Q (n), and that DT Q (2) (√n) ⊊ DT Q (2) (n). We take the set of queries to be any relation on O(1) inputs, in which case c Q = O(1). To show the first separation, consider the disjointness function

f disj (x, y) = 1 exactly when there is no index i with x i = y i = 1.

This has an obvious decision tree of size n: each node n i checks x i ∧ y i , with the output from that node labelled 1 going to a leaf labelled 0, the output from n i labelled 0 mapping to node n i+1 , and the 0-output of the final node leading to a leaf labelled 1. This shows f disj (x, y) ∈ DT Q (n). As well, it is easy to show using lower bounds on communication complexity that D(f disj (x, y)) ≥ n, so from the bound 12 we get that f disj (x, y) ̸∈ DT Q (√n), separating the two classes.

Finally, we show the separation between the corresponding locally pre-processed classes. First, note that f disj (x, y) ∈ DT Q (2) (n), since it is in the smaller class DT Q (n). Next, suppose by way of contradiction that f disj (x, y) ∈ DT Q (2) (√n). Then there exists a function F with depth Q (F ) = O(√n) and local pre-processing functions α, β such that F (α(x), β(y)) = f disj (x, y). This gives

n ≤ D(f disj (x, y)) ≤ D(F ),

where the first inequality we mentioned above and is easy to prove in communication complexity, and the second inequality is immediate, because the definition of communication complexity allows for local pre-processing with arbitrary functions. Using eq. (12), we then have depth Q (F ) ≥ D(F )/c Q ≥ n/c Q = Ω(n), which is a contradiction, so there is no such function F . This shows DT Q (2) (√n) ⊊ DT Q (2) (n).

While the strategy used above is natural to apply to our notion of local pre-processing, it cannot be applied to the classes L and Mod p L. This is because L includes problems which require super-linear decision trees, and D(f ) ≤ 2n always. 13 This means we cannot hope to separate L from a larger class using the bound 12. The technique does generalize to separate DT Q classes of size less than n however, by finding a function with suitable communication complexity, which can always be found. 14 At least for these classes then, adding more computation power to the local computation makes the pre-processed classes larger. Our code-routing protocol improves on the garden-hose strategy if this remains true for the larger classes L and Mod p L. Understanding this for these or other classes however appears challenging, and we have not encountered any techniques for doing so which apply to L and Mod p L.
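The linear-size decision tree for disjointness can be written out directly; here we take the standard definition, where f disj (x, y) = 1 exactly when no index i has x i = y i = 1:

```python
def f_disj_tree(x, y):
    """Depth-n decision tree for disjointness: node i queries x_i AND y_i;
    a 1 answer (the strings intersect) exits straight to a leaf labelled 0."""
    for xi, yi in zip(x, y):   # node n_i
        if xi & yi:            # query q_i = x_i AND y_i
            return 0           # leaf labelled 0
    return 1                   # final leaf: the strings are disjoint

assert f_disj_tree([1, 0, 1], [0, 1, 0]) == 1   # disjoint
assert f_disj_tree([1, 0, 1], [0, 0, 1]) == 0   # intersect at index 2
```

Each query touches one bit of x and one bit of y, so c Q = O(1) as assumed in the argument above.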

Upper bounds on efficiently achievable complexity
Theorem 4 lower bounds the complexity of functions that can be completed using code-routing protocols, showing it completes the routing task non-locally at least for functions in Mod p L (2) , when restricted to polynomial entanglement.

13 Using 2n bits of communication, Alice and Bob can send each other their full input strings.

The protocol used to establish this is a restricted one however, and it is natural to ask if the more general procedure can complete functions of higher complexity. To increase the power of the code-routing strategy, we could:
• Use other codes. The codes we used that arise from Smith's construction [16] ("Smith codes") are CSS codes, 15 so it is clear they are a restrictive set.
• Unit-route on predominantly locally-held bits. If most unit-routing is done on bits held by the other player, then the entanglement cost from the necessary teleportations is closely related to the total share size of the codes used. But by unit-routing many shares on locally-held bits, the total share size may not capture the entanglement cost.
• Use adaptive encoding. To prove theorem 4, we used a single, fixed encoding on Alice 0 's side. More generally, which encoding is performed can depend on the classical inputs. As well, shares teleported to Alice 1 's side could be themselves encoded, shares from those teleported back and encoded, etc.
We are not able to fully characterize the complexity of functions that can be achieved with polynomial entanglement using a general combination of the above strategies. We are able however to give a few partial results. To phrase our results, it is helpful to have a notion of size for a protocol. The protocol tape I for a given set of inputs (x, y) (see definition 2) defines a pattern of encoding that we refer to as the protocol tree. Each S i defines a vertex in a directed tree with input v i and outputs {w j i }. We define the size of a protocol tree, denoted H (x,y) , as the number of leaves plus the number of internal wires that correspond to teleportations; to count this, it is helpful to define n k ≡ |{w j k }|, in terms of which H (x,y) is given in eq. (17). The protocol size counts the number of shares which are either unit-routed or teleported. This lower bounds another quantity of interest, which is the total log dimension of all the shares either unit-routed or teleported during the protocol, which we call the weighted protocol tree size and denote H̃ (x,y) . To count this, it is helpful to define ñ k = Σ i log dim w i k , and to weight each wire counted in eq. (17) by the log dimension of the corresponding share. If a share is unit-routed on a bit that is on the same side as the share, there is zero entanglement cost, while if the share is on the opposite side, there is an entanglement cost given by the log dimension of the share. Each share which is teleported gives an entanglement cost equal to the log dimension of that share. Our assumption in the converse results below will be that a polynomial in the entanglement cost upper bounds the weighted protocol tree size, H̃ (x,y) ≤ poly(E). This is our precise statement of not too many unit-routings being performed on locally held bits.
We begin with the following theorem, which shows code-routing using Smith codes is in P (2) , under our assumption relating protocol tree size and entanglement cost. We can also strengthen this to Mod p L (2) if the protocol tree is O(1) depth, or L (2) if each encoding has O(1) size. Theorems 8 and 9 also have alternative proofs in terms of composed span programs, which we haven't included here.
Theorem 8 Consider a code-routing protocol which uses only Smith codes, uses E = poly(n) copies of the maximally entangled state of two qupits, and has protocol trees with size related polynomially to their entanglement cost. Then we can determine the outcome of the protocol in P (2) , polynomial time with local pre-processing.
Proof. We will give an explicit poly(E) time algorithm. Recall that the protocol tape consists of a list

I = (a(x), S 1 , ..., S ℓ , b(y), S ℓ+1 , ..., S ℓ+ℓ ′ ), (18)

and each S i = (v i , {w j i }, T i ) describes a unit-routing, teleportation, or encoding. By assumption, the encoding here corresponds to a Smith code. It will be convenient in this proof to take T i to be a description of the span program defining that Smith code. To denote this, when the third entry describes an encoding, we will use the labelling SP i rather than T i . Recall also that the size of the span program is equal to the number of rows in its matrix.
Given this representation of the protocol, we define the following recursive function, which takes a tuple S k as input and returns 0 if Alice 0 is able to reconstruct the input share v k , or 1 if Alice 1 is able to reconstruct it. In the pseudo-code below, we denote a span program by SP k = (M k , ϕ k , t k ), where the function ϕ k maps a row index i to a pair (j, ϵ i ), as explained in appendix A. We use the notation ϕ k (i)[1] = j. Note that Smith codes are defined by monotone span programs, meaning that ϵ i = 1 always.

Define GetOwner(S k , I):
  If n k = 0: Return T k
  If n k = 1: Search for the tuple S j ∈ I with w 0 k as its input; Return GetOwner(S j , I)
  If n k > 1:
    For each output share w i k , find the tuple S j i ∈ I with w i k as its input, and set z i = GetOwner(S j i , I)
    Let M 1 k be the matrix formed from the rows r of M k with z ϕ k (r)[1] = 1
    If t k lies in the span of the rows of M 1 k : Return 1; Else: Return 0

Then, our program is as follows:

  Find the tuple S k with Q as its input
  Return GetOwner(S k , I)

It is straightforward to see that this algorithm is correct using an inductive proof, where we induct on layers in the protocol tree. Here, we say that the layer of a node is the maximal length of a path from that node to a leaf. The 0th layer - the leaves of the tree - all correspond to unit-routings, where the algorithm is manifestly correct: unit-routings have n k = 0, and T k is a bit labelling the side that the input share is brought to in the protocol. The algorithm just returns this bit directly, which is correct. Now assume by way of induction that the algorithm behaves correctly on tuples S k ′ at layer m of the protocol tree, and consider its behaviour on a tuple S k at the m + 1th layer. We have that n k ≠ 0, so we need only consider the cases where n k = 1 or n k > 1.
For n k = 1 the protocol has teleported v k into system w 0 i , which is in the mth layer, so the algorithm returns the side where w 0 i is brought, which is correct. For n k > 1, the share v k has been encoded into a secret sharing scheme, which records v k into a set of shares {w i k }. The scheme's indicator function is computed by a monotone span program (M k , ϕ k , t k ). The share v k will be recoverable on the side labelled by the output of the span program. The inputs to the span program z i are determined by where the protocol brings the shares w i k , with z i = 0 meaning share w i k is on the left and z i = 1 meaning share w i k is on the right. Share v k is then available on the side labelled by the indicator function evaluated on the string z. The algorithm works by evaluating the span program, and calling the GetOwner(·, I) function recursively to determine on which side the shares w i k are recoverable. In particular the matrix M 1 k includes t k in its span exactly when the span program evaluates to 1, so the algorithm correctly returns 1 when v k is on the right. When the set of shares on the right does not reveal v k it must, because we used a secret sharing scheme, reveal nothing about v k . Because we always maintain the purifying system on the left, v k is then available on the left. Accordingly, the algorithm correctly returns 0 in this case.
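The recursion in GetOwner can be sketched concretely. The tape encoding, names, and span-membership test below are illustrative assumptions rather than the paper's exact data layout:

```python
def in_span_mod_p(rows, target, p):
    """Membership test over Z_p via Gaussian-elimination rank."""
    def rank(mat):
        mat = [[v % p for v in r] for r in mat]
        rk = 0
        for col in range(len(target)):
            piv = next((r for r in range(rk, len(mat)) if mat[r][col]), None)
            if piv is None:
                continue
            mat[rk], mat[piv] = mat[piv], mat[rk]
            inv = pow(mat[rk][col], p - 2, p)
            mat[rk] = [(v * inv) % p for v in mat[rk]]
            for r in range(len(mat)):
                if r != rk and mat[r][col]:
                    c = mat[r][col]
                    mat[r] = [(a - c * b) % p for a, b in zip(mat[r], mat[rk])]
            rk += 1
        return rk
    return (all(v % p == 0 for v in target) if not rows
            else rank(rows) == rank(rows + [list(target)]))

def get_owner(share, tape, p=2):
    """Which side (0 = left, 1 = right) can reconstruct `share`?
    tape maps a share name to one of (illustrative encoding):
      ("route", side)           -- unit-routing to a fixed side
      ("teleport", next_share)  -- teleportation
      ("encode", M, t, outs)    -- monotone span program (M, t) over Z_p
    """
    kind, *rest = tape[share]
    if kind == "route":
        return rest[0]
    if kind == "teleport":
        return get_owner(rest[0], tape, p)
    M, t, outs = rest
    # rows whose share ends up on the right are available to Alice_1
    avail = [row for row, w in zip(M, outs) if get_owner(w, tape, p) == 1]
    return 1 if in_span_mod_p(avail, t, p) else 0

# Toy protocol: Q encoded into two shares, both needed for recovery.
tape = {
    "Q":  ("encode", [(1, 0), (0, 1)], (1, 1), ["w0", "w1"]),
    "w0": ("route", 1),
    "w1": ("teleport", "v"),
    "v":  ("route", 1),
}
assert get_owner("Q", tape) == 1   # both shares reach the right side
tape["v"] = ("route", 0)           # now the right side is unauthorized
assert get_owner("Q", tape) == 0   # so the left side recovers Q
```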
Next we analyze how the run time relates to the entanglement cost. Begin by considering the run time for each call to GetOwner(S k , I). The run time is dominated by the step where we determine whether an e-dimensional vector t k lies in the span of another set of |M 1 k | vectors. This can be done in O(e|M 1 k |) steps. The length of the rows is always less than or equal to the total number of them, since the columns are linearly independent 16 , so e ≤ size(SP k ). The number of rows in M 1 k is less than or equal to the total number of rows in the span program, so |M 1 k | ≤ size(SP k ). Together these give O(e|M 1 k |) ≤ O(size(SP k ) 2 ). In a Smith code, the total share size is given by the size of the span program, so ñ k = size(SP k ). Finally, note that on a given input pair (x, y) only certain span programs from the full collection {S k } are reached in the algorithm. Call this collection S (x,y) . Thus we can bound the total run time for a given x and y by

O( Σ S k ∈S (x,y) size(SP k ) 2 ) ≤ O(N (x,y) 2 ), (19)

where N (x,y) = Σ k ñ k is the total size of all shares used across all encodings involved in the protocol, on inputs (x, y). We would like to relate this run time to the protocol tree size, as defined in eq. (17). For fixed N (x,y) , the weighted protocol tree size is minimized for the case where n k = 2 for all encodings (this maximizes the subtractions appearing in eq. (17)), so that

H̃ (x,y) ≥ N (x,y) /2, (20)

where we've also used that ñ k /2 ≥ log dim v k , i.e. that each share in the code is at least as large as the input system. Since by assumption the entanglement cost is polynomially related to the weighted size, combining this with eq. (19) we have a polynomial upper bound on the run time in terms of entanglement cost. Note that this polynomial time computation is performed by taking the protocol tape as input, which itself is computed via local pre-processing, so the entire protocol is in P (2) .
For certain classes of code-routing protocols, we can determine their output in smaller classes than P (2) . This is possible in two cases: protocols which never concatenate codes to depth more than O(1), and protocols which are built by concatenating codes of O(1) size. We can understand the first of these as a small relaxation of the single-encoding protocol given in theorem 4, and the second as a small relaxation of the garden-hose protocol. In both cases deforming these protocols slightly doesn't add computational power. We discuss these two cases in the following subsections.

Protocols using O(1) depth encodings
We first discuss the following theorem, which modifies the protocol used in theorem 4 to allow O(1) depth of encodings and shows the resulting protocols still compute functions inside the class Mod p L (2) .
Theorem 9 Consider a code-routing protocol which uses only Smith codes, takes n bits as input, uses E = poly(n) copies of the maximally entangled state of two qupits, has protocol trees with size related polynomially to their entanglement cost and which have O(1) depth. Then the outcome of the protocol can be computed in Mod p L (2) .
Our proof will use the following characterization of Mod p L in terms of non-deterministic Turing machines. For any non-deterministic Turing machine T we define the function F(T ) as follows. For a given input x, call the number of accepting paths F (x). We then define F(T )(x) = 1 when F (x) is non-zero mod p, and F(T )(x) = 0 otherwise. Then the class Mod p L is the set of functions of the form f = F(T ) where T has O(log(n)) memory for n the length of x. Note that because Smith codes of polynomial size are evaluated by polynomial sized span programs, and hence in PSP p , and recalling that Mod p L = PSP p [23], we have that they can also be evaluated by non-deterministic Turing machines with O(log(n)) memory that count paths mod p.
To prove theorem 9, we first need the following lemma, which will allow us to compose Mod p L machines in a simple way.
Lemma 10 Suppose we have a function f = F(T ) for a non-deterministic Turing machine T running on memory m = Ω(log n) where n is the length of x. Then there is another non-deterministic Turing machine T ′ that uses memory O(m), has f (x) mod p accepting paths (and therefore still satisfies f = F(T ′ )), and has 1 − f (x) (mod p) rejecting paths.
Proof. We will start with any Turing machine M 0 such that f = F(M 0 ), and from it construct a new machine M 2 whose number of accepting and rejecting paths will satisfy the statement of the lemma. As an intermediary, we need another Turing machine M 1 . We will use F i (x) to denote the number of accepting paths in Turing machine M i run on input x, andF i (x) the number of rejecting paths.
The machine M 1 uses p − 1 copies of M 0 , which we label M 0 1 , ..., M 0 p−1 , running them in sequence and accepting only when all of the copies accept. Consequently, the number of accepting paths is

F 1 (x) = F 0 (x) p−1 ≡ f (x) (mod p),

where we've used Fermat's little theorem. Next, we build the machine M 2 from copies of M 1 , adding branches as needed so that the number of rejecting paths satisfies F̄ 2 (x) ≡ 1 − f (x) (mod p) while maintaining F 2 (x) ≡ f (x) (mod p).
Notice that M 2 involves running M 0 an O(1) number of times sequentially, storing j, and keeping track of the i counter. All this can be done in O(m) memory. Now we are ready to prove the main theorem of this section.
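The path-counting identity behind M 1 is easy to check numerically: running p − 1 copies of M 0 in sequence multiplies the accepting-path counts, and Fermat's little theorem collapses F 0 (x) p−1 to f (x) mod p. A sketch with stand-in path counts rather than actual Turing machines:

```python
p = 5  # any prime

def F_of(count):
    # F(T)(x) = 1 iff the number of accepting paths is non-zero mod p
    return 1 if count % p != 0 else 0

for F0 in range(100):        # hypothetical accepting-path counts of M_0
    F1 = F0 ** (p - 1)       # M_1 accepts iff all p-1 copies of M_0 accept
    # Fermat: F0^(p-1) ≡ 1 (mod p) when F0 is non-zero mod p, else ≡ 0,
    # so M_1 has exactly f(x) accepting paths mod p.
    assert F1 % p == F_of(F0)
```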
Proof. (Of theorem 9) We use the description of the protocol in terms of a protocol tape.
Recall that when S i has no output shares, the tuple S i = (v i , ∅, T i ) describes a unitrouting of the share v i to the side labelled by z T i , which is a bit of z = (a(x), b(y)).
When S i has one output share, S i = (v i , w 0 i , ∅) describes a teleportation. Finally when S i has more than one output share, the tuple describes an encoding. The encoding is into a Smith code, so the indicator function f i can be computed with a span program of size ñ i . To find a Turing machine such that f i = F(T ), we need only memory O(log ñ i ). From lemma 10 then, we can construct a non-deterministic Turing machine T i , also with memory O(log ñ i ), such that T i has f i (x) mod p accepting paths and 1 − f i (x) mod p rejecting paths.
We consider a function L(s, I), which takes a share s and determines if that share is on the left (corresponding to output 0) or the right (corresponding to output 1) at the end of the protocol defined by input tape I. We define L(s, I) recursively, as follows.
Define L(s, I):
  Search through I and find the tuple S i with s as its input
  If S i = (s, ∅, T i ): Return z T i
  If S i = (s, w 0 i , ∅): Return L(w 0 i , I)
  Otherwise: Return f i (L(w 0 i , I), ..., L(w n i −1 i , I))

Note that this machine does not compute each of the L(w j i , I) and store them - that would already be n i bits of memory. Instead it computes L(w j i , I) each time it needs that bit value, and can re-use the same memory bits each time it does this. The output of the entire protocol is determined by running L(Q, I), where Q is the input system to be routed.
L(Q, I) determines the output for the protocol, but we need to show this function can be evaluated by a Mod p L machine. To do so, we modify L(s, I) to a new function L T (s, I) by making the replacement f i → T i , where T i is a Turing machine constructed using lemma 10. L T (Q, I) can be run on a non-deterministic machine, and we can consider counting the number of accepting paths. Our claim is that 1) this correctly determines the output of the protocol in that F(L T (s, I)) = L(s, I) and 2) L T (Q, I) runs in nondeterministic log-space, so that we've computed the output of the protocol in Mod p L.
First consider correctness. We work inductively in the layers of the protocol tree, where the layer of a node is defined as before to be the maximal length of a path from the node to a leaf. We will show for each layer that, for any node in that layer, the number of accepting paths is equal, mod p, to the output of the corresponding function and further that the number of rejecting paths is equal, mod p, to 1 minus the value of that function.
First consider the 0th layer, i.e. the leaves of the tree, which always consist of unit-routings. These are deterministic computations, consisting of returning z T i (which in this case is a single bit). They have z T i accepting paths and 1 − z T i rejecting paths, so this is correct.
Next consider the m+1th layer of the protocol tree, and assume the inductive hypothesis for the mth layer. For an encoding, to evaluate the function f i on a log-space machine we need non-determinism. Consider the function f i , its corresponding Turing machine T i , and focus on one input to f i , say z * . By construction, for a definite input (or a single path) with z * = z, we know T i has f i (z) accepting paths and 1 − f i (z) rejecting paths. Now suppose we replace the input z * with calls to a non-deterministic Turing machine T * at the mth layer. Then including all input paths from T * as well as all paths for T i itself, the number of accepting paths for T i is the number of accepting paths for T i given z * = 1, times the number of accepting paths for T * , plus the number of accepting paths for T i given z * = 0, times the number of rejecting paths for T * . Using that the number of accepting paths of T i is f i (z * ), and rejecting paths is 1 − f i (z * ), and a similar statement for T * and associated function f * = F(T * ), we have that the number of accepting paths for T i is

F i = f i (1) f * + f i (0) (1 − f * ) (mod p).

Notice that for z * = f * = 1 mod p, we have F i = f i (1), so the number of accepting paths is as if z * were given deterministically. Similarly if z * = f * = 0, F i = f i (0), which again is the same as if z * were given deterministically. In particular, the number of accepting paths satisfies the requirements of the inductive hypothesis. The number of rejecting paths of T i is

F̄ i = (1 − f i (1)) f * + (1 − f i (0)) (1 − f * ) (mod p),

using similar reasoning to above. Thus for z * = f * = 1, we have that the number of rejecting paths is 1 − f i (1), and for z * = f * = 0, we have that the number of rejecting paths is 1 − f i (0), so that the number of rejecting paths also satisfies the inductive hypothesis. This argument also gives correctness in the case of a teleportation, since teleportation is a special case of the above where T i is deterministic.
Finally we need to determine the memory usage of this algorithm. The memory needed is that required to evaluate the Turing machines at each layer, which each use $\log \tilde{n}_i$ memory, where $\tilde{n}_i$ is the log dimension of the output shares of tuple $S_i$. Calling Turing machines recursively, we can re-use memory for machines at the same layer of recursion, but must add the memory requirements of machines at different layers. Adding $\log |\{S_i\}|$ bits of memory for the search through the list of the $S_i$, calling $L_T(Q, I)$ uses
$$M_{(x,y)} = \max_{\text{paths } p} \sum_{i \in p} \log \tilde{n}_i + \log |\{S_i\}|$$
bits of memory, where the maximum runs over root-to-leaf paths of the protocol tree. The second term is bounded by $\log \tilde{H}_{(x,y)}$, for $\tilde{H}_{(x,y)}$ the weighted size of the protocol tree, since each $S_i$ adds at least 1 to the size of the protocol tree. Finally, note that the length of a path is bounded by the depth of the protocol tree. Then using our assumption that the depth is at most $O(1)$, and because $\tilde{n}_i \le \tilde{H}_{(x,y)}$, we have
$$M_{(x,y)} = O(\log \tilde{H}_{(x,y)}).$$
Because $\tilde{H}_{(x,y)}$ is related polynomially to the entanglement cost, we've proven the theorem.

Protocols using codes of O(1) size
In this section we consider protocols that use only codes with $O(1)$ shares. Recall that the garden-hose protocol corresponds to the case where encodings have size 1, and the efficiently computable functions in that case form the class $\mathrm{L}^{(2)}$. The following theorem shows that with small codes the complexity is not increased. Note that this is our only converse theorem in which we do not restrict to Smith codes.
Theorem 11 Consider a code-routing protocol that takes $n$ bits as input, uses $E = \mathrm{poly}(n)$ copies of the maximally entangled state of two qupits as a resource, has protocol trees with size related polynomially to the entanglement cost, and uses codes with at most $O(1)$ shares. Then the outcome of the protocol can be computed in $\mathrm{L}^{(2)}$.
Proof. The strategy is to use a depth-first evaluation of the protocol tree, which recall is defined by the protocol tape $I$. One apparent obstruction is that for deep trees, keeping track of a path from root to leaf can require linear memory. To avoid this, we travel through the tree while keeping only the current vertex, and sometimes the preceding or subsequent vertices, in memory. Heuristically, our algorithm works by "pruning" the protocol tree, evaluating sub-trees and storing the ownership of shares corresponding to edges of the tree. Storing the full protocol tree would require too much memory, so instead we describe the pruned tree using the protocol tape $I$ along with a set $R$, which contains edges that "over-ride" the description of the tree given by $I$. At any given point in the running of the algorithm, $R$ will only describe the ownership of vertices that neighbour the current vertex $v$ being evaluated. Because every vertex of the tree has only $O(1)$ degree, it is possible to store $R$ in logarithmic memory. By repeatedly pruning the initial tree, eventually we are left with a trivial tree that points to the location of the input share.
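As an illustration of this pruning strategy, the following is a minimal Python sketch. All names are illustrative, not from the paper, and the maximal-layer traversal rule is simplified to descending into any unevaluated child, which preserves correctness but not the log-space memory analysis:

```python
def evaluate_protocol_tree(I, root):
    """Depth-first 'pruning' evaluation of a protocol tree.

    I maps a node id either to (children_ids, f) for an internal node, where
    f computes the node's bit from its children's bits, or to ('leaf', bit).
    R holds overriding leaf values for subtrees already evaluated, playing
    the role of the over-ride set R described in the text.
    """
    R = {}

    def leaf_value(v):
        if v in R:
            return R[v]
        entry = I[v]
        return entry[1] if entry[0] == 'leaf' else None

    v = root
    while True:
        if leaf_value(v) is not None:            # tree pruned to a single leaf
            return leaf_value(v)
        children, f = I[v]
        vals = [leaf_value(w) for w in children]
        if all(x is not None for x in vals):
            R[v] = f(vals)                       # prune: subtree becomes a leaf
            for w in children:                   # descendants never revisited
                R.pop(w, None)
            v = root
        else:                                    # descend to an unevaluated child
            v = next(w for w, x in zip(children, vals) if x is None)

# Example: root computes OR(AND(1, 0), 1)
example_I = {
    'root': (['a', 'b'], lambda bits: bits[0] | bits[1]),
    'a': (['l1', 'l2'], lambda bits: bits[0] & bits[1]),
    'b': ('leaf', 1),
    'l1': ('leaf', 1),
    'l2': ('leaf', 0),
}
```

For simplicity this sketch restarts at the root after each prune; the algorithm in the proof instead returns to the parent of the pruned node, which is what keeps the memory usage logarithmic.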
We give the pseudo-code for our algorithm now, then make a few comments on this code below.
In the definition of $F[v_i, I, R]$, the line which assigns $S'_i$ the value $(v_i, \emptyset, f_i)$ needs some explanation. According to our conventions, when the output systems are empty, the third entry in an $S_i$ tuple is just a bit; here, we use the value of $f_i$. The inputs to $f_i$ are determined by the locations of the $w^j_i$ shares, but by construction we are in a case where these are easy to look up, since the $w^j_i$ are all leaves. Further, because the code sizes are all $O(1)$ here, this can be done in $O(1)$ memory. One other line that requires explanation is the one that finds a $w^*$ of maximal layer. First, note that the layer of a node can be evaluated in log-space, because it amounts to determining the depth of the sub-tree defined by that node and all its descendants. Second, the layers of all the children of the current node can be stored simultaneously, because (i) there are only $O(1)$ children, and (ii) the layer is bounded by the depth of the protocol tree, which is at most polynomial in $n$ by the assumptions of the theorem, and thus can be stored in $\log(n)$ bits.
To understand the correctness of the algorithm, we will make use of the notion of an effective protocol tree. This is the tree described by $R$ taken together with $I$, where $R$ is always given priority. In particular, if $v_i$ is an input to $S'_i \in R$ and to $S_i \in I$, we use $S'_i$ when travelling to subsequent nodes in the tree. We define the effective size to be the number of vertices in the effective protocol tree. We claim that the effective tree constructed during the running of the above algorithm evaluates to the same value as the original tree at every step. Further, the effective size decreases every time the first If statement is called, and eventually reaches 1.
To see the first claim, consider that at the start of the algorithm R = {}, so the effective and original protocol trees agree, and so in particular give the same output. Next, suppose that the effective and original protocol trees give the same output, and then consider how R is edited during one evaluation of the code inside the first If statement of F . This involves replacing S i with S ′ i which is a unit-routing that has the same output as S i . Manifestly this doesn't change the output. Further, we remove the descendants of S i , which are never visited in the new effective tree, so this also does not change the output. Now consider the second claim, that the effective tree becomes smaller and eventually reaches size one. Notice that we must reach the first If statement eventually, specifically after at most a number of calls to F equal to the depth of the effective tree. In particular each time the second Else statement is called, F is called on a lower vertex in the effective tree. Once the call is to a vertex with only leaves as descendants, it goes to the first If statement. Next, notice that S i is replaced with S ′ i only when S i has descendants, and that by construction S ′ i is a leaf. Thus every such move decreases the effective size. Notice further that the algorithm can only end when reaching the single return statement. This happens when there is no node preceding the current one in the effective tree, so that the tree has size one. The algorithm then returns f i from the effective tree, which by the correctness property above is the output of the protocol tree.
Consider the memory usage of this algorithm. We evaluate indicator functions f i for O(1) size codes, which can be done with O(1) memory. Additionally, we need to keep track of the current node v i , which can be done with log |{S i }| memory. Notice that we have been careful to erase the record of the path followed to reach the current vertex, by erasing the stored v i value before calling F on a new one, since storing this path would require super-logarithmic memory. Finally, we track the entries in R, which defines the effective tree. We claim R only ever contains S i which are all descendants of a single node, so storing R only requires O(1) memory. To see why this is the case, notice that because we travel to the node of maximal layer when traversing the tree, we visit nodes depth-first. This guarantees that once a vertex is added to R, we completely finish evaluating the ownership of its parent before proceeding to the next vertex, as we are already at the deepest part of the tree.
Considering all contributions listed in the last paragraph, the memory cost is $O(\log |\{S_i\}|) = O(\log H_{(x,y)})$, since each $S_i$ adds at least 1 to the protocol tree size. Then since $H_{(x,y)} \le \tilde{H}_{(x,y)}$ and $\tilde{H}_{(x,y)}$ is upper bounded by a polynomial in $n$, we are done.

Discussion
The $f$-routing task is of practical relevance in the context of position verification, but also exhibits interesting relationships to complexity theory and secret sharing. In particular, the garden-hose protocol uses entanglement controlled by the space complexity of $f$, and the code-routing strategy we introduce here has an entanglement cost upper bounded by span program size. With regards to secret sharing, we showed the size of a secret sharing scheme with indicator function $f$ is lower bounded by the entanglement cost of performing the corresponding $f$-routing task. These connections to complexity and secret sharing emphasize the importance, and difficulty, of finding lower bounds on entanglement cost in $f$-routing. In particular, such lower bounds would strengthen the security of position verification schemes based on $f$-routing, and would amount to lower bounds on span program size and on the size of secret sharing schemes. In general, proving lower bounds on complexity is a challenging goal, and in the case of span programs there has been only limited success [23]. Given this, we might not expect to prove strong lower bounds on entanglement cost. Alternatively, we could hope for conditional lower bounds based on complexity-theoretic assumptions, or for lower bounds stated in terms of some measure of the complexity of $f$. We leave exploring this further to future work.
Finally, note that this work introduces the use of error-correction in non-local quantum computation. By combining error-correction with the teleportation techniques of [10], we increase the complexity of functions that can be computed non-locally (at least given our complexity-theoretic assumptions). It would be interesting to understand if error-correcting codes provide enhancements to other non-local computation protocols, for instance the one based on the Clifford+T gate set described in [30].

Definition 13
Given a span program $(M, \varphi, t)$, where $M$ is a $d \times e$ matrix over $\mathbb{Z}_p$, $\varphi$ maps rows of $M$ to labelled input bits, and $t$ is a target vector, the function it computes is given according to the following rule. Given an input string $z$ of $n$ bits, if the vector $t$ is in $\mathrm{span}(\{r_i : \exists j, \varphi(r_i) = (j, z_j)\})$, then output 1; otherwise, output 0. To unpack this, we understand $\varphi(r_i) = (j, \varepsilon_i)$ as saying that row $r_i$ maps to some index $j$, which labels a bit in the input string $z$. If that bit $z_j$ is equal to $\varepsilon_i$, we include that row. Repeating this for all rows, we check whether the target vector $t$ is in the span. The size of the span program is defined to be $d$, the number of rows in $M$.
A span program is said to be monotone if it has $\varepsilon_i = 1$ always. This ensures that changing bit values in $z$ from 0 to 1 only ever adds to the set of rows whose span we are checking, so that monotone span programs always compute monotone functions, as is easy to verify. Conversely, every monotone function can be computed by a monotone span program [23].
It will be helpful to introduce some notation for span programs. For a given input $z$, the map $\varphi$ picks out some of the rows of $M$, whose span is then checked to see if it includes the target vector. The subset of rows picked out we denote by $\varphi^{-1}(z)$, and refer to as the activated rows. The matrix formed from the activated rows we denote $M_{\varphi^{-1}(z)}$. The minimal size of a span program over $\mathbb{Z}_p$ computing a function $f$ is denoted $SP_p(f)$.
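As a concrete illustration of these definitions, a span program can be evaluated by Gaussian elimination over $\mathbb{Z}_p$: activate the rows selected by $\varphi$, then test whether $t$ lies in their span. The sketch below uses 0-indexed input bits and illustrative names, not code from the paper:

```python
def rank_mod_p(rows, p):
    """Rank of a list of row vectors over Z_p (p prime), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)        # inverse modulo a prime
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                c = rows[i][col]
                rows[i] = [(x - c * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def eval_span_program(M, phi, t, z, p):
    """f(z) = 1 iff t lies in the Z_p-span of the activated rows of M.
    phi[i] = (j, eps): row i is activated when z[j] == eps (0-indexed j)."""
    active = [M[i] for i, (j, eps) in enumerate(phi) if z[j] == eps]
    if not active:
        return 0
    # t is in the span iff adding it to the activated rows leaves the rank unchanged
    return int(rank_mod_p(active, p) == rank_mod_p(active + [list(t)], p))

# One valid span program for f(x, y) = x XOR y over Z_2:
M = [[1, 0], [0, 1], [1, 0], [0, 1]]
phi = [(0, 1), (0, 0), (1, 1), (1, 0)]   # rows for x=1, x=0, y=1, y=0
t = [1, 1]
```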

B Proof of lemma 5
We are now ready to prove lemma 5, which we repeat below for convenience.
Lemma 5 For every function $f : \{0,1\}^m \rightarrow \{0,1\}$ there exist functions $f_I$ and $g$ with $f_I(g(z, b)) = f(z) \wedge b$ such that
• $f_I$ is a valid indicator function,
• $g$ acts on the first $m$ bits of its input by copying each bit $z_i$ and negating one copy, $z_i \rightarrow (z_i, \neg z_i)$; it leaves the final bit $b$ unchanged,
• $mSP_p(f_I) \le SP_p(f) + 1$, where $SP_p(h)$ denotes the minimal size of a span program over $\mathbb{Z}_p$ computing $h$, and $mSP_p(h)$ the minimal size of a monotone span program computing $h$.
Proof. Given $f$, find the minimal-sized span program over $\mathbb{Z}_p$ that computes $f$, and label it $(M_f, \varphi_f, t_f)$. Label the rows of $M_f$ by $r_i$. Then, add one row and one column to $M_f$ to define a new matrix $M_{f'}$ with dimensions $(d+1) \times (e+1)$. Label the rows of $M_{f'}$ as $r'_i$. Set $(M_{f'})_{d+1, e+1} = 1$, and otherwise set the entries of the added row and column to 0. Extend $\varphi_f$ to a new function $\varphi_{f'}$ such that $\varphi_{f'}(r'_i) = \varphi_f(r_i)$ for all $i \le d$, and $\varphi_{f'}(r'_{d+1}) = (m+1, 1)$. Finally, let $t_{f'} = (t_f, 1)$. Then $(M_{f'}, \varphi_{f'}, t_{f'})$ defines a new function $f'$, given by $f'(z, b) = f(z) \wedge b$, so in particular $f'(z, 1) = f(z)$.
Next, we decompose $f'$ into $f_I$ and $g$. Define $g_k(z_k) : \{0,1\} \rightarrow \{0,1\}^2$ according to
$$g_k(z_k) = (z_k, \neg z_k). \quad (27)$$
Then define $g$ by having $g_k$ act on each of the first $m$ bits of the input, producing a string of length $2m + 1$. The function $f_I$ is now defined by modifying the span program $(M_{f'}, \varphi_{f'}, t_{f'})$ to take the output of $g$ as input. First, the new span program has the same matrix and target vector as before: $M_I = M_{f'}$ and $t_I = t_{f'}$. Second, define $\varphi_I$ by having it map $r'_i$ to the same input bit as $\varphi_{f'}$ when $\varepsilon_{f',i} = 1$, and to the negated copy of that input bit when $\varepsilon_{f',i} = 0$. Set $\varepsilon_{I,i} = 1$ always. This ensures that $f_I$ and the span program computing it are monotone, and that $f' = f_I \circ g$. Additionally, every $(z, b)$ value which has $f_I(z, b) = 1$ must have $b = 1$, so $f_I$ is also no-cloning. Since secret sharing schemes can be built for any function that is no-cloning and monotone [19,16], $f_I$ is a valid indicator function. Finally, notice that the monotone span program computing $f_I$ is the same size as the (non-monotone) span program computing $f'$, which in turn has one extra row compared to the program for $f$.
We conclude with an example. Consider the function $f(x, y) = x \oplus y$. A (non-monotone) span program for this function has matrix
$$M = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
The map $\varphi$ is defined by $\varphi(r_1) = (1, 1)$, $\varphi(r_2) = (1, 0)$, $\varphi(r_3) = (2, 1)$, $\varphi(r_4) = (2, 0)$, and the target vector is $(1, 1)$. It is easy to check cases to confirm this computes $x \oplus y$. We decompose this in the manner described in lemma 5. First, add one column and one row to the matrix according to
$$M_{f'} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
We add one bit to the inputs, extend the map $\varphi$ according to $\varphi(r_5) = (3, 1)$, and append a 1 to the target vector. This span program defines the function $f'(x, y, b) = (x \oplus y) \wedge b$.
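The decomposition in this example can be checked by brute force. The sketch below instantiates one valid choice of the $5 \times 3$ matrix and maps (0-indexed bits, illustrative names) and verifies $f_I(g(x, y, b)) = (x \oplus y) \wedge b$ on all inputs:

```python
from itertools import product

# Monotone span program for f_I over Z_2: same matrix and target as the
# program for f'(x, y, b) = (x XOR y) AND b, but each row now points at a
# positive literal of the expanded input g(x, y, b) = (x, not x, y, not y, b).
M_I = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
t_I = (1, 1, 1)
phi_I = [0, 1, 2, 3, 4]   # row i is activated when expanded bit phi_I[i] is 1

def g(x, y, b):
    """Copy-and-negate pre-processing: (x, y, b) -> (x, not x, y, not y, b)."""
    return (x, 1 - x, y, 1 - y, b)

def f_I(w):
    """Brute-force span test over Z_2: is t_I a combination of activated rows?"""
    active = [M_I[i] for i in range(5) if w[phi_I[i]] == 1]
    for coeffs in product((0, 1), repeat=len(active)):
        combo = tuple(sum(c * r[k] for c, r in zip(coeffs, active)) % 2
                      for k in range(3))
        if combo == t_I:
            return 1
    return 0

# f_I is monotone in w, and composing with g recovers (x XOR y) AND b:
for x, y, b in product((0, 1), repeat=3):
    assert f_I(g(x, y, b)) == (x ^ y) & b
```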