Depth-efficient proofs of quantumness

A proof of quantumness is a type of challenge-response protocol in which a classical verifier can efficiently certify the quantum advantage of an untrusted prover. That is, a quantum prover can correctly answer the verifier's challenges and be accepted, while any polynomial-time classical prover will be rejected with high probability, based on plausible computational assumptions. To answer the verifier's challenges, existing proofs of quantumness typically require the quantum prover to perform a combination of polynomial-size quantum circuits and measurements. In this paper, we give two proof of quantumness constructions in which the prover need only perform constant-depth quantum circuits (and measurements) together with log-depth classical computation. Our first construction is a generic compiler that allows us to translate all existing proofs of quantumness into constant quantum depth versions. Our second construction is based around the learning with rounding problem, and yields circuits with shorter depth and requiring fewer qubits than the generic construction. In addition, the second construction also has some robustness against noise.


Introduction
Quantum computation is currently in the era of noisy intermediate-scale quantum (NISQ) devices [Pre18]. This means that existing devices have a relatively small number of qubits (on the order of 100), perform operations that are subject to noise, and are not able to operate fault-tolerantly. As a result, they are limited to running quantum circuits of small depth in order to obtain high fidelity outputs. Despite these limitations, there have been a number of demonstrations of quantum computational advantage [AAB + 19, ZWD + 20, WBC + 21, ZCC + 22], i.e. performing a task on a quantum device that cannot be efficiently reproduced by classical computers, based on plausible complexity-theoretic assumptions [AA11,HM17,BFNV19]. Indeed, with the best known classical algorithms it takes several days of supercomputing power to match the results of the quantum devices, which required only a few minutes to produce [HZN + 20].
These milestone results illustrate the impressive capabilities of existing quantum devices and highlight the potential of quantum computation in the near future. Yet, one major challenge still remains: how do we know whether the results from the quantum devices are indeed correct? For the existing demonstrations of quantum advantage, verification is achieved using various statistical tests on the output samples from the quantum devices [AAB + 19, BIS + 18, ZWD + 20]. However, performing these tests either involves an exponential-time classical computation or there is no formal guarantee that an efficient classical adversary cannot spoof the results of the test [AC17, AG20,PR22].
One conceptually simple way to demonstrate quantum advantage, that's also efficiently verifiable, is to ask the quantum computer to factor large composite integers using Shor's algorithm [Sho94]. Assuming factoring is classically intractable, this task yields a quantum advantage and is tractable to verify (simply multiply the output factors and check if they produce the number to be factored). However, Shor's algorithm requires fault-tolerant quantum computation to perform and so is not suitable for near-term devices [GE21].
An alternative way of performing efficient tests of quantum advantage was initiated by the work of Brakerski et al. in [BCM + 18]. There, the authors proposed an interactive protocol between a polynomial-time classical verifier and a self-claimed polynomial-time quantum prover. The verifier issues a number of challenges to the prover and checks the prover's responses, accepting only when the prover answers the challenges correctly. The defining property of such a protocol is that no polynomial-time classical prover can make the verifier accept with high probability, but there exists a quantum polynomial-time strategy that makes the verifier always accept. This is referred to as a proof of quantumness protocol. The protocol of Brakerski et al. is based around a family of collision-resistant hash functions known as trapdoor claw-free functions (TCFs). In essence, for the quantum prover to correctly answer the verifier's challenges, one of the things it is required to do is evaluate these functions in superposition. With the trapdoor, the verifier is able to check whether the prover performed this evaluation correctly. It can also be shown that for any classical prover to succeed in the protocol, it would effectively have to find collisions for the TCFs. Brakerski et al. showed that TCFs can be constructed assuming the intractability of the learning with errors (LWE) problem [Reg09]. In effect, this shows that efficient classical provers cannot succeed in the proof of quantumness, unless LWE is classically tractable. Subsequent works have also shown that TCFs can be based on other problems assumed to be classically intractable, such as factoring, the discrete logarithm problem or ring learning with errors [KMCVY22]. Additionally, TCF-based proofs of quantumness can also be made non-interactive in the random oracle model [BKVV20].
In all of these cases, however, to succeed in the protocol the ideal quantum prover must evaluate the TCFs coherently and this requires, at best, logarithmic quantum depth [GH20].
It is thus the case that, on the one hand, we have statistical tests of quantum advantage that are suitable for NISQ computations but which either require exponential runtime or do not provide formal guarantees of verifiability. On the other hand, we have proofs of quantumness based on plausible computational assumptions, but that are not suitable for NISQ devices, as they require running deep quantum circuits. Is it possible to bridge the gap between the two approaches? One step towards that goal would be to construct proofs of quantumness where the prover is only required to perform constant-depth quantum circuits (together with short-depth classical circuits). This would also answer an important theoretical question: can one achieve quantum advantage with constant-depth quantum circuits while also being able to classically verify the results in polynomial time? This is the main result of our work: we give two proof of quantumness constructions in which the prover's evaluation can be performed in constant quantum depth and logarithmic classical depth. For the purposes of certifying quantum advantage, this leads to highly depth-efficient proofs of quantumness. Both constructions also yield depth-efficient protocols for certifiable randomness generation, based on the scheme from [BCM + 18]. The first construction is a generic compiler that can take existing proof of quantumness protocols, based on TCFs, and convert them into constant-depth versions. The second construction uses a specific TCF based on the learning with rounding (LWR) problem [BPR12] and achieves circuits of smaller width and with some amount of noise robustness compared to the generic construction.

Proofs of quantumness
To explain our approach, we first need to give a more detailed overview of TCF-based proof of quantumness protocols. As the name suggests, the starting point is trapdoor claw-free functions. A TCF, denoted as f, is a type of 2-to-1 one-way function: a function that can be evaluated efficiently (in polynomial time) but which is intractable to invert. The fact that the function is 2-to-1 means that there are exactly two preimages for each image of the function. The function also has an associated trapdoor which, when known, allows for efficiently inverting f(x), for any x. Finally, "claw-free" means that, without knowledge of the trapdoor, it should be intractable to find a pair of distinct preimages, $x_0$, $x_1$, such that $f(x_0) = f(x_1)$. Such a pair is known as a claw.
For many of the protocols developed so far, an additional property is required, known as the adaptive hardcore bit property and first introduced in [BCM + 18]. Intuitively, this says that for any $x_0$ it should be computationally intractable to find even a single bit of $x_1$, whenever $f(x_0) = f(x_1)$. As was shown in [KMCVY22], this property is not required in order to construct proof of quantumness protocols, provided one adds an additional round of interaction in the protocol, as will become clear later. We will refer to TCFs having the adaptive hardcore bit property as strong TCFs. More formally, there exists $\lambda_0 > 0$, such that for any $\lambda > \lambda_0$, known as the security parameter, a strong TCF, f, is a 2-to-1 function which satisfies the following properties:
1. Efficient generation. There is a poly(λ)-time algorithm that can generate a description of f as well as a trapdoor, $t \in \{0, 1\}^{\mathrm{poly}(\lambda)}$.
6. Adaptive hardcore bit. Any poly(λ)-time algorithm succeeds with probability negligibly close to 1/2 in producing a tuple $(y, x_b, d)$, with b ∈ {0, 1}, such that $x_b$ is a preimage of y and d determines a hardcore bit of the claw $(x_0, x_1)$ with image y.
It should be noted that the properties, as stated here, are not independent of each other. For instance, property 6 implies properties 3 and 5 (and 5 also implies 3). We chose to present the properties this way for the sake of clarity. Without the requirement of an adaptive hardcore bit, we recover the definition of an ordinary or regular TCF. Note that all poly(λ)-time algorithms mentioned above can be assumed to be classical algorithms.
We now outline the proof of quantumness protocol introduced in [BCM + 18]. The classical verifier fixes a security parameter λ > 0 and generates a strong TCF, f, together with a trapdoor t. It then sends f to the prover. The prover is instructed to create the state
$$\frac{1}{\sqrt{2^{\lambda}}} \sum_{b \in \{0,1\},\; x \in \{0,1\}^{\lambda-1}} |b, x\rangle\, |f(b, x)\rangle \qquad (1)$$
and measure the second register, obtaining the result y. Note here that the input to the function was partitioned into the bit b and the string x, of length λ − 1. The string y is sent to the verifier, while the prover keeps the state in the first register,
$$\frac{1}{\sqrt{2}} \left( |0, x_0\rangle + |1, x_1\rangle \right),$$
with $f(0, x_0) = f(1, x_1) = y$. The string y essentially commits the prover to its leftover quantum state. The verifier will now instruct the prover to measure this state in either the computational basis, referred to as the preimage test, or the Hadamard basis, referred to as the equation test, and report the result. For the preimage test, the verifier simply checks whether the reported $(b, x_b)$ of the prover satisfies $f(b, x_b) = y$. For the equation test, the prover will report $(b', d) \in \{0, 1\} \times \{0, 1\}^{\lambda-1}$ and the verifier checks whether
$$d \cdot (x_0 \oplus x_1) = b'. \qquad (2)$$
In this case, the verifier has to use the trapdoor to recover both $x_0$ and $x_1$ from y in order to compute Equation 2. It is clear that a quantum device can always succeed in this protocol by following the steps outlined above. However, the properties of the strong TCF make it so that no polynomial-time classical algorithm can succeed with high probability. At a high level, the reason for this is the following. Suppose a classical polynomial-time algorithm, A, always succeeds in both the preimage test and the equation test. First, run A in order to produce the string y. Then, perform the preimage test with A, resulting in $(b, x_b)$, such that $f(b, x_b) = y$. Since A is a classical algorithm, it can be rewound to the point immediately after reporting y and now instructed to perform the equation test. This will result in the tuple $(b', d)$ such that $d \cdot (x_0 \oplus x_1) = b'$. Importantly, $f(0, x_0) = f(1, x_1) = y$. We therefore have an efficient classical algorithm that yields both a hardcore bit for a claw as well as one of the preimages in the claw. As this contradicts the adaptive hardcore bit property, no such algorithm can exist.
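The message flow of this protocol can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_f` is a hypothetical stand-in, f(b, x) = x XOR (b·s), which is 2-to-1 but not claw-free, and `honest_prover` samples from the ideal measurement statistics by peeking at s, which a real prover would never be given. All names and the parameter `N_BITS` are invented for this sketch.

```python
import secrets

N_BITS = 7  # toy length of x, i.e. lambda - 1 (assumption; far too small for security)

def toy_f(b, x, s):
    """Toy 2-to-1 function f(b, x) = x XOR (b * s). NOT claw-free: illustration only."""
    return x ^ (s if b else 0)

def claw_from_trapdoor(y, s):
    """Return (x0, x1) with f(0, x0) = f(1, x1) = y, using the 'trapdoor' s."""
    return y, y ^ s

def dot(d, x):
    """Inner product of two bit strings (stored as ints) modulo 2."""
    return bin(d & x).count("1") % 2

def honest_prover(s, challenge):
    """Sample (y, answer) from the ideal outcome distribution of the honest prover.

    A real prover would obtain these values by measuring a quantum state; this
    classical stand-in peeks at s to reproduce the same statistics.
    """
    b, x = secrets.randbelow(2), secrets.randbelow(2 ** N_BITS)
    y = toy_f(b, x, s)                      # outcome of measuring the image register
    if challenge == "preimage":
        return y, (b, x)                    # computational-basis measurement
    x0, x1 = claw_from_trapdoor(y, s)
    d = secrets.randbelow(2 ** N_BITS)      # Hadamard-basis outcome d is uniform
    return y, (dot(d, x0 ^ x1), d)          # b' is then fixed by Equation 2

def verifier_accepts(s, challenge, y, answer):
    """Preimage test: f(b, x_b) = y. Equation test: d . (x0 XOR x1) = b'."""
    if challenge == "preimage":
        b, x = answer
        return toy_f(b, x, s) == y
    b_prime, d = answer
    x0, x1 = claw_from_trapdoor(y, s)
    return dot(d, x0 ^ x1) == b_prime

if __name__ == "__main__":
    s = secrets.randbelow(2 ** N_BITS - 1) + 1
    for challenge in ("preimage", "equation"):
        y, answer = honest_prover(s, challenge)
        print(challenge, verifier_accepts(s, challenge, y, answer))
```

The verifier's two checks are the only parts meant to mirror the text directly; everything about the prover is a simulation convenience.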
As explained in [KMCVY22, BKVV20, ZKML + 21], the above argument can be made robust so that the success probabilities of any polynomial-time classical strategy in the two tests satisfy the relation
$$p_{\mathrm{pre}} + 2 p_{\mathrm{eq}} - 2 \leq \mathrm{negl}(\lambda),$$
where $p_{\mathrm{pre}}$ denotes the success probability in the preimage test, $p_{\mathrm{eq}}$ is the success probability in the equation test and negl(λ) is a negligible function in the security parameter λ. The protocol described above crucially relies on the adaptive hardcore bit property to achieve soundness against classical polynomial-time algorithms. Thus far, this property has only been shown for TCFs constructed from LWE [BCM + 18]. Note that the above protocol is also a scheme for certifiable randomness generation: the bit b obtained in the preimage test can be used as statistical randomness.
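To get a feel for this relation, the following sketch rearranges it into an upper bound on the equation-test success probability for a given preimage-test success probability. The function name and the `negl` argument are hypothetical, introduced only for this illustration.

```python
def max_equation_success(p_pre, negl=0.0):
    """Largest p_eq compatible with p_pre + 2 * p_eq - 2 <= negl."""
    return min(1.0, (2.0 + negl - p_pre) / 2.0)

if __name__ == "__main__":
    for p_pre in (1.0, 0.9, 0.5):
        print(p_pre, max_equation_success(p_pre))
```

In particular, ignoring the negligible term, a classical prover that always passes the preimage test can pass the equation test with probability at most 1/2, whereas the honest quantum prover passes both tests and therefore violates the relation.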
Is it possible to construct proof of quantumness protocols based on other computational assumptions than the classical intractability of LWE? Yes, in fact it is not difficult to see that simple proofs of quantumness can be based on the classical intractability of factoring or the discrete logarithm problem (DLP): ask the prover to solve multiple instances of these problems using Shor's algorithm [Sho94]. Since their solutions can be classically verified efficiently and since the problems are assumed to be classically intractable, this immediately yields a proof of quantumness. The issue with doing this is that the prover has to run large instances of Shor's algorithm, which would require a fault-tolerant quantum computer [GE21]. Instead, as was shown in [KMCVY22], one can construct proofs of quantumness based on factoring or DLP, in which the prover can implement smaller circuits than those required for Shor's algorithm. Such protocols would then be more amenable to experimental implementation on near-term devices.
Let us briefly outline the approach in [KMCVY22]. The idea is to consider TCFs that need not satisfy the adaptive hardcore bit property. Such TCFs can be constructed from more varied computational assumptions than LWE, including factoring, DLP or the ring-LWE problem [LPR10]. All of these are generally considered to be standard computational assumptions. Having such a TCF, the protocol then proceeds in the same way as the one outlined above: the verifier requests that the prover prepare the state in 1, measure the function register obtaining the string y and then send it to the verifier. The prover will be left with the state from 1. As before, the verifier will then instruct the prover to perform either a preimage test or an equation test. The preimage test is unchanged: the prover is asked to measure the state from 1 in the computational basis and report back the result.
For the equation test, however, the verifier will first sample a random string $v \in \{0, 1\}^{\lambda}$ and send it to the prover. The prover must then prepare a state that depends on v. The x register of this state is measured in the Hadamard basis, resulting in the string $d \in \{0, 1\}^{\lambda-1}$ which is sent to the verifier. Upon receiving d, the verifier chooses a random φ ∈ {π/4, −π/4} and asks the prover to measure its remaining qubit in a basis rotated by φ. Denoting by b ∈ {0, 1} the prover's response, the verifier uses d and the trapdoor to determine which b is the likely outcome of the measurement and accepts if that matches the prover's response. The last step in the protocol is reminiscent of the honest quantum strategy in the CHSH game for violating Bell's inequality [CHSH69]. In fact, much like in the CHSH game, the success probability of any classical prover in this protocol is upper bounded by 0.75 + negl(λ), whereas a quantum prover can succeed with probability $\cos^2(\pi/8) \approx 0.85$. For this reason, the authors of [KMCVY22] refer to the protocol as a computational Bell test.
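The value $\cos^2(\pi/8)$ can be checked numerically. The sketch below assumes, as in the CHSH-style analysis, that the prover's remaining qubit is one of |0⟩, |1⟩, |+⟩, |−⟩ and that the rotated basis for angle φ consists of cos(φ/2)|0⟩ + sin(φ/2)|1⟩ and its orthogonal complement. Both are assumptions made for this illustration; the precise states and basis are specified in [KMCVY22].

```python
import math

def xz_state(theta):
    """Real qubit state cos(theta/2)|0> + sin(theta/2)|1>."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def likely_outcome_probability(theta, phi):
    """Probability of the more likely outcome when measuring xz_state(theta)
    in the basis {xz_state(phi), xz_state(phi + pi)}."""
    psi, basis0 = xz_state(theta), xz_state(phi)
    p0 = (psi[0] * basis0[0] + psi[1] * basis0[1]) ** 2
    return max(p0, 1 - p0)

# Assumed single-qubit states, written as angles in the X-Z plane.
STATES = {"|0>": 0.0, "|1>": math.pi, "|+>": math.pi / 2, "|->": -math.pi / 2}
ANGLES = (math.pi / 4, -math.pi / 4)

probabilities = [likely_outcome_probability(theta, phi)
                 for theta in STATES.values() for phi in ANGLES]

if __name__ == "__main__":
    print(min(probabilities), math.cos(math.pi / 8) ** 2)
```

Under these assumptions every state and angle combination gives the same bias, which is why the honest prover's success probability does not depend on the verifier's choice of φ.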
The soundness against classical polynomial-time algorithms follows from a similar rewinding argument to the one outlined for the previous protocol, which used a strong TCF. The main difference is that in this case the verifier introduces an additional challenge for the prover, in the form of the string v and the bit m, from the modified equation test. This equation test is still checking for a hardcore bit of a claw, but unlike the previous protocol, the hardcore bit is no longer adaptive. Intuitively, this is because the verifier chooses which hardcore bit to request; a choice encapsulated by v and m. For more details, we refer the reader to [KMCVY22].

Our results
In the proofs of quantumness outlined above, the honest quantum prover needs to coherently evaluate a TCF in order to pass the verifier's tests. A first step towards making the protocol depth-efficient would be to make it so that the prover can evaluate the TCF in constant quantum depth. In fact, all that is required is for the prover to prepare the state from 1 in constant depth, since the remaining operations can also be performed in constant depth. To that end, we first give a generic construction allowing the prover to prepare the state in 1, in constant depth, for all existing TCFs. We then consider a second construction with a TCF based on the learning with rounding (LWR) problem [BPR12] (a problem that is, for all intents, equivalent to LWE in terms of computational intractability) in which the prover will prepare a state that is essentially equivalent to that in 1. The advantage of this second construction is that the resulting circuits have smaller depth, smaller width (requiring fewer qubits) and have a certain degree of noise robustness, compared to the generic construction. The first construction is presented in detail in Section 3, while the second is in Section 4.

First construction: a generic compiler
We start with the observation from [GH20] that the strong TCFs based on LWE can be evaluated in classical logarithmic depth. In fact this also holds for the TCFs based on factoring, DLP and ring-LWE from [KMCVY22]. As in [GH20], one can then construct randomized encodings for these TCFs, which can be evaluated by constant depth classical circuits. A randomized encoding of some function, f, is another function, denoted $\hat{f}$, which is information-theoretically equivalent to f. In other words, f(x) can be uniquely and efficiently decoded from $\hat{f}(x, r)$, for any x and for a uniformly random r. In addition, there is an efficient procedure for outputting $\hat{f}(x, r)$, given only f(x). That is to say that $\hat{f}(x, r)$ contains no more information about x than f(x) itself. The formal definition of randomized encodings is given in Subsection 2.4. It was shown in [AIK04] that all functions computable by log-depth circuits admit randomized encodings that can be evaluated in constant depth. However, this doesn't immediately imply that a quantum prover can coherently evaluate these encodings in constant depth. The reason is that these circuits will typically use gates of unbounded fan-out. These are gates that can create arbitrarily many copies of their output. But the gate set one typically considers for quantum computation has only gates of bounded fan-out (single-qubit rotations and the two-qubit CNOT, for instance). How then can the prover evaluate the randomized encoding in constant depth with gates of bounded fan-out?
The key observation is that we do not require the prover to be able to evaluate $\hat{f}$ coherently on an arbitrary input, merely on a uniform superposition over classical inputs. One of our main results is then the following:
Theorem 1.1 (informal). There is a strategy consisting of alternating constant depth quantum circuits and logarithmic-depth classical circuits for preparing the state
$$\frac{1}{\sqrt{2^{n}}} \sum_{x \in \{0,1\}^{n}} |x\rangle\, |\hat{f}(x)\rangle$$
up to an isometry, for any $\hat{f}$ on n-bit inputs that can be evaluated by a constant-depth classical circuit, potentially including unbounded fan-out gates.
To prove this result, we use an idea from the theory of quantum error-correction. It is known that cat states (also known as GHZ states) cannot be prepared by a fixed constant-depth quantum circuit [WKST19]. However, if we can interleave short-depth quantum circuits (and measurements) with classical computation, it is possible to prepare cat states in constant quantum-depth. This is akin to performing corrections in quantum error correction, based on the results of syndrome measurements.
In our case, this works as follows. First, prepare a poor man's cat state in constant depth, as described in [WKST19]. This is a state of the form
$$X(w)\, \frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + |1\rangle^{\otimes n} \right), \qquad X(w) = \bigotimes_{i=1}^{n} X^{w_i},$$
where w is a string in $\{0, 1\}^n$ and with X denoting the Pauli-X qubit flip operation. As explained in [WKST19], the constant-depth preparation of the poor man's cat state involves a measurement of the parities of neighboring qubits. In other words, the measurement yields the string $z \in \{0, 1\}^{n-1}$, with $z_i = w_i \oplus w_{i+1}$. Using a log-depth classical circuit, this parity information can be used to determine either w or its binary complement. One then applies the correction operation X(w) to the poor man's cat state, thus yielding the desired cat state
$$\frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + |1\rangle^{\otimes n} \right).$$
Having multiple copies of cat states, it is possible to replicate the effect of unbounded fan-out classical gates on a uniform input. To see why, consider the following example. Suppose we have a classical AND gate, having fan-out n. On inputs a, b ∈ {0, 1}, it produces the output $c \in \{0, 1\}^n$, with $c_i = a \wedge b$, for all i ∈ [n]. To perform the same operation with bounded fan-out gates, it suffices to have n copies of a and b. That is, given $\bar{a}_i = a$ and $\bar{b}_i = b$ for all i ∈ [n], one can compute $c_i = \bar{a}_i \wedge \bar{b}_i$ using n parallel AND gates. This is illustrated in Figure 1.
Figure 1: The left-hand side shows an AND gate with fan-out n. The right-hand side is its bounded fan-out equivalent. Here $\bar{a}_i = a$ and $\bar{b}_i = b$. Gates of unbounded fan-out can be implemented with bounded fan-out as long as sufficient copies of the inputs are provided.
In our case, each input qubit to the classical function is of the form $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Replacing it with n copies is equivalent to using a cat state $\frac{1}{\sqrt{2}}(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})$. As mentioned, the prover can prepare cat states in constant depth using the "measure-and-correct" trick. It then follows that the prover can also prepare the state where each bit of x is encoded as a cat state having the same number of qubits as the number of input copies required to evaluate f with bounded fan-out gates.
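The classical side of the "measure-and-correct" trick, and the fan-out example, can be sketched as follows. This is a purely classical illustration with hypothetical function names: it tracks only the Pauli-X pattern w, not the quantum state, and it computes the prefix XOR sequentially even though the same values can be obtained by a log-depth parallel-prefix circuit.

```python
import random

def neighbour_parities(w):
    """z_i = w_i XOR w_{i+1}: the parities revealed by the measurement."""
    return [w[i] ^ w[i + 1] for i in range(len(w) - 1)]

def flips_from_parities(z):
    """Recover w up to complement as the prefix XOR of z, fixing the first bit to 0."""
    w = [0]
    for z_i in z:
        w.append(w[-1] ^ z_i)
    return w

def residual_flips(w_true, w_guess):
    """Net Pauli-X pattern left after applying X(w_guess) to X(w_true)|cat>."""
    return [a ^ b for a, b in zip(w_true, w_guess)]

def fanout_and(a, b, n):
    """AND gate with fan-out n: n copies of a AND b."""
    return [a & b] * n

def bounded_fanout_and(a_copies, b_copies):
    """n parallel fan-out-1 AND gates acting on n copies of each input."""
    return [a_i & b_i for a_i, b_i in zip(a_copies, b_copies)]

if __name__ == "__main__":
    rng = random.Random()
    w = [rng.randrange(2) for _ in range(8)]
    guess = flips_from_parities(neighbour_parities(w))
    print(w, guess, residual_flips(w, guess))
```

The residual pattern is always all zeros or all ones. Both act trivially on a cat state, since flipping every qubit swaps its two branches, which is why recovering w only up to complement is enough.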
With the ability to prepare the state from 1 (or one equivalent to it, such as the one from 5) in constant quantum depth, the honest prover can then proceed to perform the rest of the steps in the proof of quantumness protocols outlined above. It will measure the image register and report the result to the verifier. The remaining operations can also be performed in constant depth. For the preimage test, the prover simply measures the x register in the computational basis and reports the result. For the equation test, the prover needs to first apply a layer of Hadamard gates to the x register before measuring it in the computational basis. Lastly, for the Bell-type measurement required in the protocol of [KMCVY22], a slightly more involved procedure is used to perform the measurement in constant depth. All of these steps are described in detail in Subsection 3.1.
While we have outlined a procedure for the prover to perform its operations in constant quantum depth, using a randomized encoding of a TCF, it is not immediately clear if we need to also modify the verifier's operations. Indeed, one question that is raised by this approach is whether a randomized encoding of a TCF preserves all the properties of a TCF. If, for instance, the trapdoor property is not preserved, the verifier would be unable to check the prover's responses in the equation test. Our second result resolves this issue: Theorem 1.2 (informal). A randomized encoding of a (strong) TCF is a (strong) TCF.
This theorem implies that substituting the TCFs used in proofs of quantumness with randomized encodings will not affect the soundness of those protocols. The proof can be found in Subsection 3.2. A similar result was derived in [AIK04], where the authors show that randomized encodings of cryptographic hash functions are also cryptographic hash functions. A (strong) TCF is different, however. To prove this result, first note that most of the TCF properties follow almost immediately from the definition of a randomized encoding. The more challenging parts concern the existence of a trapdoor and the adaptive hardcore bit property. To show these, we require that the randomized encoding satisfies a property known as randomness reconstruction [AIK04]. This states that whenever there is an efficient procedure to invert the original function, f, there should also be an efficient procedure for inverting $\hat{f}$. In particular, this means that given $\hat{f}(x, r)$ it is possible to recover both the input x and the randomness r. In [AIK04], it's mentioned that the randomized encodings used to "compress" functions to constant depth do satisfy the randomness reconstruction property, but no proof is given. We provide a proof in Appendix B.
With the two results of Theorems 1.1 and 1.2, we have that any proof of quantumness using a log-depth computable TCF can be compiled to constant quantum depth for the prover. All of the results for this construction are presented in Section 3, and in Subsection 3.3 we give a detailed account of the resources required for the prover to perform this evaluation.

Second construction: phase encoding and learning with rounding
The second solution to the problem comes from an attempt to directly parallelize the coherent evaluation of the TCF based on LWE, hence to implement the protocol in [BCM + 18] in constant quantum depth. We start with the observation that the TCF based on LWE contains only mod-q matrix multiplication and mod-q vector addition operations, where q ∈ N is the field size. Since the phases of quantum states have the same periodicity property as the "mod-q" operation, it is natural to consider implementing the mod-q arithmetic with phase Z-rotations ($R_z$ and controlled-$R_z$ gates). In the standard basis, the $R_z$ operation is expressed (up to a global phase) as
$$R_z(\theta) = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix}.$$
Note that, for a given cat state, $|\psi\rangle = \frac{1}{\sqrt{2}}(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})$, applying two $R_z$ phase rotations on distinct qubits results in the phases being added into the relative phase of the state. Specifically, if we were to rotate qubit i by $\theta_i$ and qubit j by $\theta_j$ we would obtain
$$\frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + e^{i(\theta_i + \theta_j)} |1\rangle^{\otimes n} \right).$$
By taking $\theta_i = \frac{2\pi a}{q}$ and $\theta_j = \frac{2\pi b}{q}$, with $a, b \in \mathbb{Z}_q$, we can see that the net effect is a state with a relative phase proportional to (a + b) mod q,
$$\frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + e^{\frac{2\pi i}{q}(a+b)} |1\rangle^{\otimes n} \right). \qquad (6)$$
The key idea is that because these operations commute, they can be implemented in parallel by acting on distinct qubits, yielding a constant depth circuit for performing mod-q arithmetic in phase. We denote the state in Equation 6 as $|\phi(a + b)\rangle$ and refer to it as a phase encoding of a + b. (Strictly speaking, the notation will refer to states with a relative phase of $\frac{2\pi(a+b)}{q} - \frac{\pi}{2}$, for reasons that will become clear later. Additionally, when using this notation we will always assume the phases are multiples of the q'th roots of unity, as in the example outlined above.)
Encoding the values of the LWE-based TCF in phase seems to introduce a problem for the protocol. Recall that in the standard proof of quantumness protocol (outlined in Subsection 1.1) the prover encodes evaluations of the function f in the computational basis. If these values were instead encoded in phase, how would the prover be able to obtain an evaluation, y, of the function? To overcome this obstacle, we consider a different TCF based on a problem known as learning with rounding (LWR) [BPR12,AKPW13]. This problem is equivalent to LWE (for most parameter choices) and was already suggested as a candidate for building TCFs in [BCM + 18]. Specifically, denoting now as f an LWR-based TCF, we take
$$f(b, x) = \lfloor Ax + b \cdot (As + e) \rceil_p,$$
where $A \in \mathbb{Z}_q^{m \times n}$, b ∈ {0, 1}, $x, s \in \mathbb{Z}_q^{n}$ and $e \in \mathbb{Z}_p^{m}$ are vectors and $\lfloor \cdot \rceil_p$ denotes rounding over p. By rounding we mean taking the most significant $\log_2 p$ bits of the result. In this case, the result is a vector and the rounding is performed component-wise, so that the output is a vector with entries in $\mathbb{Z}_p$. Note that all matrix multiplications and additions are performed modulo q with q > p. Intuitively, for small values of e, a typical claw of the function should be (0, x) and (1, x − s). This is due to the fact that the rounding operation takes the most significant bits of the output, which are unlikely to be changed when adding a vector e with small entries, component-wise. We refer the reader to the preliminaries in Section 2 for a more detailed explanation of the function and its parameters.
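The claw structure of a rounding-based function can be illustrated with a toy sketch. It assumes the form f(b, x) = ⌊Ax + b·(As + e)⌉_p with q and p powers of two and with small non-negative noise entries; the parameter sizes, the noise distribution and all function names are hypothetical choices for this example, not those of Section 2.

```python
import random

def round_to_p(v, q, p):
    """Keep the log2(p) most significant bits of v in Z_q (q, p powers of two)."""
    return v // (q // p)

def lwr_f(A, s, e, b, x, q, p):
    """Component-wise rounding of A x + b (A s + e) mod q."""
    out = []
    for row, e_i in zip(A, e):
        a_x = sum(a * x_j for a, x_j in zip(row, x))
        a_s = sum(a * s_j for a, s_j in zip(row, s))
        out.append(round_to_p((a_x + b * (a_s + e_i)) % q, q, p))
    return out

def sample_instance(m, n, q, noise, rng):
    """Uniform A and s, and noise entries drawn from {0, ..., noise}."""
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    s = [rng.randrange(q) for _ in range(n)]
    e = [rng.randrange(noise + 1) for _ in range(m)]
    return A, s, e

if __name__ == "__main__":
    rng = random.Random()
    q, p, m, n = 2 ** 16, 4, 8, 4
    A, s, e = sample_instance(m, n, q, noise=2, rng=rng)
    trials, claws = 200, 0
    for _ in range(trials):
        x = [rng.randrange(q) for _ in range(n)]
        x_shift = [(x_j - s_j) % q for x_j, s_j in zip(x, s)]
        claws += lwr_f(A, s, e, 0, x, q, p) == lwr_f(A, s, e, 1, x_shift, q, p)
    print(claws, "of", trials, "sampled inputs form a claw")
```

With zero noise the pair (0, x), (1, x − s) is always a claw. With small noise it fails only when some component of Ax lies within the noise magnitude of a rounding boundary, which is rare when q/p is much larger than the noise.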
Returning to the idea of the phase encoding, we can now begin to see the reason for choosing this LWR-based function. Consider for the moment the function before rounding,
$$g(b, x) = Ax + b \cdot (As + e).$$
Suppose we were to perform a phase encoding of the entries of this function, which we denote as $|\phi(b, x)\rangle$. Now take the i'th entry of that encoding, $|\phi_i(b, x)\rangle$. It is not difficult to see that if we were to measure $|\phi_i(b, x)\rangle$ in the Hadamard basis (or in this case, measure the operator XX...X, as we have a rotated cat state), the outcome is most likely to be the most significant bit of $g_i(b, x)$. Similarly, if in the phase encoding we used the (q/2)'th roots of unity, instead of the q'th roots of unity, a Hadamard measurement of the encoding would likely yield the second most significant bit. Repeating this $\log_2 p$ times we have a way of probabilistically recovering the output $f_i(b, x) = \lfloor g_i(b, x) \rceil_p$. Of course, due to the probabilistic nature of the measurement, the chance that all bits are recovered correctly will be small. To remedy this issue, we use a classical repetition code. In other words, we view each component of g(b, x) as being repeated several times. When the prover eventually performs its measurements to recover f(b, x) it will take a majority vote for each component. We find that by choosing a suitably large number of repetitions we can make it so that the prover succeeds in evaluating f(b, x) in this way with overwhelming probability.
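The measurement statistics behind this decoding can be sketched numerically. The code assumes a cat state with relative phase 2πg/q − π/2 (the offset mentioned for the phase-encoding notation) and uses the fact that measuring XX...X on $(|0\dots0\rangle + e^{i\theta}|1\dots1\rangle)/\sqrt{2}$ gives outcome +1 with probability (1 + cos θ)/2. Function names, the value q = 16 and the number of repetitions are hypothetical, and the sketch recovers only the most significant bit.

```python
import math
import random

def relative_phase(values, q):
    """Angle accumulated by one rotation of 2*pi*v/q per value, minus the pi/2 offset."""
    return sum(2 * math.pi * v / q for v in values) - math.pi / 2

def prob_outcome_zero(theta):
    """P(XX...X = +1) for (|0...0> + e^{i theta}|1...1>)/sqrt(2)."""
    return (1 + math.cos(theta)) / 2

def decode_msb(g, q, repetitions, rng):
    """Majority vote over repeated measurements of the phase encoding of g."""
    p_zero = prob_outcome_zero(relative_phase([g], q))
    ones = sum(rng.random() >= p_zero for _ in range(repetitions))
    return int(2 * ones > repetitions)

if __name__ == "__main__":
    q, rng = 16, random.Random()
    for g in range(q):
        p_zero = prob_outcome_zero(relative_phase([g], q))
        print(g, g // (q // 2), round(p_zero, 3), decode_msb(g, q, 101, rng))
```

Two things are visible in this toy model. Rotations for several summands add up in the phase, so `relative_phase([a, b], q)` agrees with `relative_phase([(a + b) % q], q)` up to a multiple of 2π. And the single-shot bias towards the correct bit is weakest for values near 0 and q/2, which is where repetition and majority voting matter most.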
Our main result is then the following: To prove this result, we first need to show that the function f indeed satisfies the properties of a strong TCF. The formal proof of this fact can be found in Subsection 4.1, which is mainly about showing the adaptive hardcore bit property, as all other properties are fairly straightforward.
We next discuss the protocol itself, which is essentially unchanged from that of [BCM + 18], except that it uses the LWR-based TCF. What does change is the prover's honest strategy for coherently evaluating this TCF. As mentioned, for this rounding-based function it is possible to coherently evaluate the function in phase, leading to a state that is essentially equivalent (up to an isometry) to the one in 1. To ensure that all mod-q operations required to prepare this state can be performed in parallel, the cat states that serve as the basis for the phase encoding must have Ω(n log q) qubits. Here, n is the number of columns of the matrix A (the number of terms summed in each component of Ax) and, since each component is modulo q, this also contributes a multiplicative log q factor. As mentioned, we also need to repeat each component in order to guarantee that measurements of the phase-encoded Z register yield a valid image with high probability. We find that the number of repetitions must be $\Omega(n^4 \log^2 n)$ to have a small probability of incorrectly decoding from measurement.
Lastly, we show that the state in the preimage registers, BX, has high overlap with a superposition of preimages, as in the standard version of the protocol. The proof of this fact is based on the observation that while the states $|\phi(b, x)\rangle$ and $|\phi(b', x')\rangle$ are not exactly orthogonal whenever $((b, x), (b', x'))$ does not constitute a claw, they are sufficiently close to orthogonal for most choices of the matrix A. More specifically, we can show that if A is uniformly sampled from $\mathbb{Z}_q^{m \times n}$, the overlap between distinct $|\phi(b, x)\rangle$ states decays exponentially in m. On the other hand, if $((b, x), (b', x'))$ does form a claw, we can show that the overlap of $|\phi(b, x)\rangle$ and $|\phi(b', x')\rangle$ is negligibly close to 1. From these facts and the trace-preserving nature of the operations involved, it follows that the state in the preimage register will have high overlap with a superposition of preimages, upon the prover measuring the image register, Z.
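The two overlap regimes can be illustrated with a simplified model. It assumes one phase-encoded cat state per component and ignores the repetition code, so that the overlap of two encodings whose i'th components differ by $\delta_i$ is the product of $|\cos(\pi \delta_i / q)|$. The function names and parameters are hypothetical, and the uniform differences below stand in for what a uniformly sampled A would produce on a non-claw pair.

```python
import math
import random

def component_overlap(delta, q):
    """|<phi(g)|phi(g')>| for one phase-encoded cat state, with delta = g - g' mod q."""
    return abs(math.cos(math.pi * delta / q))

def overlap(deltas, q):
    """Overlap of two m-component encodings as a product over components."""
    result = 1.0
    for delta in deltas:
        result *= component_overlap(delta, q)
    return result

if __name__ == "__main__":
    rng, q = random.Random(), 2 ** 16
    for m in (8, 16, 32, 64):
        non_claw = [rng.randrange(q) for _ in range(m)]   # uniform differences
        claw = [rng.randrange(3) for _ in range(m)]       # small noise-sized differences
        print(m, overlap(non_claw, q), overlap(claw, q))
```

In this model the non-claw overlap shrinks roughly geometrically with m, while the claw overlap stays close to 1 as long as the differences are small compared to q.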
An important observation about this construction is that it requires one to perform phase rotations in increments of $\frac{2\pi}{q}$. While such rotation operations are already native to most existing quantum computing architectures, it is also possible to use a constant-size gate set at the expense of making the circuit polynomially wider. This is achieved by approximating the rotation gates to within inverse-polynomial error through the repetition of a fixed set of rotations (see Remark 3.5 in [HŠ05]).
Our second construction is thus an instantiation of the protocol in [BCM + 18] with an LWR-based TCF and having the prover perform a phase-encoded evaluation of that function. The main appeal of this construction is that it is much simpler than the generic construction from the previous section and achieves circuits with fewer qubits. Specifically, as computed in Subsections 3.3 and 4.4, for a security parameter $\lambda > 0$, the generic construction uses $O(\lambda^{33})$ qubits, whereas the LWR-based one uses $O(\lambda^{8} \log^{3} \lambda)$. Additionally, the use of the repetition code and the error-correcting properties of LWR offer the scheme some level of robustness against noise. For the full details and proofs related to this construction, see Section 4.

Related work
One of the first efficient computational tests of quantum advantage was proposed in [SB09], for certifying that a quantum prover can perform instantaneous quantum polynomial-time computations (IQP). However, that test was based on a non-standard hardness assumption and it was later shown that there is an efficient classical algorithm which passes the test [KM19].
The first proof of quantumness based on LWE originated with the work of Brakerski et al. [BCM + 18]. This is the proof of quantumness based on a strong TCF outlined in the introduction. As explained there, the protocol also serves as a certifiable random number generator. A subsequent work achieved a non-interactive version of this protocol in the quantum random-oracle model [BKVV20]. Notably, in that protocol the adaptive hardcore bit property is not required, however the protocol does make use of a hash function (in addition to the TCF) modeled as a random oracle.
The second proof of quantumness we outlined, based on regular TCFs, was introduced in [KMCVY22]. There the authors achieve more efficient proofs of quantumness by removing the requirement of the adaptive hardcore bit and using TCFs having a lower circuit complexity compared to the ones based on LWE. However, as mentioned, the cost of doing this is introducing additional rounds of interaction between the verifier and the prover (in the form of the Bell-like measurement of the equation test).
In terms of constant quantum depth constructions, it is interesting to contrast our work to that of [CSV21]. There, the authors proposed a protocol for certifiable random-number generation with constant depth quantum circuits. The first difference with respect to our work is that [CSV21] do not base the soundness of their protocol on the classical intractability of some computational problem, such as LWE. Instead, the protocol assumes that the "prover" generating the randomness is a circuit of sub-logarithmic depth (showing that sub-logarithmic classical circuits would not succeed in this task). The second difference is that our protocols require interleaving constant depth quantum circuits with logarithmic depth classical computation, whereas the protocol in [CSV21] only requires the application of a constant depth quantum circuit. Finally, our protocols are interactive, whereas [CSV21] is not.
We also mention the independent work of Hirahara and Le Gall that appeared before ours and which also gives a constant-depth proof of quantumness [HG21]. Similar to our work, they also considered one of the existing proofs of quantumness and made it so that the prover could perform its operations in constant quantum depth and using log-depth classical computations. In their case, they use a technique inspired from measurement-based quantum computing to have the prover perform the coherent evaluation of the strong TCF based on LWE. Notably, their prover evaluates that function in the computational basis, unlike our LWR-based scheme which performs the evaluation in phase.
Lastly, we also point out the work of Høyer and Špalek showing that a large class of quantum algorithms can be implemented in constant depth with quantum gates of unbounded fan-out [HŠ05]. In particular, the quantum subroutine of Shor's algorithm can be performed this way. It should then be possible to use the same trick of reproducing unbounded fan-out with bounded fan-out gates, through measurements and classical corrections, as we did for both our constructions. This would then yield a factoring algorithm that uses only constant-depth quantum circuits. There are, however, two downsides to doing this, compared to our approach. First, the resulting algorithm would use classical circuits of supra-logarithmic depth (see also [CW00] for a discussion of this point), in contrast to the logarithmic-depth circuits that we obtain [Gal22]. Second, the resulting circuits for factoring would be significantly larger compared to the circuits obtained in our constructions.

Discussion and open problems
We've shown how existing proof of quantumness protocols can be made to work with a prover that performs constant-depth quantum computations and log-depth classical computations. Thus, all protocols based on TCFs can be compiled to constant-depth versions using randomized encodings and preparations of cat states.
One potential objection to our result is the practicality of this construction. The prover must not only run constant-depth quantum circuits, but it must do so based on the outcomes of previous measurements or based on instructions from the verifier. This is similar to syndrome measurements and corrections in quantum error-correcting codes and so it might seem as if the prover must have the capability of doing fault-tolerant quantum computations. In fact this is not the case. For the protocols based on strong TCFs the number of quantum-classical interleavings (that is, the number of alternations between performing a constant-depth quantum circuit followed by a log-depth classical circuit) is exactly three. The first is required for the preparation of cat states. In this case, the prover simply needs to apply X corrections conditioned on the outcomes of certain parity measurements. The prover then evaluates the randomized-encoded TCF and measures one of its registers, sending that result to the verifier. Conditioned on the verifier's response, it either measures the remaining state in the computational basis or in the Hadamard basis. Similar operations are performed for the LWR-based construction. The prover, therefore, needs to do only a very restricted type of conditional operations and is only required to do this three times. Furthermore, the protocol is robust and some degree of noise is acceptable, provided Inequality 3 is violated. When using regular TCFs, in the generic compilation scheme, the protocol requires two additional quantum-classical interleavings, for a total of five. This is due to the Bell-like measurement of that protocol. In both cases, only a small number of quantum-classical interleavings are required, unlike in a fully fault-tolerant computation where many such interleavings would be required [FMMC12].
It would, of course, be desirable to have a single-round proof of quantumness with a constant-depth prover and no quantum-classical interleavings. In other words, a protocol in which the prover has to run a single constant-depth quantum circuit and the verifier is able to efficiently certify that the prover is indeed quantum. Such a result would yield a weak separation between polynomial-time classical computation and constant-depth quantum computation. Basing such a separation on just the classical hardness of LWE seems unlikely$^{8}$. Basing it on the classical intractability of factoring or DLP seems more realistic, as those assumptions already yield a separation between polynomial-time classical computation and logarithmic-depth quantum computation [CW00]. However, it is unclear how to adapt the existing protocols, which rely on a commit-and-test approach that requires at least two rounds of interaction. We leave answering this question as an interesting open problem.
Finally, the computational resources required to implement our constant-depth proofs of quantumness are still too high for existing quantum devices. In particular, the resulting quantum circuits can be prohibitively wide to be implemented on existing NISQ devices. However, as we've seen, different implementations can lead to very different qubit requirements. Rough estimates show that our generic construction requires $O(\lambda^{33})$ qubits, while the LWR-based one requires $O(\lambda^{8} \log^{3} \lambda)$. These substantially different estimates give us some hope that further reducing the qubit requirements is possible. Additional optimizations are likely also possible when considering specific values for the security parameter and the choice of TCF. We therefore also leave as an open problem to reduce the width of these constructions so as to make the protocols better suited for use on near-term devices.

Acknowledgements
AG is supported by Dr. Max Rössler, the Walter Haefner Foundation and the ETH Zürich Foundation.

Notation and basic concepts
We let $\mathbb{N}$ denote the set of natural numbers, $\mathbb{Z}$ the set of integers, $\mathbb{Z}_q$ the set of integers modulo $q$, and $\mathbb{R}$ the set of real numbers. The set $\{0, 1\}^n$ denotes all binary strings of length $n$. For some binary string $v = v_1 v_2 \ldots v_n \in \{0, 1\}^n$, we denote as $|v|$ the Hamming weight of $v$, which is defined as the number of 1's in $v$, or
$$|v| = \sum_{i=1}^{n} v_i.$$
The xor of two bits $a, b$ is $a \oplus b = a + b \bmod 2$. This extends to strings so that for $v, w \in \{0, 1\}^n$, $v \oplus w$ is their bitwise xor. The Hamming distance of the strings $v$ and $w$ is then defined as:
$$d_H(v, w) = |v \oplus w|.$$
We will also make use of the bitwise inner product of two strings, defined as:
$$v \cdot w = \sum_{i=1}^{n} v_i w_i \bmod 2.$$
For a bit $b \in \{0, 1\}$, we will use $\bar{b}$ to denote a binary string consisting of copies of $b$. That is, $\bar{b} = bbb\ldots b$. The number of copies will generally be clear from the context and will otherwise be specified. We also extend this notation to binary strings. For some string $v \in \{0, 1\}^n$, $\bar{v}$ will denote a string in which each bit of $v$ has been repeated.
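As a concrete illustration of the string notation above, the following is a minimal Python sketch. The function names (`hamming_weight`, `xor`, `hamming_distance`, `inner_product`, `repeat_bits`) are illustrative choices and do not appear in the paper; strings are represented as Python `str` objects over the characters `0` and `1`.

```python
def hamming_weight(v: str) -> int:
    """Number of 1's in the binary string v."""
    return v.count("1")


def xor(v: str, w: str) -> str:
    """Bitwise xor of two binary strings of equal length."""
    if len(v) != len(w):
        raise ValueError("strings must have the same length")
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def hamming_distance(v: str, w: str) -> int:
    """Hamming distance, computed as the weight of the bitwise xor."""
    return hamming_weight(xor(v, w))


def inner_product(v: str, w: str) -> int:
    """Bitwise inner product modulo 2."""
    if len(v) != len(w):
        raise ValueError("strings must have the same length")
    return sum(int(a) & int(b) for a, b in zip(v, w)) % 2


def repeat_bits(v: str, k: int) -> str:
    """Bar notation: repeat each bit of v exactly k times."""
    return "".join(bit * k for bit in v)
```

For example, `repeat_bits("10", 3)` produces the six-bit string in which the first bit is repeated three times, followed by three copies of the second bit.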
For any finite set $X$, we let $x \leftarrow_r X$ denote an element drawn uniformly at random from $X$. The total variation distance between two density functions $f_1$, $f_2$ over the same finite domain $X$ is
$$\mathrm{TVD}(f_1, f_2) = \frac{1}{2} \sum_{x \in X} |f_1(x) - f_2(x)|.$$
For an element $r \in \mathbb{Z}_q$, its unique representative will be $[r]_q \in (-q/2, q/2) \cap \mathbb{Z}$. Following [BCM + 18], we use the notation $|r| = |[r]_q|$. For any vector $v$ of $n$ components, its $\ell_2$-norm is defined as
$$\|v\|_2 = \sqrt{\sum_{i=1}^{n} |v_i|^2}.$$
The Hellinger distance between $f_1$ and $f_2$ is
$$H^2(f_1, f_2) = 1 - \sum_{x \in X} \sqrt{f_1(x) f_2(x)}.$$
For any discrete probability distribution $p(x)$, its support is defined as the set of points where the distribution is positive,
$$\mathrm{Supp}(p) = \{x : p(x) > 0\}.$$
For a positive $B \in \mathbb{R}$ and positive integer $q$, the truncated discrete Gaussian distribution over $\mathbb{Z}_q$ with parameter $B$ is supported on $\{x \in \mathbb{Z}_q : \|x\| \le B\}$ and has density
$$D_{\mathbb{Z}_q, B}(x) = \frac{e^{-\pi \|x\|^2 / B^2}}{\sum_{x' \in \mathbb{Z}_q,\, \|x'\| \le B} e^{-\pi \|x'\|^2 / B^2}}.$$

We let $\mathrm{negl}(x)$ denote a negligible function. A function $\mu : \mathbb{N} \to \mathbb{R}$ is negligible if for any positive polynomial $p(x)$ there exists an integer $N > 0$ such that for all $x > N$ it is the case that
$$\mu(x) < \frac{1}{p(x)}.$$
We sometimes abbreviate polynomial functions as poly. Throughout the paper, $\lambda$ will denote the security parameter. This will be polynomially related to the input size of all functions we consider. Consequently, all polynomial and negligible functions will scale in $\lambda$.

Letting $g_i \in \mathbb{Z}_q$ with $q \ge 2$, the (mod-$q$) phase encoding of $g_i$ is defined as

In terms of quantum information, we follow the usual formalism as outlined, for instance, in [NC02]. All Hilbert spaces are finite dimensional. We use sans-serif font to label spaces that correspond to certain quantum registers. For instance, X will correspond to an $n$-qubit Hilbert space of inputs to a function. We also extend the bar notation from strings to quantum states. So, for instance, $|\bar{0}\rangle = |00\ldots0\rangle$. The multi-qubit cat state can then be written as $|\psi\rangle = \frac{1}{\sqrt{2}}(|\bar{0}\rangle + |\bar{1}\rangle)$.

We now recall some standard notions of classical and quantum computation. For more details, we refer the reader to [AB09,NC02].
• The notion of computational efficiency will refer to algorithms or circuits that run in polynomial time.
• We say that an algorithm (or Turing machine) is PPT if it uses randomness and runs in polynomial time. We say it is QPT if it is a quantum algorithm running in polynomial time.
• All Boolean circuits we consider are comprised of AND, OR, XOR and NOT gates.
• We say that a classical gate has bounded fan-out if the number of output wires is constant (independent of the length of the input to the circuit). Otherwise, we say it has unbounded fan-out.
• For quantum computation we assume the standard circuit formalism with the gate set $\{R_X, R_Y, R_Z, H, CZ, CNOT, CCNOT\}$ and computational basis measurements. Here, $R_X$, $R_Y$, $R_Z$ denote rotations along the X, Y and Z axes of the Bloch sphere. More precisely, $R_P(\theta) = e^{-i \theta P / 2}$, with $P \in \{X, Y, Z\}$, the set of Pauli matrices. The allowed rotation angles can be assumed to be multiples of $\pi/4$. In addition, $H$ is the Hadamard operation, $CZ$ is a controlled application of a Pauli-Z gate, $CNOT$ is a controlled application of a Pauli-X gate and $CCNOT$ is a doubly-controlled Pauli-X operation, also known as a Toffoli gate. It should be noted that, apart from $CCNOT$, a number of the existing quantum devices can indeed perform all of these gates natively [AAB + 19, AAMA + 21, WBD + 19].
We say that a computational problem is intractable if there is no polynomial-time algorithm solving that problem. Throughout this paper we are only concerned with computational intractability for PPT algorithms. We give a simplified description of some candidate intractable problems of interest:
• Factoring. Given a composite integer $N$, find its prime-factor decomposition. For the specific case of semiprime $N = p \cdot q$, the task is to find primes $p$ and $q$.
• Discrete logarithm problem (DLP). For some abelian group G, given g ∈ G and g k , with k > 0, find k.
• Learning with errors (LWE). Letting $\mathbb{Z}_q$ be the ring of integers modulo $q \ge 2$, given the matrix $A \in \mathbb{Z}_q^{m \times n}$ and the vector $y = As + e$, with $s \in \mathbb{Z}_q^n$ and $e$ sampled from a discrete Gaussian distribution over $\mathbb{Z}_q^m$, find $s$.
• Ring learning with errors (Ring-LWE). Letting $R_q$ be a quotient ring $R_q = R/qR$, for some (cyclotomic) ring $R$ over the integers, given $m > 0$ pairs $(a_i, y_i)$ with $a_i \in R_q$ and $y_i = a_i \cdot s + e_i$, $i \le m$, $s \in R_q$ and each $e_i$ sampled independently from a discrete Gaussian distribution over $R_q$, find $s$.

Learning with rounding (LWR)
As learning with rounding is the basis for our second proof of quantumness construction, in this subsection we define the problem and state some of its essential properties, taken from [AKPW13].
Definition 2.1 (Rounding function). For integers $q \ge p \ge 2$, the $p$-rounding function of an integer $\alpha$ satisfying $0 \le \alpha < q$ is defined as
$$\lfloor \alpha \rfloor_p = \left\lfloor \frac{p}{q} \cdot \alpha \right\rfloor.$$
As mentioned in Subsection 1.2.2, this rounding operation is equivalent to taking the most significant $\log_2 p$ bits of $\alpha$.
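The equivalence between rounding and bit truncation can be checked directly on small parameters. The sketch below assumes the floor-based convention $\lfloor \alpha \rfloor_p = \lfloor (p/q) \cdot \alpha \rfloor$ and assumes that $q$ and $p$ are both powers of two; the helper names `round_p` and `top_bits` are hypothetical and chosen only for this illustration.

```python
def round_p(alpha: int, q: int, p: int) -> int:
    """p-rounding of alpha in [0, q), assuming the convention floor(p * alpha / q)."""
    if not 0 <= alpha < q:
        raise ValueError("alpha must satisfy 0 <= alpha < q")
    return (p * alpha) // q


def top_bits(alpha: int, q: int, p: int) -> int:
    """Most significant log2(p) bits of alpha, viewed as a log2(q)-bit number.

    Assumes q and p are powers of two with q >= p.
    """
    log_q = q.bit_length() - 1
    log_p = p.bit_length() - 1
    return alpha >> (log_q - log_p)


if __name__ == "__main__":
    q, p = 16, 4
    for alpha in range(q):
        print(alpha, round_p(alpha, q, p), top_bits(alpha, q, p))
```

With $q = 16$ and $p = 4$, the value $\alpha = 13$ (binary 1101) rounds to 3 (binary 11), which is exactly its two leading bits.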
Definition 2.2 (The learning with rounding (LWR) assumption [AKPW13]). Suppose $A \in \mathbb{Z}_q^{m \times n}$, $x \leftarrow_r \mathbb{Z}_q^n$ and $u \leftarrow_r \mathbb{Z}_q^m$, then $(A, \lfloor Ax \rfloor_p)$ and $(A, \lfloor u \rfloor_p)$ are computationally indistinguishable.

Note that this is the decision version of LWR. There is also a search version, in analogy to LWE. The search version is: given $(A, \lfloor Ax \rfloor_p)$, as above, find $x$. Whenever we refer to the "learning with rounding problem" we can use the decision version or the search version interchangeably, as they are equivalent for the parameter choices we use here.

Lemma 2.1 (Trapdoors for LWR [AKPW13]). There exist efficient Gen and LWRInv functions for any $n \ge 1$, $q \ge 2$, $m \ge O(n \log q)$ and $p \ge O(\sqrt{mn \log q})$. In particular, LWRInv is defined as

We also note that for the parameter choices we consider throughout this paper, which are essentially the same as the ones in [BCM + 18] (that is, $m$, $n$, $q$, $\|e\|_\infty$ as functions of the security parameter), LWE and LWR are computationally equivalent. In other words, there exists a polynomial-time reduction from LWE to LWR and vice-versa. We refer the reader to [BPR12,AKPW13] for the details.
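To make the objects in the LWR assumption concrete, the following toy Python sketch samples a pair $(A, \lfloor Ax \rfloor_p)$ and illustrates, on tiny parameters, the intuition that rounding absorbs a small additive error when the value is far from a rounding boundary. The parameters, the helper names (`lwr_sample`, `is_safe`) and the simplifying assumption that $p$ divides $q$ are all choices made for this illustration; they are far too small to be secure and are not the parameters used in the paper.

```python
import random


def round_p(u: int, q: int, p: int) -> int:
    """Rounding of u mod q to Z_p, assuming the convention floor(p * u / q)."""
    return (p * (u % q)) // q


def lwr_sample(m: int, n: int, q: int, p: int, rng: random.Random):
    """Sample a toy LWR instance (A, x, round_p(A x)) with uniform A and x."""
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    x = [rng.randrange(q) for _ in range(n)]
    y = [round_p(sum(a * b for a, b in zip(row, x)), q, p) for row in A]
    return A, x, y


def is_safe(u: int, q: int, p: int, B: int) -> bool:
    """True if u is at least B away from both ends of its rounding interval.

    Assumes p divides q, so every interval has width q // p.
    """
    width = q // p
    offset = (u % q) % width
    return B <= offset < width - B


if __name__ == "__main__":
    rng = random.Random(0)
    A, x, y = lwr_sample(m=6, n=3, q=64, p=4, rng=rng)
    print("rounded samples:", y)
```

When `is_safe(u, q, p, B)` holds, adding any error `e` with `abs(e) <= B` leaves `round_p(u + e, q, p)` unchanged, which is the elementary fact behind relating LWE samples to LWR samples.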

Trapdoor claw-free functions
Most proof of quantumness protocols are based on trapdoor claw-free (TCF) functions or noisy trapdoor claw-free (NTCF) functions. We start with the definition of a TCF, taken from [KMCVY22].
Definition 2.4 (TCF family [KMCVY22]). Let $\lambda$ be a security parameter, $K$ a set of keys, and $X_k$ and $Y_k$ finite sets for each $k \in K$. A family of functions
$$\mathcal{F} = \{f_k : X_k \to Y_k\}_{k \in K}$$
is called a trapdoor claw-free (TCF) family if the following conditions hold:
1. Efficient Function Generation. There exists a PPT algorithm Gen which generates a key $k \in K$ and the associated trapdoor data $t_k$: $(k, t_k) \leftarrow \mathrm{Gen}(1^\lambda)$.
2. Trapdoor Injective Pair. For all keys $k \in K$, the following conditions hold:
(a) Injective pair: Consider the set $R_k$ of all tuples $(x_0, x_1)$ such that $f_k(x_0) = f_k(x_1)$. Let $X_k' \subseteq X_k$ be the set of values $x$ which appear in the elements of $R_k$. For all $x \in X_k'$, $x$ appears in exactly one element of $R_k$; furthermore, $\lim_{\lambda \to \infty} |X_k'|/|X_k| = 1$.
(b) Trapdoor: There exists a polynomial-time deterministic algorithm $\mathrm{INV}_F$ such that for all $y \in Y_k$ and $(x_0, x_1)$ such that $f_k(x_0) = f_k(x_1) = y$, $\mathrm{INV}_F(t_k, y) = (x_0, x_1)$.
3. Claw-free. For any non-uniform probabilistic polynomial time (nu-PPT) classical algorithm $\mathcal{A}$, there exists a negligible function $\mu(\cdot)$ such that
$$\Pr\left[f_k(x_0) = f_k(x_1) \wedge x_0 \ne x_1 \mid (x_0, x_1) \leftarrow \mathcal{A}(k)\right] < \mu(\lambda),$$
where the probability is over both the choice of $k$ and the random coins of $\mathcal{A}$.
4. Efficient Superposition. There exists a polynomial-size quantum circuit that on input a key $k$ prepares the state
$$\frac{1}{\sqrt{|X_k|}} \sum_{x \in X_k} |x\rangle |f_k(x)\rangle.$$

Next, we define the notion of a noisy TCF, first introduced in [Mah18, BCM + 18]. These are TCFs for which the efficient superposition is allowed to be approximate, rather than exact. The outputs of these functions are additionally assumed to be distributions over binary strings, rather than just binary strings. NTCFs, as defined in [BCM + 18], also satisfy a property known as the adaptive hardcore bit, which is independent of the "noisy" aspect of the TCF. As we want to distinguish between TCFs which satisfy this property and those that do not satisfy it, we shall refer to the former as strong TCFs and the latter as ordinary TCFs, as per Definition 2.4. Thus, the NTCFs we consider will be referred to as strong NTCFs:

Definition 2.5 (Strong NTCF Family [BCM + 18]). Let $\lambda$ be a security parameter. Let $X$ and $Y$ be finite sets and $D_Y$ a collection of distributions over $Y$. Let $K_F$ be a finite set of keys. A family of functions is called a strong noisy trapdoor claw-free (strong NTCF) family if the following conditions hold:
1. Efficient Function Generation. Same as in Definition 2.4.

Efficient Range Superposition. For all keys
x ∈ X and y ∈ Y, returns 1 if y ∈ Supp(f k,b (x)) and 0 otherwise. Note that CHK F is not provided the trapdoor t k .
(c) For every k and b ∈ {0, 1}, for some negligible function µ(·). Here H 2 is the Hellinger distance. Moreover, there exists an efficient procedure SAMP F that on input k and b ∈ {0, 1} prepares the state 4. Adaptive Hardcore Bit. For all keys k ∈ K F the following conditions hold, for some integer w that is a polynomially bounded function of λ.
is negligible, and moreover there exists an efficient algorithm that checks for membership in G k,b,x given k, b, x and the trapdoor t k .
then for any quantum polynomial-time procedure A there exists a negligible function As a point of clarification, note that a noisy TCF (NTCF) is a TCF with a modified efficient range superposition property. A strong NTCF is a NTCF with the adaptive hardcore bit property. As mentioned, in [Mah18, BCM + 18] NTCFs are not distinguished from strong NTCFs. As an abuse of notation, we will use NTCF and strong NTCF interchangeably.

The BCMVV protocol
The first protocol we mention is the one from [BCM + 18], which relies on the adaptive hardcore bit property and so the function family used is NTCF. We outlined the protocol in the introduction, while here we give a step-by-step description of its workings, in Figure 2. The protocol is complete, in the following sense:

BCMVV protocol
Let F be an NTCF family of functions. Let λ be a security parameter and N ≥ 1 a number of rounds. The parties taking part in the protocol are a PPT machine, known as the verifier and a QPT machine, known as the prover. They will repeat the following steps N times: 1. The verifier generates (k, t k ) ← Gen(1 λ ). It sends k to the prover.
2. The prover uses $k$ to run $\mathrm{SAMP}_F$ and prepare the corresponding superposition state. It then measures the Y register, resulting in the string $y \in \{0, 1\}^{\mathrm{poly}(\lambda)}$ which it sends to the verifier.

Theorem 2.1 ([BCM + 18]). A QPT prover, P, following the honest strategy in the BCMVV protocol is accepted with probability $1 - \mathrm{negl}(\lambda)$.
The soundness of the protocol against classical provers follows from the following theorem: Theorem 2.2 ([BCM + 18, ZKML + 21]). For any PPT prover, P, in the BCMVV protocol, it is the case that where p pre is P's success probability in the preimage test and p eq is P's success probability in the equation test.
Thus, in any run of the protocol, as long as Inequality 17 is violated, we conclude that the prover is quantum.
One known instantiation of the BCMVV protocol, as is described by [BCM + 18], is based on the LWE problem. The LWE-based construction is currently the only known instance of a strong NTCF family of functions.

The KMCVY protocol
The BCMVV protocol relies on the adaptive hardcore bit property of NTCFs in order to be sound. However, this property is only known to be true for NTCFs based on LWE. The authors of [KMCVY22] addressed this fact by introducing a proof of quantumness protocol that can use any TCF. As mentioned in the introduction, their protocol is a sort of computational Bell test. We outline it in Figure 3. The protocol is complete, in the following sense:

Theorem 2.3 ([KMCVY22]). A QPT prover, P, following the honest strategy in the KMCVY protocol is accepted with probability $1 - \mathrm{negl}(\lambda)$.
The soundness of the protocol against classical provers follows from the following theorem:

KMCVY protocol
Let F be a TCF family of functions. Let λ be a security parameter, N ≥ 1 a number of rounds and T = 1/poly(λ) a threshold parameter. The parties taking part in the protocol are a PPT machine, known as the verifier and a QPT machine, known as the prover. Before interacting with the prover, the verifier initializes two counters Ns = 0, Nt = 0. The two will then repeat the following steps N times: 1. The verifier generates (k, t k ) ← Gen(1 λ ). It sends k to the prover.
2. The prover uses k to prepare the state: It then measures the Y register, resulting in the string y ∈ {0, 1} poly(λ) which it sends to the verifier.
iii. The prover applies Hadamard gates to all qubits in the X register and measures them in the standard basis. The measurement outcome is denoted d ∈ {0, 1} n and is sent to the verifier. iv. The verifier computes (x0, x1) = InvF (t k , y). Together with d, the verifier can determine the current state |γ A ∈ {|0 , |1 , |+ , |− } in the prover's A register. It then chooses a random φ ∈ {π/4, −π/4} and sends it to the prover. v. The prover is expected to measure the qubit in the A register in the basis: vi. The verifier sets Ns ← Ns + 1 if the measurement outcome was the likely one.
If the verifier has not aborted, it will accept if Ns/Nt − 0.75 ≥ T.

Theorem 2.4 ([KMCVY22]). For any PPT prover, P, in the KMCVY protocol, it is the case that where $p_{pre}$ is P's success probability in the preimage test and $p_{Bell}$ is P's success probability in the computational Bell test.
Thus, in any run of the protocol, as long as Inequality 18 is violated, we conclude that the prover is quantum.
In [KMCVY22], the authors provide the following candidate TCFs:
• Rabin's function, or $x^2 \bmod n$. The TCF properties are based on the computational intractability of factoring.
• A Diffie-Hellman-based function. The TCF properties are based on the computational intractability of DLP.
• A ring-LWE-based function. The TCF properties are based on the computational intractability of ring-LWE.
Of course, the NTCF family based on LWE can also be used.
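To give some intuition for why a claw in Rabin's function is tied to factoring, here is a small, insecure Python sketch. It restricts $x^2 \bmod N$ to the lower half of the residues, enumerates all claws by brute force for a toy semiprime, and checks that every claw reveals a prime factor through a gcd computation. The modulus $N = 77$, the domain restriction and the function names are assumptions made for this illustration and are not taken from [KMCVY22].

```python
from math import gcd


def rabin(x: int, N: int) -> int:
    """Rabin's function x^2 mod N."""
    return (x * x) % N


def find_claws(N: int):
    """Brute-force all pairs x0 < x1 in [1, N/2) with rabin(x0) == rabin(x1).

    Only feasible for toy moduli; for a real semiprime this search is
    believed to be as hard as factoring.
    """
    first_preimage = {}
    claws = []
    for x in range(1, (N + 1) // 2):
        y = rabin(x, N)
        if y in first_preimage:
            claws.append((first_preimage[y], x))
        else:
            first_preimage[y] = x
    return claws


def factor_from_claw(x0: int, x1: int, N: int) -> int:
    """A claw satisfies (x1 - x0)(x1 + x0) = 0 mod N, so the gcd is a proper factor."""
    return gcd(x1 - x0, N)


if __name__ == "__main__":
    N = 7 * 11
    for x0, x1 in find_claws(N):
        print(x0, x1, rabin(x0, N), factor_from_claw(x0, x1, N))
```

The gcd is nontrivial because both $x_1 - x_0$ and $x_1 + x_0$ lie strictly between 0 and $N$, so neither can be a multiple of $N$ on its own.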

Randomized encodings
Randomized encodings (also known as garbled circuits [Yao86]) are probabilistic encodings of functions that are information-theoretically equivalent to the functions they encode. The idea of constructing randomized encodings which can be evaluated in constant depth originated with [AIK04]. We restate here the essential definitions and results from that paper.
• Efficient generation. There exists a deterministic polynomial-time algorithm that, given a description of the circuit implementing $f$, outputs a description of a circuit for implementing $\hat{f}$.
A perfect randomized encoding is one for which $\delta = 0$ (perfect correctness) and $\varepsilon = 0$ (perfect privacy). Note that for perfect encodings $f(x)$ can always be reconstructed from $\hat{f}(x, r)$. Additionally, perfect privacy means that $\hat{f}(x, r)$ encodes as much information about $x$ as $f(x)$. An important property of perfect encodings that we will use is that of unique randomness:

Theorem 2.5 (Unique randomness [AIK04]). Suppose $\hat{f}$ is a perfect randomized encoding of $f$. Then for any input $x$, the function $\hat{f}(x, \cdot)$ is injective; namely, there are no distinct $r, r'$ such that $\hat{f}(x, r) = \hat{f}(x, r')$. Moreover, if $f$ is a permutation, then so is $\hat{f}$.
The main result in [AIK04] is the following: Theorem 2.6 ([AIK04]). Any Boolean function that can be computed by a log-depth circuit, admits a perfect randomized encoding that can be computed in constant depth.
In fact a more general result is shown in [AIK04]; however, the result of the above theorem is sufficient for our purposes. We also require the following result:

Lemma 2.2 (Randomness reconstruction). Given $x$ and $\hat{f}(x, r)$, where $\hat{f}$ is a randomized encoding following the construction from [AIK04], there is a deterministic polynomial-time algorithm, denoted Rrc, for computing the randomness $r$.
Note that this property is not universal to randomized encodings, in that it cannot be derived from the definition of randomized encodings. However, the property is satisfied by the specific encodings defined in [AIK04]. This fact is mentioned in [AIK04], however no formal proof is provided. We outline their construction in Appendix A and prove the randomness reconstruction property in Appendix B.
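As a toy illustration of unique randomness and randomness reconstruction, consider the parity function $f(x) = x_1 \oplus \ldots \oplus x_n$ with the encoding $\hat{f}(x, r) = (x_1 \oplus r_1,\; x_2 \oplus r_1 \oplus r_2,\; \ldots,\; x_n \oplus r_{n-1})$, in which every output bit depends on at most three input bits. This example is chosen here only for illustration; it is not the general construction of [AIK04] outlined in Appendix A, and the function names below are hypothetical.

```python
def encode(x, r):
    """Toy randomized encoding of parity: output i is x[i] ^ r[i-1] ^ r[i]."""
    n = len(x)
    if len(r) != n - 1:
        raise ValueError("need exactly n - 1 random bits")
    out = []
    for i in range(n):
        left = r[i - 1] if i > 0 else 0
        right = r[i] if i < n - 1 else 0
        out.append(x[i] ^ left ^ right)
    return out


def decode(y):
    """Recover f(x): every random bit appears twice and cancels in the xor."""
    acc = 0
    for bit in y:
        acc ^= bit
    return acc


def reconstruct_randomness(x, y):
    """Given x and the encoding y, recover r one bit at a time."""
    r, prev = [], 0
    for i in range(len(x) - 1):
        prev = y[i] ^ x[i] ^ prev
        r.append(prev)
    return r


if __name__ == "__main__":
    x, r = [1, 0, 1], [1, 1]
    y = encode(x, r)
    print(y, decode(y), reconstruct_randomness(x, y))
```

For a fixed `x`, distinct values of `r` give distinct encodings, and `reconstruct_randomness` plays the role of the Rrc algorithm for this particular toy encoding.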
Proof. Perfect privacy says that there exists a polynomial-time simulator $S$, such that for all $x$, it should be that $\mathrm{TVD}(S(f(x)), \hat{f}(x, r)) = 0$, where TVD is the total variation distance and $r$ is sampled uniformly at random. Essentially, $S$ should always be able to sample from the set of randomized encoding values that can be decoded to $f(x)$ (i.e. all $\hat{f}(x, r)$, for all $r$).
But now suppose we have $x_1$ and $x_2$ such that $f(x_1) = f(x_2)$. By perfect privacy it must be that $\mathrm{TVD}(S(f(x_1)), \hat{f}(x_1, r_1)) = 0$ and $\mathrm{TVD}(S(f(x_2)), \hat{f}(x_2, r_2)) = 0$, for uniform $r_1$ and $r_2$. Since $f(x_1) = f(x_2)$, it must be that $\mathrm{TVD}(\hat{f}(x_1, r_1), \hat{f}(x_2, r_2)) = 0$. In other words, $\hat{f}(x_1, r_1)$ and $\hat{f}(x_2, r_2)$ are the same distribution (for random choices of $r_1$ and $r_2$) and so the randomized encodings that can be decoded to $f(x_1) = f(x_2)$ are the same for both $x_1$ and $x_2$.
Moreover, unique randomness (Theorem 2.5) ensures that there are no distinct $r_1$ and $r_1'$ such that $\hat{f}(x_1, r_1) = \hat{f}(x_1, r_1')$ (with the analogous statement holding for the $x_2$ case). Thus, for uniform $r_1$, $\hat{f}(x_1, r_1)$ is the uniform distribution over all randomized encodings which decode to $f(x_1) = f(x_2)$. As $\hat{f}(x_2, r_2)$ is the same distribution (for uniform $r_2$), it is the case that there are unique $r_1$ and $r_2$ such that $\hat{f}(x_1, r_1) = \hat{f}(x_2, r_2)$. This shows the first part of the lemma, that for every $x_1, x_2$ with $x_1 \ne x_2$ for which $f(x_1) = f(x_2)$ there exist unique $r_1$ and $r_2$ such that $\hat{f}(x_1, r_1) = \hat{f}(x_2, r_2)$.

Generic proofs of quantumness in constant quantum depth
We now have all the tools for presenting our generic compiler which can take the two proof of quantumness protocols from Subsection 2.3 and map them to equivalent protocols in which the prover's operations require only constant quantum depth and logarithmic classical depth. The idea is the following: provided the (N)TCF of the original protocol can be evaluated in log depth, simply replace it with a constant-depth randomized encoding, as follows from Theorem 2.6. In other words, $y = f(x)$ should be replaced by $\hat{y} = \hat{f}(\hat{x})$ where $\hat{x} = (x, r)$ and $r$ denotes the randomness of the encoding. As mentioned, it was shown in [GH20,KMCVY22] that the (N)TCFs of the two proofs of quantumness considered here can indeed be performed in classical logarithmic depth. Thus, to show that our construction works, we prove two things:
1. The prover can evaluate $\hat{f}$ coherently in constant quantum depth (as well as perform its remaining operations in constant depth). This is the completeness condition of the protocol shown in Subsection 3.1.
2. A randomized encoding of a (N)TCF is itself a (N)TCF. This means that the modified protocol is sound against classical polynomial-time provers. We show this in Subsection 3.2.

Completeness
To show completeness, we give a strategy for an honest prover, that interleaves constant-depth quantum circuits and log-depth classical circuits, to succeed in the proofs of quantumness described in Section 3. We assume that the (N)TCFs used in those protocols can be evaluated in constant classical depth and denote the corresponding function as $\hat{f}_k$. These circuits are allowed to contain gates of unbounded fan-out. We can always map such a circuit to one that uses only gates of bounded fan-out, provided multiple copies of the input bits are provided. The intuition for this was mentioned in the Introduction and in Figure 1. We will assume each input bit of the initial circuit has been copied $k$ times. The first step is preparing the state corresponding to a coherent evaluation of the (N)TCF over a uniform superposition of inputs:
$$\sum_{b \in \{0,1\}} \sum_{x} |b\rangle_B |x\rangle_X |\hat{f}_k(b, x)\rangle_Y \quad (19)$$
where the B and X registers store the inputs of $\hat{f}$ and the Y register will store the computed value of $\hat{f}$. As a slight abuse of notation, we omit the normalization term and assume the state is an equal superposition. Instead of preparing the state in Equation 19, we will prepare a state that is essentially equivalent to it, namely:
$$\sum_{b \in \{0,1\}} \sum_{x} |\bar{b}\rangle_B |\bar{x}\rangle_X |\hat{f}_k(b, x)\rangle_Y \quad (20)$$
where $|\bar{b}\rangle = |b\rangle^{\otimes k}$ and $|\bar{x}\rangle = |x\rangle^{\otimes k}$. We view the X register as consisting of multiple sub-registers, one for each bit in $x$. In other words$^{9}$, if $x = x_1 x_2 \ldots x_n$, with $n(\lambda) = \mathrm{poly}(\lambda)$, and $\bar{x} = \bar{x}_1 \bar{x}_2 \ldots \bar{x}_n$, we assume $X = X_1 \otimes X_2 \otimes \ldots \otimes X_n$. Here, $X_i$ holds the state $\sum_{x_i \in \{0,1\}} |\bar{x}_i\rangle$.
The prover starts by preparing:
$$\sum_{b \in \{0,1\}} \sum_{x} |\bar{b}\rangle_B |\bar{x}\rangle_X |\bar{0}\rangle_Y \quad (21)$$
Note that the B and X registers contain cat states. These can be prepared in constant quantum depth, together with logarithmic classical depth. As outlined in the introduction, the idea is to first prepare a poor man's cat state in constant depth, as described in [WKST19]. The prover then uses the parity information from the prepared poor man's cat state to perform a correction operation consisting of Pauli-X gates. Determining where to perform the X gates from the parity information requires logarithmic classical depth. The X corrections will map the poor man's cat states to cat states. Next, the function $\hat{f}$ needs to be evaluated and the outcome will be stored in the Y register. With multiple copies of the input, the circuit evaluating $\hat{f}$ consists only of gates with bounded fan-out. It can therefore be mapped to an equivalent constant-depth quantum circuit (having twice the depth, so as to perform the operations reversibly) consisting of Toffoli, Pauli-X and CNOT gates. Evaluating this circuit on the state from Equation 21 will result in the state from Equation 20, as intended.
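The classical part of the cat-state preparation can be sketched as follows. The sketch assumes a simplified model in which a poor man's cat state on $n$ qubits has the form $(|z\rangle + |\bar{z}\rangle)/\sqrt{2}$ for some string $z$, and in which the measurement outcomes reveal the parities $z_i \oplus z_{i+1}$ of neighbouring bits. Under this assumption the X corrections are the prefix xors of the parities, which can be computed in logarithmically many parallel rounds. The function names and the doubling-scan implementation are illustrative choices, not taken from [WKST19].

```python
def prefix_xor(bits):
    """Inclusive prefix xor using a doubling scan with O(log n) rounds.

    Each round is a list comprehension whose entries are independent,
    mirroring one layer of a log-depth classical circuit.
    """
    out = list(bits)
    shift = 1
    while shift < len(out):
        out = [out[i] ^ (out[i - shift] if i >= shift else 0)
               for i in range(len(out))]
        shift *= 2
    return out


def parities_of(z):
    """Parities of neighbouring bits, standing in for the measurement outcomes."""
    return [z[i] ^ z[i + 1] for i in range(len(z) - 1)]


def x_corrections(parities):
    """Positions (as a 0/1 mask) where a Pauli-X correction is applied."""
    return [0] + prefix_xor(parities)


def apply_x(z, mask):
    """Effect of the X corrections on a computational basis string."""
    return [a ^ b for a, b in zip(z, mask)]


if __name__ == "__main__":
    z = [1, 0, 0, 1]
    mask = x_corrections(parities_of(z))
    print(mask, apply_x(z, mask))
```

Since $z$ and its complement have the same neighbouring parities, the same mask maps one branch to the all-zeros string and the other to the all-ones string, which is the cat state in this model.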
The prover is then required to measure the Y register and report the outcome to the verifier. This adds one more layer to the circuit. The state in the B and X registers will then collapse to a superposition over the (repetition-encoded) preimages of the reported outcome. In the preimage test, the prover will also measure this state in the computational basis and report the outcome to the verifier.
The next steps will differ for the two protocols.
1. For the BCMVV protocol: In the equation test, the prover applies a layer of Hadamard gates on the qubits in B and X. It then measures them in the computational basis, denoting the results as $b \in \{0, 1\}^k$ and $d \in \{0, 1\}^{n \cdot k}$. In the original protocol, $b$ was one bit and $d$ was $n$ bits and they satisfy the relation $d \cdot (x_0 \oplus x_1) = b$. To arrive at that result, the prover will xor all the bits in $b$ and all bits in each $k$-bit block of $d$ and report those results to the verifier. Note that the distribution of these xor-ed outcomes is the same as the distribution over the outcomes of a Hadamard-basis measurement of:
$$\frac{1}{\sqrt{2}} \left( |0\rangle |x_0\rangle + |1\rangle |x_1\rangle \right)$$
2. For the KMCVY protocol: In the computational Bell test, the prover receives the string $v$ from the verifier. The original protocol has the prover use an ancilla qubit to store the bitwise inner product $v \cdot x_b$. However, such a multiplication requires serial CNOT gates which cannot be performed in constant depth. We therefore use a multi-qubit ancilla register initialized as a cat state $|a\rangle_A = \frac{|0\rangle^{\otimes n} + |1\rangle^{\otimes n}}{\sqrt{2}}$. For every bit $v_i$ in $v$, if $v_i = 1$, the prover applies a controlled-Z (CZ) gate with control qubit any of the qubits in $X_i$ and target the $i$-th ancilla qubit, $a_i$. In the resulting state, for each $b \in \{0, 1\}$, the branch with preimage $x_b$ has its ancilla register in the state $\frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + (-1)^{v \cdot x_b} |1\rangle^{\otimes n} \right)$. Next, the prover is required to measure X in the Hadamard basis yielding the result $d \in \{0, 1\}^{n \cdot k}$. Once again, in the original protocol $d$ is an $n$-bit string. As in the BCMVV protocol, this is "fixed" by having the prover xor each $k$-bit block of $d$ and report those outcomes to the verifier. The verifier can then use this result to determine the state in the ancilla register.

$^{9}$ Note that this is the only place where a subscript on $x$ is used to denote a bit of $x$. Throughout the rest of the section, $x_b$ will denote a specific $x$ string, and does not refer to the $b$'th bit of the string $x$.
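After the measurement, the ancilla register will be in the state $|\gamma\rangle_A \in \{|\bar{0}\rangle, |\bar{1}\rangle, |\bar{+}\rangle, |\bar{-}\rangle\}$, where $|\bar{0}\rangle = |0\rangle^{\otimes n}$, $|\bar{1}\rangle = |1\rangle^{\otimes n}$ and $|\bar{\pm}\rangle = \frac{|0\rangle^{\otimes n} \pm |1\rangle^{\otimes n}}{\sqrt{2}}$. These n-qubit states need to be reduced to the single-qubit states $|0\rangle$, $|1\rangle$ and $|\pm\rangle = \frac{|0\rangle \pm |1\rangle}{\sqrt{2}}$. To perform the reduction, the prover first measures all but one qubit of $|\gamma\rangle_A$ in the Hadamard basis. Denote this (n−1)-bit outcome as w. If the initial state was $|\bar{0}\rangle$ or $|\bar{1}\rangle$, the unmeasured qubit will be $|0\rangle$ or $|1\rangle$ respectively. If the initial state was $|\bar{\pm}\rangle$, it can be re-expressed as $|\bar{\pm}\rangle = 2^{-(n-1)/2} \sum_{w \in \{0,1\}^{n-1}} \left(H^{\otimes (n-1)} |w\rangle\right) \otimes Z^{|w| \bmod 2} |\pm\rangle$. Thus, the qubit after the measurement will be $Z^{|w| \bmod 2} |\pm\rangle$. The prover will apply the $Z^{|w| \bmod 2}$ operation to this qubit. In this way, the state $|\bar{\pm}\rangle$ is reduced to $|\pm\rangle$.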
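The reduction from an n-qubit ancilla state to a single qubit can be checked numerically. The following is a minimal state-vector sketch in plain Python, not part of the original protocol description: the helper names (`hadamard_on`, `logical_state`, `reduce_to_one_qubit`) and the choice of qubit 0 as the unmeasured qubit are assumptions made for illustration. It enumerates every Hadamard-basis outcome w on the other n−1 qubits and applies the Z correction when |w| is odd.

```python
import itertools
import math

S = 1 / math.sqrt(2)
# Target single-qubit states, written as (amplitude of |0>, amplitude of |1>).
SINGLE = {"0": (1, 0), "1": (0, 1), "+": (S, S), "-": (S, -S)}

def hadamard_on(state, qubit):
    """Apply a Hadamard gate to one qubit of a state vector (list of amplitudes)."""
    out = [0j] * len(state)
    bit = 1 << qubit
    for idx, amp in enumerate(state):
        if amp == 0:
            continue
        if idx & bit:                 # H|1> = (|0> - |1>)/sqrt(2)
            out[idx ^ bit] += S * amp
            out[idx] -= S * amp
        else:                         # H|0> = (|0> + |1>)/sqrt(2)
            out[idx] += S * amp
            out[idx | bit] += S * amp
    return out

def logical_state(n, label):
    """n-qubit state |0...0>, |1...1> or (|0...0> +/- |1...1>)/sqrt(2)."""
    state = [0j] * (1 << n)
    ones = (1 << n) - 1
    if label == "0":
        state[0] = 1
    elif label == "1":
        state[ones] = 1
    else:
        state[0] = S
        state[ones] = S if label == "+" else -S
    return state

def reduce_to_one_qubit(n, label):
    """Map each possible outcome w to the corrected state of the unmeasured qubit."""
    state = logical_state(n, label)
    for qb in range(1, n):            # Hadamard-basis measurement = H, then Z-basis
        state = hadamard_on(state, qb)
    results = {}
    for w in itertools.product((0, 1), repeat=n - 1):
        offset = sum(b << (i + 1) for i, b in enumerate(w))
        a0, a1 = state[offset], state[offset | 1]
        norm = math.sqrt(abs(a0) ** 2 + abs(a1) ** 2)
        if norm < 1e-12:
            continue                  # this outcome never occurs
        a0, a1 = a0 / norm, a1 / norm
        if sum(w) % 2 == 1:
            a1 = -a1                  # Z correction for odd-weight outcomes
        results[w] = (a0, a1)
    return results
```

For every input label and every outcome w, the corrected qubit coincides with the corresponding single-qubit state in `SINGLE`, which is the behaviour the paragraph above describes.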
Finally, the prover has to measure the qubit in the rotated basis and report the outcome. This can be done in constant depth by rotating the qubit appropriately and measuring in the standard basis. As in the original protocol, this prover will pass the verifier's checks with probability $\cos^2(\pi/8) \approx 85\%$.

Soundness
We do not need to prove soundness from scratch for our modified protocols. Instead, since our only change was to replace the (N)TCFs used in the protocols with randomized encodings, we will have the same soundness as the original constructions provided randomized encodings of (N)TCFs are still (N)TCFs. That is what we show here.
Theorem 3.1. A perfect randomized encoding of a (N)TCF, satisfying the randomness reconstruction property, is still a (N)TCF.
Proof. We show this result for NTCFs specifically, since the TCF case is subsumed. The idea of the proof is to show that every property of a NTCF is also satisfied by its randomized encoding.
1. Efficient Function Generation. By definition, randomized encodings can be efficiently generated given a description of the function to be encoded. In this case, the description is given by the public key produced by the PPT algorithm $\mathrm{GEN}_{\mathcal{F}}$. More precisely, $\mathrm{GEN}_{\mathcal{F}}$ generates the key $k \in \mathcal{K}_{\mathcal{F}}$ together with a trapdoor $t_k$. The generating procedure for the encoding will run $\mathrm{GEN}_{\mathcal{F}}$ and output k, the efficient circuit for generating a randomized encoding and the trapdoor $t_k$.

2. Trapdoor Injective Pair.
(a) Trapdoor: Due to perfect correctness, for any $x_0 \neq x_1$ and any $r_0, r_1$ we have $\mathrm{Supp}(\hat{f}_{k,b}(x_0, r_0)) \cap \mathrm{Supp}(\hat{f}_{k,b}(x_1, r_1)) = \emptyset$. Indeed, if this intersection were non-empty, then perfect correctness leads to $\mathrm{Supp}(f_{k,b}(x_0)) \cap \mathrm{Supp}(f_{k,b}(x_1)) \neq \emptyset$, which violates the trapdoor injective pair property of the original function f. The efficient deterministic algorithm for inverting the randomized encoding also exists and is defined as the composition of the decoding operation for the encoding, the original $\mathrm{Inv}_{\mathcal{F}}$ procedure of the NTCF and the randomness reconstruction procedure (see Lemma 2.2).
(b) Injective pair: Let $\hat{R}_k$ be the set of all tuples of the form $((x_0, r_0), (x_1, r_1))$ such that $\hat{f}_{k,0}(x_0, r_0) = \hat{f}_{k,1}(x_1, r_1)$. Additionally, let $\hat{\bar{X}}_k \subseteq \hat{X}_k$ be the set of values (x, r) which appear in the elements of $\hat{R}_k$. It is the case that every $(x, r) \in \hat{\bar{X}}_k$ appears in exactly one element of $\hat{R}_k$. This is because, using the collision-preservation property (Lemma 2.3), it must be that $\hat{f}_{k,0}(x_0, r_0) = \hat{f}_{k,1}(x_1, r_1)$ only if $f_{k,0}(x_0) = f_{k,1}(x_1)$ and only for unique $r_0$ and $r_1$. We also know from the injective pair property of $f_{k,b}$ that every x appears in exactly one tuple defining a collision for $f_{k,b}$.
Also note that $|\hat{X}_k| = 2^m |X_k|$, where |r| = m. In other words, the set of possible inputs for $\hat{f}_{k,b}$ is $2^m$ times larger than that of $f_{k,b}$, as for every input x we also have the m-bit string r. The collision-preservation property (Lemma 2.3) also ensures that $|\hat{\bar{X}}_k| = 2^m |\bar{X}_k|$. Since we know that $\lim_{\lambda \to \infty} |\bar{X}_k| / |X_k| = 1$, it also follows that $\lim_{\lambda \to \infty} |\hat{\bar{X}}_k| / |\hat{X}_k| = 1$.
3. Efficient Range Superposition. The efficient range superposition property of the original function f means there is an efficient quantum procedure to create a state approximating a superposition over the range of f. Assume we add an additional register, R, to represent the randomness of the encoding $\hat{f}$, and initialize it as a uniform superposition over computational basis states. We can now combine the efficient procedure for generating $\hat{f}$ with the procedure for generating the range superposition of f and apply them coherently on R. This will then yield the desired state $\sum_{x,r,y} \sqrt{(\hat{f}_{k,b}(x, r))(y)}\, |x\rangle |r\rangle |y\rangle$, suitably normalized.
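4. Adaptive Hardcore Bit. We prove this property by contradiction. Assume there exists a QPT adversary $\hat{A}$ that breaks the adaptive hardcore bit property for the randomized encoding. This means that the advantage of $\hat{A}$ in the adaptive hardcore bit game for $\hat{f}$ is at least some non-negligible function p(λ). Note that the output of $\hat{A}$ is a tuple $(b, \hat{x}_b, \hat{d}, \hat{d} \cdot (\hat{x}_0 \oplus \hat{x}_1))$, where $\hat{x}_b = (x_b, r_b)$ and $\hat{d} = (d_x, d_r)$. One can now define a new QPT adversary A which runs $\hat{A}$ and then outputs $(b, x_b, d_x, \hat{d} \cdot (\hat{x}_0 \oplus \hat{x}_1) \oplus (d_r \cdot (r_0 \oplus r_1)))$. Since $\hat{d} \cdot (\hat{x}_0 \oplus \hat{x}_1) = d_x \cdot (x_0 \oplus x_1) \oplus d_r \cdot (r_0 \oplus r_1)$, the last entry equals $d_x \cdot (x_0 \oplus x_1)$ whenever $\hat{A}$ is correct, so A succeeds with the same advantage. Hence, the adaptive hardcore bit of the original NTCF family is violated. We conclude that the randomized encoding must also satisfy the adaptive hardcore bit property.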
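The bookkeeping behind the reduction in the adaptive hardcore bit argument is a single XOR identity on bit strings. The sketch below checks it on random inputs; it assumes, purely for illustration, that the encoded preimage is the concatenation of x and r and that the string d splits accordingly into an x-part and an r-part. The function names are hypothetical.

```python
import random

def dot(d, x):
    """Inner product of two bit lists, modulo 2."""
    return sum(a & b for a, b in zip(d, x)) % 2

def xor(u, v):
    """Bitwise xor of two bit lists of equal length."""
    return [a ^ b for a, b in zip(u, v)]

def reduce_guess(d_hat, guess_hat, r0, r1, n):
    """Turn a guess for d_hat.(xhat0 ^ xhat1) into a guess for d_x.(x0 ^ x1).

    d_hat is assumed to be laid out as d_x (first n bits) followed by d_r.
    """
    d_x, d_r = d_hat[:n], d_hat[n:]
    return d_x, guess_hat ^ dot(d_r, xor(r0, r1))

def random_bits(length, rng):
    return [rng.randrange(2) for _ in range(length)]
```

If the guess produced for the encoded function is correct, the corrected bit returned by `reduce_guess` is a correct guess for the original function, which is all the reduction needs.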

Resource estimation
In this section, we give some estimates of the resources required to run our modified protocols. We summarize this information in Table 1 and proceed to explain the results. The functions listed in the table are the same as the ones from [KMCVY22], as these are the existing candidate TCFs used in proof of quantumness protocols.  means hardcore bit. The number of quantum-classical interleavings refers to the instances where the prover performs a constant-depth quantum circuit followed by a classical computation. This is done, for instance, in the preparation of cat states as well as when it responds to one of the verifier's challenges. Depth refers to the total number of layers of quantum gates that the prover has to perform. Width refers to the width of the quantum circuits the prover has to implement. Here, λ denotes the security parameter and l is the size of the branching program implementing the randomized encoding, as described in Appendix A.

Quantum depth and quantum-classical interleavings
In this subsection we explain the overall quantum depth that the prover has to perform in our modified proofs of quantumness. Depth here represents the number of layers of quantum gates or measurements (as described in Section 2) that the prover will perform throughout the protocol, in the worst case. As mentioned, the prover's operations consist of alternating between constant-depth quantum circuits and log-depth classical computation. We refer to each such alternation as a quantum-classical interleaving. For the NTCF-based protocol which uses LWE, the total quantum depth is 14 and 3 quantum-classical interleavings are performed, whereas for the TCF-based approaches the depth is 17 and the number of interleavings is 4. Let us explain where these numbers come from:
1. Preparation of cat states. As mentioned, we prepare cat states by interleaving a constant-depth quantum circuit with a log-depth classical computation, followed by another quantum circuit. The exact steps are outlined in [WKST19], while here we just summarize the gates performed in each step. The procedure starts with a layer of Hadamard gates followed by two layers of CNOT gates. Some of the qubits are then measured in the computational basis. The remaining qubits will collapse to a poor man's cat state, while the measured qubits contain the parity information for that state. To "correct" the state to a cat state, the parity information is used to compute a Pauli-X correction. This is one quantum-classical interleaving. The final quantum layer consists of Pauli-X gates. Thus, the total depth will be 5 and we have 1 quantum-classical interleaving. This applies to all cat states, as they can be prepared in parallel.
2. Evaluation of the randomized encoded function. As can be seen in Figure 8, the classical circuit for a randomized encoding has depth 3. In the quantum case, the AND gates are implemented by Toffoli gates and the XOR gate is a CNOT. As the quantum gates are reversible, one needs to uncompute any auxiliary results and so the quantum depth will be double that of the classical circuit. Hence, for this step the quantum depth is 6 and there are no quantum-classical interleavings.
3. Measurement of the Y register. Measuring the image register requires a layer of computational basis measurements and so the depth is 1. The results are read out and sent to the verifier, which we count as 1 quantum-classical interleaving.

4. Preimage test or equation/Bell test.
If a preimage test is performed, the prover only needs to measure the X register in the computational basis and report the result. This counts as depth 1 and 1 interleaving. In the NTCF protocol, if an equation test is performed, then the prover is expected to apply a layer of Hadamard gates to the X register and measure them. This counts as depth 2 and 1 interleaving. In the TCF protocol, when the computational Bell test is performed, the prover's operations (as outlined in Subsection 3.1) will consist of a layer of CZ gates, a layer of Hadamard gates together with a computational basis measurement, a classical computation and reporting the results to the verifier, a Pauli-Z operation, a rotation gate and finally another measurement and reporting the results to the verifier. This counts as depth 6 and 2 interleavings.
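The per-step figures quoted above for the NTCF-based (LWE) protocol can be tallied mechanically. The following sketch only re-adds the numbers stated in the text for the worst case (equation test); the dictionary keys are descriptive labels chosen here, not names from the paper.

```python
# Layers used to prepare a cat state, as listed in step 1.
CAT_STATE_LAYERS = {"hadamard": 1, "cnot": 2, "measurement": 1, "pauli_x": 1}

# (quantum depth, quantum-classical interleavings) for each step of the
# NTCF-based protocol, taking the equation test as the worst case.
NTCF_STEPS = {
    "cat_state_preparation": (sum(CAT_STATE_LAYERS.values()), 1),
    "randomized_encoding_evaluation": (2 * 3, 0),  # classical depth 3, doubled
    "measure_Y_register": (1, 1),
    "equation_test": (2, 1),
}

def totals(steps):
    """Sum depth and interleavings over all steps."""
    depth = sum(d for d, _ in steps.values())
    interleavings = sum(i for _, i in steps.values())
    return depth, interleavings
```

Calling `totals(NTCF_STEPS)` adds up to the depth of 14 and the 3 interleavings stated for the NTCF-based protocol.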

Circuit width
The constant-depth versions of the proof of quantumness protocols require larger numbers of qubits than the original version. As explained, most of this is due to the use of cat states, which effectively copy the input and allow us to apply a constant-depth circuit with bounded fan-out gates. That circuit is a randomized encoding of the original TCF. Following the construction of randomized encodings from [AIK04] and described in Appendix A, the width of the constant-depth circuit will depend on the size of the branching program used to evaluate the original function. In Appendix A we explain how, as a result of Barrington's theorem, the size of this branching program is exponential in the depth of the original TCF. As all TCFs considered here can be evaluated in logarithmic depth, the resulting branching programs will have sizes polynomial in the security parameter λ. Giving a precise account of the size of the branching program, as a function of λ, for each TCF, is beyond the scope of this paper. Instead, we find in Appendix A that the overall circuit width for the prover's quantum circuit is $O(\lambda l^4)$, where l is the size of the branching program used to evaluate the TCF. The λ factor comes from having to repeat the branching program construction in parallel O(λ) times. This is because one branching program computes a single output bit of the TCF and so one has to consider a different branching program (of the same size) for each output bit.
As a rough estimate, we can relate the width to the security parameter for the LWE-based NTCF of [Mah18, BCM + 18]. There we know from [GH20] that the functions can be evaluated in depth $\propto 4 \log \lambda$. From Barrington's theorem, the size l of the corresponding branching program is on the order of $\lambda^8$. As the width is $O(\lambda l^4)$, we find that the prover requires $O(\lambda^{33})$ qubits. This is a discouraging result for the purposes of implementing these protocols on near-term devices. However, it should be noted that this was merely a rough calculation based on existing asymptotic estimates. We conjecture that these estimates are not optimal and can be improved with a tighter analysis, better circuit implementations and more compact branching programs. Additionally, for a fixed-size implementation (say λ = 50), it is likely that additional optimizations are possible that could further reduce the number of required qubits.
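The exponent arithmetic behind this estimate can be reproduced in a few lines. The sketch assumes that Barrington's theorem is applied in its usual form (a depth-d circuit yields a branching program of length $4^d$) and that the logarithm in the depth is base 2; hidden constants are ignored, so the numbers are orders of magnitude only. The function names are illustrative.

```python
import math

def branching_program_exponent(depth_coeff=4.0, base=4):
    """Exponent a such that l = base**(depth_coeff * log2(lam)) = lam**a."""
    return depth_coeff * math.log2(base)

def width_exponent(depth_coeff=4.0, base=4, l_power=4):
    """Exponent of lam in the width bound lam * l**l_power."""
    return 1 + l_power * branching_program_exponent(depth_coeff, base)

def qubit_estimate(lam, **kwargs):
    """Order-of-magnitude qubit count lam**width_exponent (constants ignored)."""
    return lam ** width_exponent(**kwargs)
```

With the default arguments the exponents come out as 8 and 33, and `qubit_estimate(50)` is of order $10^{56}$, which illustrates why the generic construction is far from near-term feasibility without further optimization.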

Proofs of quantumness via phase encoding
The first construction based on randomized encoding is a generic method that works for all types of (N)TCFs. However, as mentioned, its naive implementation based on Barrington's theorem leads to circuits which are too wide to be implemented on near-term devices.
In this section, we propose another approach that can be implemented on much narrower circuits, thus bringing it closer to implementation on near-term devices. This construction relies on phase encodings to evaluate a specific NTCF, based on the LWR problem that is defined in Subsection 2.2. As we will see, the resulting circuits also involve only constant quantum depth and logarithmic classical depth.
Before presenting the protocol, we first define the LWR-based NTCF, denoted as f , and introduce its phase encoded implementation.

LWR-based NTCF
The LWR-based NTCF was suggested in [BCM + 18] but not used. It is however used in [ZKML + 21], but without the phase encoding. The specific NTCF we consider is the following:
Definition 4.1 (LWR-based NTCF). Let λ > 0 be a security parameter. We take n(λ), m(λ), q(λ), p(λ) as functions of λ subject to the following constraints: $n = O(\lambda)$, $q = 2^{O(n)}$ is prime, $m = \Omega(n \log q)$, and $p = O(\sqrt{mn \log q})$ is a power of 2. Additionally χ will denote a discrete Gaussian distribution over $\mathbb{Z}_q$ having width $O(q/p^5)$. Taking $A \leftarrow_r \mathbb{Z}_q^{m \times n}$, $s \leftarrow_r \{0,1\}^n$, $e \leftarrow_{\chi^m} \mathbb{Z}_q^m$ (so that $\|e\|_\infty = O(q/p^5)$), we define the function $f(b, x) = \lfloor Ax + b \cdot (As + e) \rceil_p$, for $b \in \{0,1\}$ and $x \in \mathbb{Z}_q^n$.
For the specific constants in the parameters defined above, we use the same values as in [BCM + 18]. It should be noted that the width of the error distribution is taken to be polynomially smaller than in [BCM + 18] ($O(q/p^5)$ versus $O(q/p)$). But since the width is still superpolynomial (in n) we are still in the "hardness regime" where both LWE and LWR are intractable. For more details, we refer the reader to the Preliminaries of [BCM + 18]. The reason for this choice will become apparent in Subsection 4.3.1.
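To make the definition concrete, here is a toy instantiation in plain Python. It is a sketch under several assumptions that are not taken from the paper: the rounding is implemented as `floor(p*v/q)`, the error is drawn uniformly from a small interval instead of a discrete Gaussian, and the parameters are far too small to be secure. The names `round_p`, `keygen` and `lwr_f` are hypothetical.

```python
import random

def round_p(v, q, p):
    """Coarse-grain a value of Z_q to Z_p (assumed convention: floor(p*v/q))."""
    return (p * (v % q)) // q

def mat_vec(A, x, q):
    """Matrix-vector product modulo q."""
    return [sum(a * b for a, b in zip(row, x)) % q for row in A]

def keygen(n, m, q, err, rng):
    """Sample A, a binary secret s, a small error e, and u = A s + e (mod q)."""
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    s = [rng.randrange(2) for _ in range(n)]
    e = [rng.randint(-err, err) for _ in range(m)]  # stand-in for the Gaussian
    u = [(a + ei) % q for a, ei in zip(mat_vec(A, s, q), e)]
    return A, u, s, e

def lwr_f(A, u, b, x, q, p):
    """f(b, x) = round_p(A x + b * u), evaluated component-wise."""
    Ax = mat_vec(A, x, q)
    return [round_p(ax + b * ui, q, p) for ax, ui in zip(Ax, u)]

def claw_partner(x0, s, q):
    """The second preimage (1, x0 - s) matching (0, x0)."""
    return [(a - b) % q for a, b in zip(x0, s)]
```

For a pair `x0`, `claw_partner(x0, s, q)` the two evaluations `lwr_f(A, u, 0, x0, q, p)` and `lwr_f(A, u, 1, x1, q, p)` agree unless some component of A·x0 lies within the error of a rounding boundary, which is why the definition keeps the error small relative to q/p.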
Although we are referring to f as an NTCF, we have not yet shown that this is indeed the case. Following the definition from Subsection 2.3, we next show that all the properties are satisfied. As f(b, x) uses the same LWE instance as the LWE-based NTCF of [BCM + 18], we will have the same Gen, which immediately proves the efficient function generation property. Additionally, Lemma 2.1 confirms that the $(k, t_k)$ pair sampled by Gen is also the key and trapdoor pair for the LWR-based function (for this reason we will sometimes write the function as $f_k$). We can also see that if (0, x) is the preimage of y = f(0, x), the other preimage is (1, x − s). The trapdoor injective pair property then follows. The efficient evaluation property comes from the fact that mod-q matrix multiplication and additions can be efficiently performed by polynomial-depth quantum circuits. In fact, the rest of this section is devoted to showing an efficient evaluation in constant quantum depth using the phase encoding construction.
We are left with showing the adaptive hardcore bit property. As a first step, we show (Lemma 4.1) that f has the same collisions as the LWE-based NTCF of [BCM + 18], with high probability.
Proof. Consider $h(b, x) := Ax + b \cdot (As + e) + e'$, where h is the LWE-based NTCF used in [BCM + 18] and both functions are based on the same LWE sample As + e. The statement we would like to show is then that $f(0, x_0) = f(1, x_1)$ if and only if $h(0, x_0) = h(1, x_1)$, with high probability over the choices of A, s, and e. We can prove it by showing both implications.
First, we show that $f(0, x_0) = f(1, x_1)$ implies $h(0, x_0) = h(1, x_1)$ with high probability. In [BCM + 18], it was shown that $h(0, x_0) = h(1, x_1)$ if and only if $x_1 = x_0 - s$, with high probability. Now take $x_1 = x_0 - s + w$ for some non-zero $w \in \mathbb{Z}_q^n$. We know that Aw is a uniformly random vector (over the random choice of A) and therefore every bit of $f(1, x_1)$ has a probability of $\frac{1}{2}$ to be flipped with respect to $f(0, x_0)$. Thus, the probability of $f(0, x_0) = f(1, x_1)$ can be bounded by the additive Chernoff inequality, which is negligible.
Conversely, suppose $h(0, x_0) = h(1, x_1)$, which immediately leads to $x_1 = x_0 - s$, with high probability. We then have $f(0, x_0) = \lfloor Ax_0 \rceil_p$ and $f(1, x_1) = \lfloor Ax_0 + e \rceil_p$. As we have $\|e\|_\infty = O(q/p^5)$, the two rounded values coincide with high probability.
Now we have all the ingredients for the proof of the adaptive hardcore bit property.
Theorem 4.1. The LWR-based NTCFs $(f_k(b, x))$ have the adaptive hardcore bit property.
Proof. We present a proof by contradiction. Suppose $f_k(b, x) = \lfloor Ax + b(As + e) \rceil_p$ is an LWR-based NTCF where k is the key and $t_k$ is the trapdoor, both generated by Gen. Assume there exists a QPT adversary $\hat{A}$ that breaks the adaptive hardcore bit property of f. This means that there exists a non-negligible function κ(m) that lower-bounds the advantage of $\hat{A}$ in predicting the bit $d \cdot (x_0 \oplus x_1)$. We can then consider the LWE-based NTCF $h_k(b, x) := Ax + b \cdot (As + e) + e'$, whose corresponding sets are denoted by $H_k$, $\bar{H}_k$, and $R_k$ (the hatted versions being those of f). As is shown in Lemma 4.1, we have $R_k = \hat{R}_k$, with overwhelming probability, hence $H_k = \hat{H}_k$ and $\bar{H}_k = \hat{\bar{H}}_k$. Therefore, we can define the QPT adversary $A := \hat{A}$. It achieves the same advantage against $h_k$, which breaks the adaptive hardcore bit property of LWE-based NTCFs.
This implies that f (b, x) satisfies all requirements of an NTCF.

Prime q
As mentioned in Definition 4.1, we require q to be a prime. This is, in fact, also a requirement in [BCM + 18]. The reason for this is that some of the properties of these NTCF-based constructions hold only when Z q is a finite field, rather than a finite ring. Normally, this would just be a minor technical point. However, in our case since we would like to perform the prover's operations in constant depth, we would need to provide a procedure that allows the prover to prepare equal superpositions over the field elements. In other words, the prover needs to create an equal superposition of a prime number of elements. While this can be done in constant quantum depth, using cat states and ideas from [HŠ05], we will find that this is not necessary, provided q is sufficiently large and sufficiently close to a power of 2. In this section, we show that these conditions can indeed be satisfied and it is possible to efficiently choose a prime q that is close to a power of 2.
We start with a result from [Dus98]:
Lemma 4.2 ([Dus98]). For $q' > 3275$, there exists a prime q in the interval $q' < q < \left(1 + \frac{1}{2 \ln^2 q'}\right) q'$.
This implies that the ratio of q and $q' = 2^n$ is bounded by $\frac{q}{q'} < 1 + \frac{1}{2 \ln^2 q'} = 1 + \frac{1}{2 (\ln 2)^2 n^2}$. Moreover, a specific prime in between $q' = 2^n$ and $\left(1 + \frac{1}{2 \ln^2 q'}\right) q'$ can be efficiently found. It suffices to sample random integers in the range and check if they are prime. The checking can be done by (for instance) the Miller-Rabin algorithm [Rab80], in polynomial time. We can show that the number of samples to check is O(n) using the Prime number theorem, which states that, if π(N) is the prime counting function, for integers in the range (0, N), then it is the case that $\pi(N) \sim \frac{N}{\log N}$. Thus, the number of primes in the desired range can be estimated by $\pi\left(\left(1 + \frac{1}{2 \ln^2 q'}\right) q'\right) \sim \frac{2^n \left(1 + \frac{1}{2 (\ln 2)^2 n^2}\right)}{n + \log\left(1 + \frac{1}{2 (\ln 2)^2 n^2}\right)} \sim 2^n \left(\frac{1}{n} + \frac{1}{2 (\ln 2)^2 n^3} + O(n^{-4})\right)$ and $\pi\left(\left(1 + \frac{1}{2 \ln^2 q'}\right) q'\right) - \pi(q') = \frac{2^n}{2 (\ln 2)^2 n^3} + O(2^n n^{-4})$.
Therefore, the density of primes in the range, obtained by dividing this count by the length $\frac{2^n}{2 (\ln 2)^2 n^2}$ of the range, is $\frac{1}{n} + O(n^{-2})$, which immediately implies that a prime can be found with an expected number of O(n) random samples. All of this is incorporated in the Gen procedure as that is responsible for choosing a suitable q. As will also be mentioned later, since q is close to a power of 2, when the prover has to create an equal superposition over the elements of $\mathbb{Z}_q$ it will instead create the superposition over elements up to $q'$, the nearest power of 2. The resulting state will be sufficiently close in trace distance that we only incur a 1/poly(n) penalty in completeness for making this replacement.
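The sampling procedure for q can be sketched as follows. This is an illustrative implementation, not the Gen procedure itself: the function names are hypothetical, the interval length is rounded down so that candidates stay inside the interval of the lemma above, and Miller-Rabin is run with random bases, so the output is prime only with overwhelming probability.

```python
import math
import random

def is_probable_prime(n, rounds=40, rng=random):
    """Miller-Rabin primality test with `rounds` random bases."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, r = n - 1, 0
    while d % 2 == 0:            # write n - 1 = 2**r * d with d odd
        d //= 2
        r += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False         # a witnesses that n is composite
    return True

def sample_prime_above_power_of_two(n_bits, rng=random):
    """Sample a prime q with 2**n_bits < q <= 2**n_bits * (1 + 1/(2 ln^2 2**n_bits))."""
    q_prime = 1 << n_bits
    if q_prime <= 3275:
        raise ValueError("the interval bound is only stated for q' > 3275")
    width = q_prime // math.ceil(2 * math.log(q_prime) ** 2)
    tries = 0
    while True:
        tries += 1
        candidate = q_prime + rng.randrange(1, width + 1)
        if is_probable_prime(candidate, rng=rng):
            return candidate, tries
```

The returned `tries` counter gives a feel for the O(n) expected number of samples: for `n_bits = 32` one typically needs a few dozen candidates.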

Phase encoding
The concept of phase encoding was described in Section 2. In this section we will look at several properties of the phase encoding for the LWR-based NTCF (Definition 4.1). We aim to show how to evaluate $g(b, x) = Ax + b \cdot (As + e)$ in phase, and show that measuring the resulting state in the Hadamard basis will reveal the value of $f(b, x) = \lfloor g(b, x) \rceil_p$, with high probability. It is natural to start by considering the phase encoding of g(b, x) for a specific (b, x). Note that $x \in \mathbb{Z}_q^n$ and $g(b, x) \in \mathbb{Z}_q^m$, both being vectors. We write $g_i$ for the i'th component of g(b, x). The phase encoded state that we would like the prover to prepare (for each b and x), denoted $|\phi(b, x)\rangle$, consists of m states $|\phi_i\rangle$, one for each component $g_i$, with $|\phi_i\rangle$ carrying the phase $\phi_i = \frac{2\pi g_i}{q} - \frac{\pi}{2}$. For the majority of this section, we will focus on the case p = 2. That is, we assume that f(b, x) simply takes the most significant bit of each component of g(b, x). This, of course, is not the NTCF we defined since there we had that $p = O(\sqrt{mn \log q})$. We will address the case of general p in Subsection 4.2.3.
For p = 2, we denote the output of $f(b, x) = \lfloor g(b, x) \rceil_2$ by y, a binary string of length m. Before explaining how to prepare the phase encoded state in constant depth, let us first investigate how to decode y = f(b, x) from $|\phi(b, x)\rangle$ with high probability.

Decoding by measurements
The phase encoding can be probabilistically decoded through Hadamard measurements. Denote the process of measuring the XX...X observable on the state in Equation 25 by M and the measurement outcomes of all m phase encoded states by z ∈ {0, 1} m . One can then write z ← M (|φ(b, x) ). It should be clear that z = y indicates that the decoding was completely successful.
Let us consider the case of a single component in the encoding, namely $|\phi_i\rangle$. In order to investigate the possible values of $z_i = M(|\phi_i\rangle)$, $|\phi_i\rangle$ can be rewritten in the Hadamard basis. If the qubit is measured in the Hadamard basis, we can express the outcome probabilities as $\Pr_M(\pm | \phi_i) = \frac{1}{2}(1 \pm \cos \phi_i)$, with $\phi_i = \frac{2\pi g_i}{q} - \frac{\pi}{2}$. Note that $g_i < q/2$ is equivalent to $y_i = \lfloor g_i \rceil_2 = 0$. Additionally, $g_i < q/2$ leads to $\cos \phi_i > 0$. Therefore the probability of getting + is larger than that of −. If we map + to 0 and − to 1, it is clear that the Hadamard measurement is essentially a probabilistic decoding of $y_i$ from $\phi_i$, with success probability always greater than $\frac{1}{2}$. More compactly, we can write the probability of measuring any $z_i$ from $|\phi_i\rangle$ as $\Pr_M(z_i | \phi_i) = \frac{1}{2}(1 + (-1)^{z_i} \cos \phi_i)$. Furthermore, the probability of successfully decoding $\phi_i$ (i.e. $z_i = y_i$) is denoted by $p_{cor}(\phi_i) := \Pr(z_i = y_i)$, where $\Pr(z_i = y_i) = \Pr_M(+ | \phi_i) = \frac{1}{2}(1 + \cos \phi_i)$ if $y_i = 0$ and $\Pr(z_i = y_i) = \Pr_M(- | \phi_i) = \frac{1}{2}(1 - \cos \phi_i)$ if $y_i = 1$. Similarly, the probability of unsuccessful decoding is $1 - p_{cor}(\phi_i)$. We can now evaluate the expected values of these probabilities over the uniform choice of the matrix A and show the following:
Lemma 4.3. Over the choice of matrix A, the average probability of successful decoding of any $|\phi_i\rangle$ is $\frac{1}{2} + \frac{1}{\pi} \approx 0.82$.
Proof. To clarify, there are two sources of randomness here. On the one hand we have the randomness of the measurement and on the other hand we have the random choice of the matrix A. We are interested in the expected probability of a successful (as well as an unsuccessful) decoding over the choice of A. As $g(b, x) = Ax + b \cdot (As + e)$, we can see that if A is uniform (over a finite field), then g(b, x) will also be uniform (for any non-zero b and x). Hence, $\Pr(\phi_i) = \Pr(g_i) = \frac{1}{q}$ for all $\phi_i \in \{-\frac{\pi}{2}, \frac{2\pi}{q} - \frac{\pi}{2}, \ldots, \frac{3\pi}{2}\}$. Using the symmetry between the cases $y_i = 0$ and $y_i = 1$, the expected probability of a correct decoding is then $\bar{p}_{cor} = \frac{1}{q} \sum_{g_i \in \mathbb{Z}_q} p_{cor}(\phi_i) = \sum_{g_i = 0}^{q/2 - 1} \frac{1}{q}\left(1 + \cos\left(\frac{2\pi g_i}{q} - \frac{\pi}{2}\right)\right) =: S$, which we can view as a Riemann sum.
For large q, the summation converges to an integral $I = \int_0^{q/2} \frac{1}{q}\left(1 + \cos\left(\frac{2\pi g_i}{q} - \frac{\pi}{2}\right)\right) dg_i$. By the change of variable $\phi_i = \frac{2\pi g_i}{q} - \frac{\pi}{2}$, this becomes $I = \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} (1 + \cos \phi_i)\, d\phi_i = \frac{1}{2} + \frac{1}{\pi}$. We also have the expected probability of an incorrect decoding, $1 - \bar{p}_{cor} = \frac{1}{2} - \frac{1}{\pi} \approx 0.18$. The approximation S → I comes with an error which we can bound. Such an error for an (l + 1)-order differentiable integrand χ can be determined with the Euler-Maclaurin formula $S - I = \sum_{k=1}^{l} \frac{B_k}{k!}\left(\chi^{(k-1)}\left(\frac{q}{2} - 1\right) - \chi^{(k-1)}(0)\right) + R_l$, where $B_k$ is the k-th Bernoulli number, $R_l = o(q^{-l})$ is the remainder term, and $\chi(g_i) = \frac{1}{q}\left(1 + \cos\left(\frac{2\pi g_i}{q} - \frac{\pi}{2}\right)\right)$ is the integrand. We can see that $\chi^{(k-1)}(\frac{q}{2} - 1) - \chi^{(k-1)}(0) = 0$ for odd k. Therefore, only the even-k terms contribute and the error is $O(q^{-2})$, i.e. inverse in the square of the field size.
As g(b, x) is uniform (over the random choice of A and whenever $(x, b) \neq (0, 0)$), each of its components will be a uniform value in $\mathbb{Z}_q$. Thus, we can view the measurement of each component of $|\phi(b, x)\rangle$ to be an independent and identically distributed random variable. As the expected probability of a correct decoding is 0.82, it follows from a Chernoff bound that 0.82m values will be decoded correctly, with overwhelming probability over the choice of A. While this means that most values are correctly decoded, we, in fact, need all values to be decoded correctly with high probability. To achieve this, we use a classical repetition code and repeat each output component several times in order to take a majority vote.
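The average decoding probability can be checked numerically by averaging the single-shot success probability over all values of a uniform component. The sketch below uses the outcome probabilities $\frac{1}{2}(1 \pm \cos \phi_i)$ from the text; the function names and the choice of modulus used in a check are illustrative.

```python
import math

def p_correct(g, q):
    """Probability that a Hadamard measurement decodes the top bit of g in Z_q."""
    phi = 2 * math.pi * g / q - math.pi / 2
    y = 0 if g < q / 2 else 1                  # most significant bit of g
    if y == 0:
        return 0.5 * (1 + math.cos(phi))       # outcome '+' is correct
    return 0.5 * (1 - math.cos(phi))           # outcome '-' is correct

def average_success(q):
    """Average of p_correct over a uniformly random g in Z_q."""
    return sum(p_correct(g, q) for g in range(q)) / q
```

For a moderately large prime such as q = 10007, `average_success(q)` agrees with $\frac{1}{2} + \frac{1}{\pi}$ to several decimal places, in line with the small Riemann-sum error discussed above.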

Decodability and repetition code (p = 2)
Instead of the prover having to prepare $|\phi(b, x)\rangle$ (for each b and x), we will instead ask it to prepare v copies of every component, $\bigotimes_{i=1}^{m} |\phi_i\rangle^{\otimes v}$, where v represents the number of repetitions. In this case, to decode the value of the i'th component, one measures all v copies of that component and uses the majority outcome as the value $z_i$. We say that one component, for instance the i'th component, has been correctly decoded, if $z_i = y_i$, where recall that $y_i$ is the most significant bit of $g_i(b, x)$. By analogy, we say that the whole state has been correctly decoded if all of its components were (i.e. z = y). Our goal is to find the relation between v and m such that z = y with sufficiently high probability (say, 99%) for most states $|\phi(b, x)\rangle$ (say, 99% of all such states). In doing so, we show the following:
Theorem 4.2. At least 99% of all $|\phi(b, x)\rangle$ states can be correctly decoded with probability 99%, whenever $v = \Omega(m^2 \log m)$.
Proof. Without loss of generality, we focus on the case of $g_i < \frac{q}{2}$, that is $y_i = 0$. Recall that $p_{cor}(\phi_i) = \frac{1}{2}(1 + \cos \phi_i)$ in this case. It should be clear that for the very special case $g_i = 0$, the probability of having the correct measurement outcome is $\frac{1}{2}$. In this case, it is impossible to tell if $z_i$ should be 0 or 1 even with repetition, because no matter how large v is, there will always be an equal number of correctly and incorrectly decoded bits, on average. Therefore, any component $g_i$ that is extremely close to 0 or $\frac{q}{2}$, so that $p_{cor}(\phi_i)$ is close to $\frac{1}{2}$, would make the whole $|\phi(b, x)\rangle$ state undecodable 11. To be more explicit, we will consider $|\phi_i\rangle$ to be undecodable whenever we have that either $|g_i| < \frac{q}{cm}$ or $|g_i - q/2| < \frac{q}{cm}$, for a constant c > 0 to be determined later. But as noted before, for a uniform A, each $g_i$ (excluding the case g(0, 0)) is also uniform in $\mathbb{Z}_q$. It follows that the probability that $g_i$ leads to an undecodable $|\phi_i\rangle$ is at most $\frac{1}{q} \cdot \frac{4q}{cm} = \frac{4}{cm}$, over the choice of A. From a union bound, we then also have that the probability of $|\phi(b, x)\rangle$ being undecodable (i.e. at least one of its components is undecodable) is at most $m \cdot \frac{4}{cm} = \frac{4}{c}$. This means that at least a fraction $1 - \frac{4}{c}$ of all $|\phi(b, x)\rangle$ states are, in fact, decodable. That is, all of their components are at least $\frac{q}{cm}$ away from the undecodability boundary. By taking c = 400, we have that 99% of $|\phi(b, x)\rangle$ are decodable. Without loss of generality, let us now consider a state that is barely decodable, with say $g_i = \frac{q}{cm}$. The probability of correctly decoding the corresponding $|\phi_i\rangle$ state will be $p_{cor}(\phi_i) = \frac{1}{2}\left(1 + \cos\left(\frac{2\pi}{cm} - \frac{\pi}{2}\right)\right) = \frac{1}{2}\left(1 + \sin\frac{2\pi}{cm}\right) = \frac{1}{2} + \frac{1}{O(m)}$. (40)
11 In fact, even if we ignore the cases where $p_{cor}(\phi_i) = \frac{1}{2}$, it is still required to have v = O(q) to distinguish between $\phi_i = \frac{2\pi}{q} - \frac{\pi}{2}$ and $\phi_i = -\frac{2\pi}{q} + \frac{3\pi}{2}$.
The state is biased away from 1/2 by 1/O(m). From an application of the Chernoff-Hoeffding bound 12 it follows that repeating the measurement $\Omega(m^2)$ times and taking a majority vote is enough to ensure that the value is correctly decoded with constant probability (say 99%). Of course, we want all m values to be correctly decoded, which means that we should take the number of repetitions v so that the probability of correctly decoding one value is at least 1 − 1/O(m).
Once again, we can use Chernoff-Hoeffding and find that $v = \Omega(m^2 \log m)$. As the probability of incorrectly decoding one value is now 1/O(m), from a union bound the probability of incorrectly decoding any of the m values is O(1). By suitably choosing the constant factors, we can set this probability to be, say 1%. We therefore have that $v = \Omega(m^2 \log m) = \Omega(n^2 \log m \log^2 q) = \Omega(n^4 \log n)$.
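The scaling of the repetition factor can be made concrete with the Hoeffding bound in its standard form: if a single shot is correct with probability $\frac{1}{2} + \epsilon$, a majority vote over v shots fails with probability at most $\exp(-2 v \epsilon^2)$. The sketch below plugs in the bias of a barely decodable component, $g_i = q/(cm)$, and asks for a total failure probability of 1% after a union bound over the m components. Function names and default values are illustrative, and constants are not optimized.

```python
import math

def single_shot_bias(m, c=400):
    """Bias above 1/2 for a barely decodable component, g_i = q / (c * m)."""
    return 0.5 * math.sin(2 * math.pi / (c * m))

def majority_failure_bound(v, eps):
    """Hoeffding upper bound on the failure probability of a v-shot majority vote."""
    return math.exp(-2 * v * eps ** 2)

def repetitions_needed(m, c=400, total_failure=0.01):
    """Smallest v for which the union bound over m components is <= total_failure."""
    eps = single_shot_bias(m, c)
    per_component = total_failure / m
    return math.ceil(math.log(1 / per_component) / (2 * eps ** 2))
```

Doubling m multiplies `repetitions_needed(m)` by slightly more than four, reflecting the $m^2 \log m$ growth. The absolute numbers are large because c = 400 is a deliberately conservative constant.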

Phase encoding for general p
The analysis from the previous subsections was concerned with the case p = 2. We now adapt this to the general case of $p = O(\sqrt{mn \log q})$. As we expect p to be a power of 2, the rounding $\lfloor g_i \rceil_p$ for any value of $g_i$ is exactly a $(\log_2 p)$-bit number. What we have been doing so far with the phase encoding is to encode the most significant bit of $f_i = \lfloor g_i \rceil_p$ in phase. What about the other $\log_2 p - 1$ bits? The solution is simply to phase encode those bits as well.
Lemma 4.4. Applying the phase encoding to the $\log_2 p$ significant bits of every $g_i \in \mathbb{Z}_q$ leads to a repetition factor $v = \Omega(n^4 \log^2 n)$ in order to achieve the same guarantees as Theorem 4.2.
Proof. Specifically, the k'th significant bit of $g_i$ can be encoded in phase in the same way as the most significant bit, with the effective field size reduced from q to $\sim \frac{q}{2^k}$. How does this affect the decodability results of the previous sections? The expected probability of decoding a single bit, without repetition, will still be negligibly close to 0.82. This is because, as we saw in Subsection 4.2.1, the deviation from this expectation is inverse in the square of the field size, which is now $\sim \frac{q}{2^k}$. As $k \leq \log_2 p$ and $p = O(\sqrt{mn \log q})$, so that $2^k = O(\sqrt{mn \log q})$, and $q = 2^{O(n)}$, the deviation from the expected value of 0.82 remains negligible in n (or λ).
The decodability boundary, from Subsection 4.2.2, also changes from $\frac{q}{cm}$ to $\frac{q}{2^k cm}$. As $2^k = O(\sqrt{mn \log q})$ and $m = \Omega(n \log q)$, the boundary becomes $\frac{q}{c' n^4}$, for some constant $c' > 0$. Following the same steps as in Subsection 4.2.2, to ensure that most states can be correctly decoded, we see that the number of repetitions remains $\Omega(n^4)$. But this is just for the m-bit vector containing the k'th most significant bit of each component. As we have $\log_2 p$ such vectors, and we want all of them to be decoded correctly, we need to add an additional $\log_2 p$ factor so that overall we have $v = \Omega(n^4 \log n \log_2 p) = \Omega(n^4 \log^2 n)$.
Thus, for each b and x, the state the prover will prepare is

Constant-depth circuit implementation
Here we show that the phase encoding construction can be performed in constant quantum depth.
12 Each measurement is viewed as an i.i.d. random variable. The empirical mean of these variables is expected to be close to 1/2 + 1/O(m). Chernoff-Hoeffding tells us that a deviation of ε from this expected value occurs with probability $\exp(-v \epsilon^2)$. Thus, since the case of interest is ε = 1/O(m), we can see that to have a constant probability of incorrectly decoding, it must be that $v = \Omega(m^2)$.
Proof. We have already mentioned that cat states can be prepared in constant quantum depth with one quantum-classical interleaving. Let us then assume that we have sufficient cat states (of a size that will be determined later) and see how we can apply the required phases in constant quantum depth.
Recall that $g(b, x) = Ax + b \cdot (As + e)$, and determines the phase 13 $\phi_i = \frac{2\pi g_i}{q} - \frac{\pi}{2}$. The phase can then be expressed as $\phi_i = \frac{2\pi}{q}(Ax)_i + \frac{2\pi b}{q}(As + e)_i - \frac{\pi}{2}$. Note that the last two terms only depend on b and not on x. Having multiple copies of b, we can easily apply this part of the rotation in parallel using Z-rotations ($R_z$) and controlled-Z-rotations ($CR_z$), the latter being controlled on b. The corresponding circuit is shown in Figure 4.
Figure 4: The quantum circuit for the vector addition operations in phase encoding. Here X0 is the first qubit of the X register that stores information of b. Zi,j is the j'th qubit of the i'th cat state which stores information of φi.
We now need to implement the phase-encoded matrix-vector multiplication in parallel on the cat state. Note that $x_j$ is a non-negative integer less than q and it can be expanded as $x_j = \sum_{k} 2^k x_{j,k}$, denoting the k'th significant bit of $x_j$ by $x_{j,k}$. The phase can be further expanded: $\frac{2\pi}{q}(Ax)_i = \frac{2\pi}{q} \sum_{j=1}^{n} \sum_{k} 2^k A_{i,j}\, x_{j,k}$. Therefore, the desired phase can be applied to the cat state by parallel controlled-Z-rotation gates in constant quantum depth. Specifically, for each pair (j, k) a $CR_z$ gate with angle $\frac{2\pi\, 2^k A_{i,j}}{q}$ is controlled on the bit $x_{j,k}$, where the $CR_z$ gates can be performed in parallel if the size of the cat state is $\Omega(n \log q) = \Omega(n^2)$. The local quantum circuit for multiplying $A_{i,j}$ with the k'th significant bit of $x_j$ is shown in Figure 5. Thus, all operations can be performed in constant quantum depth.
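The claim that bit-controlled rotations add up to the intended phase can be checked classically by tracking rotation angles only. In the sketch below, each set bit of each $x_j$ contributes the angle of one controlled rotation, b contributes one further controlled rotation, and a fixed $-\pi/2$ is added at the end. The bit ordering (index k carries weight $2^k$), the function names and the toy parameters are assumptions made for illustration.

```python
import cmath
import math
import random

def bits_of(value, width):
    """Binary expansion of value; index k carries weight 2**k."""
    return [(value >> k) & 1 for k in range(width)]

def accumulated_phase(A_row, u_i, b, x, q):
    """Sum the angles of all controlled rotations acting on component i."""
    width = (q - 1).bit_length()
    numerator = 0
    for a_ij, x_j in zip(A_row, x):
        for k, bit in enumerate(bits_of(x_j, width)):
            if bit:
                numerator += a_ij * (1 << k)   # rotation by 2*pi*A_ij*2^k/q
    if b:
        numerator += u_i                        # rotation by 2*pi*(As+e)_i/q
    # Angles are defined modulo 2*pi, so the numerator can be reduced mod q.
    return 2 * math.pi * (numerator % q) / q - math.pi / 2

def direct_phase(A_row, u_i, b, x, q):
    """phi_i = 2*pi*g_i/q - pi/2 with g_i = (A x + b*u)_i mod q."""
    g_i = (sum(a * xj for a, xj in zip(A_row, x)) + b * u_i) % q
    return 2 * math.pi * g_i / q - math.pi / 2
```

Both functions give the same phase factor $e^{i\phi_i}$ for any input, which is the statement that the parallel controlled rotations implement the matrix-vector product in phase.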
It is worth noting that in current physical realizations of quantum computers, these (controlled) rotations can be performed directly by tuning microwave frequencies for superconducting qubits [Wen17] or laser frequencies for trapped-ions [BCMS19]. Alternatively, if one insists on having a fixed-size gate set, [HŠ05] provides a constant-depth implementation with 1/poly error which is also acceptable.
The Hadamard measurements discussed in the previous sections are performed by measuring X on each qubit of a phase encoded cat state and then taking the parity of the outcomes.
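To see why the parity of the single-qubit outcomes carries the phase information, the outcome distribution can be computed directly for a small cat state. The sketch below is an illustration under stated assumptions: the state is taken to be $(|0\rangle^{\otimes v} + e^{i\phi}|1\rangle^{\otimes v})/\sqrt{2}$, the outcome bit 0 labels $|+\rangle$ and 1 labels $|-\rangle$, and the function name is hypothetical. It enumerates all X-basis outcome strings and sums the probabilities by parity.

```python
import cmath
import math
from itertools import product


def parity_distribution(phi: float, v: int) -> dict:
    """Probability of even (0) and odd (1) parity when every qubit of
    (|0...0> + e^{i phi}|1...1>)/sqrt(2) is measured in the X basis."""
    probs = {0: 0.0, 1: 0.0}
    # <s|0...0> = 2^{-v/2} and <s|1...1> = (-1)^{|s|} 2^{-v/2} for an
    # X-basis string s with |s| outcomes equal to |->.
    norm = 2 ** (-(v + 1) / 2)
    for s in product((0, 1), repeat=v):
        weight = sum(s)
        amplitude = norm * (1 + (-1) ** weight * cmath.exp(1j * phi))
        probs[weight % 2] += abs(amplitude) ** 2
    return probs


dist = parity_distribution(math.pi / 3, 4)
```

Under these assumptions the even-parity probability is $(1 + \cos\phi)/2$ independently of $v$, which matches what a single X measurement on $(|0\rangle + e^{i\phi}|1\rangle)/\sqrt{2}$ would give.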

LWR-based protocol with phase encoding
The protocol using the LWR-based NTCF and the phase encoding is outlined in Figure 6. The verifier behaves essentially the same as in the BCMVV protocol. The major difference is in the prover's honest strategy, which requires it to perform the constant-depth evaluation of the phase encoding.
As we saw in the previous subsections, due to the randomness over the choice of A and the probabilistic nature of the measurements, the protocol is not perfectly complete. That is, the success probability for the honest prover is no longer 100% as in the original BCMVV protocol. Before accounting for all sources of "imperfections" we first need to examine the post-measurement state in the preimage register after the prover performs step 2 in the protocol. Ideally, we would like this state to be as close as possible to an equal superposition over valid preimages. Thus, in the next subsection we compute a bound on the fidelity of the true state with respect to an ideal state.

Fidelity of the post-measurement state and the success probability for an honest prover
We wish to determine the success probability of an honest prover in the protocol. To do so, we need to characterize the prover's state after it measures the phase-encoded image register. We will show that the state in the preimage register (post-measurement of the phase-encoded image register) has high overlap with the "ideal" preimage state that would have been obtained if the prover performed the evaluation in the computational basis, rather than in phase. With this result, we can then compute the protocol's completeness in the next subsection.
Modified BCMVV protocol
Let λ = n be a security parameter and N ≥ 1 a number of rounds. The parties taking part in the protocol are a PPT machine, known as the verifier, and a QPT machine, known as the prover. They will repeat the following steps N times:
1. The verifier generates $(k, t_k) \leftarrow \mathrm{Gen}(1^\lambda)$. It sends k to the prover.
2. The prover uses k to implement the phase encoding of the function $g_k(b, x)$, and prepare the following state: The prover then measures the Z register in the Hadamard basis. By conducting majority votes for the parities of the Hadamard measurement outcomes of every block $(|0\rangle + e^{i\phi_{i,k}}|1\rangle)^{\otimes v}$, the prover obtains a new string $y \in \{0, 1\}^{m \log_2 p}$ which it sends to the verifier.
The remaining state is $|\psi_y\rangle$ =

To start the proof we will consider splitting the prover's measurement of the image register into two steps. First, the prover measures in the Hadamard basis all but one qubit from each phase encoded state in the image register. Then, it measures the remaining unmeasured qubits as well. This separation is fictitious, as in the protocol the prover will measure all qubits of the image register in one step. But performing this separation and considering the prover's state after it measures all but one qubit of each phase encoded state will make the analysis simpler. Let us begin with the honest prover's state after performing the coherent evaluation of the function in phase,

where, as before,

Also recall that each component $|\phi_{i,k}\rangle$ has the form of a rotated cat state

The prover will measure each qubit of such a state (or, more precisely, of the coherent superposition of such states) in the Hadamard basis. It should be clear that when measuring all but one qubit in the Hadamard basis, the state of that qubit becomes

where the ± relative phase is determined by the parity of the Hadamard basis measurement outcomes. Without loss of generality, let us fix$^{14}$ this phase as +.
We now rewrite each component $|\phi_{i,k}\rangle$ as

The substring with index i, k represents the measurement outcomes of $|\phi_{i,k}\rangle^{\otimes v}$. We can then write the state as

where $z_{i,k,j}$ denotes the j'th bit of the substring $z_{i,k}$, and $\alpha(z_{i,k} \mid \phi_{i,k}, v)$ is the product of the pure phases $\alpha(z_{i,k,j} \mid \phi_{i,k})$ with j ranging from 1 up to v. The entire phase encoded state $|\phi(b, x)\rangle$ can then be expressed as:

Finally, the state of the coherent phase encoding evaluation in Equation 51 (but after the prover has measured all but one qubit of each phase-encoded cat state) can be expressed as well:

Recall that we aim to estimate the success probability of an honest prover. To do so, we can first find an ideal state such that, if the prover holds that state, it would very likely succeed in the protocol. The success probability can therefore be estimated by evaluating the fidelity between the real and the ideal states, then evaluating the success probability if the prover holds the ideal state. Denoting the ideal state by $|\psi_{\mathrm{ideal}}\rangle$ and the procedure of majority voting by Maj$^{15}$, we let

where c is a normalization constant, and $x_0$ and $x_1$ form a claw, $f(0, x_0) = f(1, x_1)$. It should be clear why $|\psi_{\mathrm{ideal}}\rangle$ is considered ideal, since the state in the BX register, conditioned on having measured Z, will be a superposition of the claw $((0, x_0), (1, x_1))$. This is due to the fact that $\mathrm{Maj}(z) = f(0, x_0)$, which ensures that the image $f(0, x_0)$ can be perfectly decoded. Hence, only the claw $((0, x_0), (1, x_1))$ will be consistent with this outcome of the image register. We now show the following:

Proof. Let us first give a lower bound of c, where recall that c is the normalization constant in Equation 59. We showed in Theorem 4.2 that at least 99% of the $|\phi\rangle$'s are decodable. In other words, we have

and

for at least 99% of the possible $x_0$'s.

Keeping in mind that $f(0, x_0) = f(1, x_1)$, the normalization condition leads to
$$\frac{c^2}{2q^n} \sum_{x_0 \in \mathbb{Z}_q^n} \; \sum_{\mathrm{Maj}(z) = f(0, x_0)} \Big( \Pr_M\big(z \mid |\phi(0, x_0)\rangle\big) + \Pr_M\big(z \mid |\phi(1, x_1)\rangle\big) \Big) = 1,$$
which implies that
$$1 < c^2 \le \frac{2q^n}{0.99 \cdot (0.99 + 0.99)\, q^n} \approx 1.02.$$
The fidelity can be computed as

In the ideal state, every z measurement outcome corresponds to exactly two $|\phi(b, x)\rangle$ states that form a claw of f. Supposing a specific z is measured, the remaining post-measurement state in the BX register will be

Recall that the honest prover would certainly succeed in the protocol with an equal superposition over the claw (without any relative phase between the components):

Unfortunately, the state in Equation 65, resulting from the measurement of $|\psi_{\mathrm{ideal}}\rangle$, is not of this form due to the presence of the phases $\alpha(z \mid \phi(b, x_b))$ which could lead to a non-negligible relative phase. We now show that this relative phase is in fact close to zero. To do so, consider a "more ideal state" $|\psi_{\mathrm{ideal},2}\rangle$:

where $c' \in \mathbb{R}$ is another normalization factor. Note that in this state the two components corresponding to the preimage register share the same phase, $\alpha(z \mid \phi(0, x_0))$, meaning that there is no relative phase. We start by bounding the normalization constant $c'$ from the norm of the state: $\frac{2q^n}{0.99 q^n \cdot (0.99 + 0.99)}$.
It should be clear that if the prover holds |ψ ideal,2 , it would succeed in the equation and preimage tests with 100% probability. Thus, to calculate the success probability of the real prover in our protocol, we simply evaluate the fidelity between |ψ ideal and |ψ ideal,2 .

The inner product $\langle\phi(1, x_1)|\phi(0, x_0)\rangle$ can also be evaluated by considering their phase encoded form. We start with

As both states are phase encodings, the inner product will be determined by the angle differences between the components. In other words, letting

and noting that $g_i(0, x) = (Ax)_i$ and $g_i(1, x - s) = (Ax)_i + e_i$, it is the case that

We can now express the inner product as

But now note that $\|e\|_\infty \le \frac{cq}{p^5}$, for some constant $c > 0$, as per Definition 4.1. If we substitute this into the formula for $\Delta\phi_{i,k}$, keeping in mind that $2^k \le p$, we find that

Taking n to be sufficiently large, so that p is sufficiently large, leads to

and i,k

But now $p^8 = O\big((mn \log q)^4\big) = O(n^{16})$ and $mv \log_2 p = O(n^2 \cdot n^4 \log n \cdot \log n) = O(n^6 \log^2 n)$. It follows that

For the phase part

and similarly

Finally,

and the fidelity can be lower-bounded as follows
$$|\langle\psi_{\mathrm{ideal}}|\psi_{\mathrm{ideal},2}\rangle|^2 \ge \frac{1}{2q^n}\left(0.99 q^n \cdot 0.99 + \sum_{x_0}\left(1 - \frac{1}{\mathrm{poly}(n)}\right)\right)$$
for sufficiently large n.
Combining Lemmas 4.5 and 4.6, we conclude that the success probability for an honest prover is lower bounded by 0.95, using a union bound.

Completeness
We can now compute the probability for an honest prover, following the strategy outlined in Figure 6, to pass the verifier's checks. We start with the observation that q is prime. As mentioned, this would require the prover to create a superposition in the preimage register of $q^n$ components. Instead, the prover creates a superposition of $q'^n$ components, where $q'$ is a power of 2 that is close to q. From the results in Subsection 4.1.1, we incur an $O(n^{-1})$ penalty in the honest prover's success probability as a result of this. Next, we saw that when performing the measurement of the image register, there is a chance that the $|\phi(b, x)\rangle$ state contains components that are undecodable. We limited the probability of this happening to 1%, with the parameter choices mentioned in Subsection 4.2.2. Assuming the state is decodable, we saw that the probability of incorrectly decoding is also 1%. With these results, we showed in Subsection 4.3.1 that the prover's state, upon measuring the image register (and successfully decoding the result, which is sent to the verifier), gives it at least a 95% success probability in the equation and preimage tests. This also accounted for the failure probability of incorrectly decoding the image register. Finally, as discussed in Subsection 4.2.4, if we choose to use a fixed-size gate set, we will incur another 1/poly(n) error.
Putting everything together, we find that the overall completeness of the protocol is $95\% - O(n^{-1})$.

Soundness
Since we showed in Subsection 4.1 that the LWR-based function f(b, x) is also an NTCF, our new constant quantum depth protocol inherits the soundness of the original BCMVV protocol.

Resource estimation
As in Subsection 3.3, we summarize the resources required for an honest prover to succeed in the protocol.

Quantum depth and quantum-classical interleavings
1. Preparation of cat states. As in the randomized encoding construction, the depth of this step is 5 and the prover interleaves constant-depth quantum computation and classical log-depth computation once.
2. Evaluation of the LWR function by phase encoding. As illustrated in Figure 5, this step consists of only parallel $CR_z$ gates or $R_z(\pi/2)$ gates. The depth added is only 1 for the example case in Figure 5.
3. Measurement of the Z register. As explained in Subsection 4.2.2, the measurement of the Z register contains Hadamard measurements and a majority vote (performed classically on the measurement outcome), hence this step has quantum depth 2 and adds 1 step of quantum-classical interleaving.
4. Preimage test/equation test. Exactly the same as in the BCMVV protocol, this step requires at most depth 2 and 1 interleaving for the equation test.

In summary, the phase encoding construction requires even shorter quantum depth than the generic construction, as the overall quantum depth is 5 + 1 + 2 + 2 = 10. The number of quantum-classical interleavings is 3, the same as in the generic construction.
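The tally of depth and interleavings can be reproduced with a few lines of Python. This is only a bookkeeping sketch: the step names and the tuple layout are hypothetical, and the per-step numbers are the ones quoted in the list above (with the last step counted at its stated upper bound of depth 2).

```python
# (step name, quantum depth, quantum-classical interleavings)
STEPS = [
    ("cat state preparation", 5, 1),
    ("phase-encoded LWR evaluation", 1, 0),
    ("Z register measurement", 2, 1),
    ("preimage/equation test", 2, 1),
]

total_depth = sum(depth for _, depth, _ in STEPS)
total_interleavings = sum(inter for _, _, inter in STEPS)

print(total_depth, total_interleavings)  # prints 10 3
```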

Circuit width
The total width of the circuit is determined by the product of several multipliers in the protocol:

In summary, the total circuit width required is $O(n^8 \log^3 n)$. Although this is still a high-order polynomial, it is a significant improvement over the randomized encoding construction (where we estimated $O(n^{33})$ width). Note that the normal, poly-depth, construction requires $O(m \log q) = O(n^3)$ width. It is also worth mentioning that there can be a trade-off between the size of the cat states and the depth of the circuit, since the matrix multiplication does not need to be fully parallelized. In practice, one can double the number of $CR_z$ gates applied on each qubit to halve the width.

Robustness against noise
Another feature of our phase encoding construction is some amount of intrinsic robustness against noise, which makes it closer to practical use on near-term devices.
The key reasons for the noise-resistance are the use of cat states, the classical repetition code we applied in measuring the Z register, as is discussed in Subsection 4.2.2, the error-correcting properties of the LWR construction which we used implicitly in Subsection 4.3.1 and the constant gap between the best quantum strategy and the best classical strategy (assuming intractability of LWE) as encapsulated by Inequality 17.
We can therefore see that errors on the image register, Z, may lead to bit flips of the output string z such that $z \neq y$ (where recall that y is the ideal decoding). However, since any bit $z_i$ is determined by majority voting over all v repetitions of the phase encoding of that bit, the probability that $z_i$ is flipped is much smaller than that of a single bit flip. Intuitively speaking, some correctly measured bits may be flipped due to noise that might appear in any stage of the protocol, but incorrect bits are equally likely to be flipped. Hence the majority vote will still very likely output $z_i = y_i$.
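The suppression provided by the majority vote can be quantified with a short calculation. The sketch below makes simplifying assumptions that are not taken from the protocol analysis: each of the v repetitions is treated as an independent sample that is correct with probability `p_correct` (a hypothetical parameter that lumps together the intrinsic measurement bias and any noise), and v is odd so that ties cannot occur. It computes the exact binomial probability that the majority is wrong and compares it with the standard Hoeffding bound $\exp(-2v\varepsilon^2)$ for bias $\varepsilon = p_{\mathrm{correct}} - 1/2$.

```python
from math import comb, exp


def majority_error(p_correct: float, v: int) -> float:
    """Exact probability that a majority vote over v (odd) independent
    samples, each correct with probability p_correct, is wrong."""
    if v % 2 == 0:
        raise ValueError("v must be odd so that no tie can occur")
    # The vote is wrong when at most (v - 1) / 2 samples are correct.
    return sum(
        comb(v, k) * p_correct**k * (1 - p_correct) ** (v - k)
        for k in range(v // 2 + 1)
    )


def hoeffding_bound(eps: float, v: int) -> float:
    """Hoeffding upper bound on the same event for bias eps."""
    return exp(-2 * v * eps**2)


for v in (1, 11, 101):
    print(v, majority_error(0.6, v), hoeffding_bound(0.1, v))
```

In this toy model the error probability decays exponentially in $v\varepsilon^2$, which is the reason a bias of order $1/m$ calls for $v$ of order $m^2$ repetitions.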
A repetition code is also used indirectly in the preimage register, as the preimages are encoded in cat states. While this makes the preimage test robust to noise, the equation test will not be, in general. This is because in the equation test, the prover needs to report a string d and a bit b such that
$$d \cdot (\bar{x}_0 \oplus \bar{x}_1) = b \qquad (83)$$
where $\bar{x}_0$ and $\bar{x}_1$ are the repetition code encodings of preimages $x_0$ and $x_1$ (that match the image the prover returned in the previous round of the protocol). In this case we can see that even a single bit flip in either the string d or the bit b can make the equation invalid. We therefore leave it as an open problem to find a fully noise-robust implementation of the protocol.
Definition A.1 (Branching programs [AIK04]). A branching program (BP) is defined by a tuple BP = (G, φ, s, t) where G = (V, E) is a directed acyclic graph, φ is a labeling function assigning each edge either a positive literal x i or a negative literal ¬x i . An input binary vector w determines a subgraph G w where an edge labeled as x i is preserved if and only if w i = 1. In a (counting) mod-2 BP, the BP computes the number of paths from s to t modulo 2. The size, l, of a BP is defined as the number of vertices, |V |.
As an example, Figure 7 shows a mod-2 branching program of size l = 4 and having three inputs $x = (x_0, x_1, x_2)$.

Figure 7: This size-4 mod-2 branching program consists of 5 edges whose connectivity is decided by the value of the input bits. Note that $\neg x_1$ means that this edge is available if and only if $x_1 = 0$. As an example, when the input $x = (x_0, x_1, x_2) = (0, 1, 1)$, there is only one path from s to t, which is s − 1 − 2 − t. Thus the output of this mod-2 BP will be 1.
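Evaluating a mod-2 branching program amounts to counting s-to-t paths in the subgraph selected by the input. The following Python sketch illustrates this on a hypothetical size-4 program with five edges. The edge labels are an assumption chosen for illustration and are not a transcription of Figure 7, although the program is built so that the input (0, 1, 1) also leaves exactly the path s − 1 − 2 − t.

```python
# Hypothetical size-4 mod-2 branching program (illustrative labels only).
# Vertices are numbered in topological order: s = 0, t = 3.
# Each edge is (tail, head, variable index, required value of that variable).
EDGES = [
    (0, 1, 0, 0),  # s -> 1, labeled NOT x0
    (0, 2, 0, 1),  # s -> 2, labeled x0
    (1, 2, 1, 1),  # 1 -> 2, labeled x1
    (1, 3, 1, 0),  # 1 -> t, labeled NOT x1
    (2, 3, 2, 1),  # 2 -> t, labeled x2
]


def eval_mod2_bp(edges, size, w):
    """Count s-to-t paths in the subgraph selected by input w, modulo 2."""
    paths = [0] * size
    paths[0] = 1  # one (empty) path from s to itself
    # Sorting by tail processes vertices in topological order, so the
    # path count of a tail is final before its outgoing edges are used.
    for tail, head, var, val in sorted(edges):
        if w[var] == val:
            paths[head] += paths[tail]
    return paths[size - 1] % 2


print(eval_mod2_bp(EDGES, 4, (0, 1, 1)))  # prints 1
```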
We now state one of the most important results concerning branching programs, due to Barrington:

Theorem A.1 (Barrington's theorem [Bar89]). If $f : \{0, 1\}^n \to \{0, 1\}$ can be computed by a circuit of depth d, then it can be computed by a branching program of width 5 and length $O(4^d)$.

The above theorem ensures that the log-depth (N)TCFs used in proof of quantumness protocols can be transformed into polynomial-size branching programs. Given that branching programs output a single bit, this construction has to be performed for every output bit of a (N)TCF.
A size-l mod-2 BP for a binary function f can be represented by an adjacency matrix since BPs are directed acyclic graphs. Let A(x) denote the l × l adjacency matrix of a BP with input x. We also denote as L(x) the (l − 1) × (l − 1) submatrix of A(x) − I obtained by deleting the first column and the last row. It turns out that the following fact holds:

Lemma ([AIK04]). $f(x) = \det(L(x))$, where the determinant is computed modulo 2.

This lemma is the basis for constructing a randomized encoding for f. The goal will be to "garble" L(x) through products with certain random matrices. The garbling should be done in such a way that the determinant of the resulting matrix matches that of L(x), thus preserving the correctness of the construction.
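The relation between the matrix L(x) and the output of the branching program, namely that the determinant of L(x) modulo 2 equals the number of s-to-t paths modulo 2, can be checked exhaustively on a small instance. The sketch below is an illustration under stated assumptions: the branching program `EDGES` is hypothetical, vertices are numbered in topological order with s first and t last, and all helper names are made up for this example.

```python
from itertools import product

# Hypothetical size-4 program: (tail, head, variable index, required value).
EDGES = [(0, 1, 0, 0), (0, 2, 0, 1), (1, 2, 1, 1), (1, 3, 1, 0), (2, 3, 2, 1)]
SIZE = 4


def active_adjacency(w):
    """Adjacency matrix A(x) of the subgraph selected by input w."""
    A = [[0] * SIZE for _ in range(SIZE)]
    for tail, head, var, val in EDGES:
        if w[var] == val:
            A[tail][head] = 1
    return A


def paths_mod2(A):
    """Number of paths from vertex 0 to the last vertex, modulo 2."""
    paths = [0] * SIZE
    paths[0] = 1
    for u in range(SIZE):
        for v in range(u + 1, SIZE):
            if A[u][v]:
                paths[v] += paths[u]
    return paths[-1] % 2


def L_matrix(A):
    """A - I (equal to A + I modulo 2) without first column and last row."""
    B = [[(A[i][j] + (i == j)) % 2 for j in range(SIZE)] for i in range(SIZE)]
    return [row[1:] for row in B[:-1]]


def det_gf2(M):
    """Determinant over GF(2) by Gaussian elimination."""
    M = [row[:] for row in M]
    n = len(M)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return 0
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            if M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return 1


agree = all(
    det_gf2(L_matrix(active_adjacency(w))) == paths_mod2(active_adjacency(w))
    for w in product((0, 1), repeat=3)
)
print(agree)  # prints True
```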
To that end, let $r^{(1)} \leftarrow_R \{0, 1\}^{\binom{l-1}{2}}$ and $r^{(2)} \leftarrow_R \{0, 1\}^{l-2}$. Use these to construct matrices $R^{(1)}$ and $R^{(2)}$ of dimensions (l − 1) × (l − 1). Both matrices have all diagonal elements equal to 1. The right upper-diagonal elements of $R^{(1)}$ (that is, the entries $R^{(1)}_{i,j}$ with j > i) are filled with the entries of $r^{(1)}$. The last column of $R^{(2)}$, except for the last element (that is, the entries $R^{(2)}_{i,l-1}$, 1 ≤ i ≤ l − 2), is filled with the elements of $r^{(2)}$. All other entries of $R^{(1)}$ and $R^{(2)}$ are 0. The following can be shown: $\det(R^{(1)} L(x) R^{(2)}) = \det(L(x))$. This is not too difficult to see, as both $R^{(1)}$ and $R^{(2)}$ have determinant 1. One now defines the randomized encoding $\hat{f}(x, r^{(1)}, r^{(2)}) = R^{(1)} L(x) R^{(2)}$. It follows that ([AIK04]) $\hat{f}$ is a perfect randomized encoding of f.

By construction, every entry of $\hat{f}$ is a degree-3 polynomial in its input variables. However, computing this function (i.e. computing every matrix entry of $R^{(1)} L(x) R^{(2)}$) cannot be done in constant depth. The reason is that some of the input variables are involved in a linear number of monomials of the output. To compute the function in constant depth, it must be that each input variable appears in only a constant number of monomials. The authors of [AIK04] remedy this by considering a randomized encoding for $\hat{f}$. Before doing so, note that

Lemma A.4 ([AIK04]). The composition of perfect randomized encodings is still a perfect randomized encoding of the original function.

Lemma A.5 ([AIK04]). $\tilde{f}$ is a perfect randomized encoding of f with output locality 4.
Here, output locality 4 means that each output bit depends on at most 4 input bits, which immediately implies that the function can be evaluated in constant depth. The classical circuit computing an entry of $\tilde{f}$ is shown in Figure 8. Detailed proofs of all these results can be found in [AIK04].

Figure 8: ... xored with $r_m$. For the entries with m > k, note that a single XOR gate is required.
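The garbling step $R^{(1)} L(x) R^{(2)}$ described in this appendix can be illustrated on a toy instance. The sketch below is a minimal Python illustration for l = 4, so that all matrices are 3 × 3: the two matrices `L_ONE` and `L_ZERO` are hypothetical examples of L(x) with determinant 1 and 0, the helper names are made up, and indices are 0-based. It enumerates all 32 choices of randomness and checks that the determinant modulo 2 is unchanged.

```python
from itertools import product


def build_R1(r1, n):
    """Unit upper-triangular matrix whose entries above the diagonal are r1."""
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    bits = iter(r1)
    for i in range(n):
        for j in range(i + 1, n):
            R[i][j] = next(bits)
    return R


def build_R2(r2, n):
    """Identity matrix with r2 written into the last column above the diagonal."""
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        R[i][n - 1] = r2[i]
    return R


def matmul_gf2(X, Y):
    n = len(X)
    return [[sum(X[i][k] & Y[k][j] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]


def det_gf2(M):
    """Determinant over GF(2) by Gaussian elimination."""
    M = [row[:] for row in M]
    n = len(M)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return 0
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            if M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return 1


def garble(L, r1, r2):
    n = len(L)
    return matmul_gf2(matmul_gf2(build_R1(r1, n), L), build_R2(r2, n))


L_ONE = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]   # determinant 1
L_ZERO = [[1, 0, 0], [1, 1, 0], [0, 1, 0]]  # determinant 0

preserved = all(
    det_gf2(garble(L, r1, r2)) == det_gf2(L)
    for L in (L_ONE, L_ZERO)
    for r1 in product((0, 1), repeat=3)
    for r2 in product((0, 1), repeat=2)
)
print(preserved)  # prints True
```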

B Reconstruction of randomness
In our first constant quantum-depth proof of quantumness, the prover is instructed to evaluate a randomized encoding of a TCF. The verifier must still be able to use the trapdoor in order to invert an output of the randomized encoding. As mentioned in Subsection 3.2, this is true provided the encoding satisfies the randomness reconstruction property. Here we prove this fact for the construction of [AIK04].
Proof of Lemma 2.2. We would like to show that given an instance of $\tilde{f}_{i,j}(x, r^{(1)}, r^{(2)}, r, r')$, as shown in Equation 85, as well as x, it is possible to efficiently recover the randomness $r^{(1)}, r^{(2)}, r, r'$. First note that if the terms $T_k$ were known as well as $r^{(1)}, r^{(2)}$, it is straightforward to recover r and r'. We will therefore focus on that case. From Equation 85 it is possible to efficiently compute the result of Equation 84, since $\tilde{f}$ is a randomized encoding of $\hat{f}$: simply xor all the terms in Equation 85. We will then focus on randomness reconstruction for $\hat{f}$, as that will then yield randomness reconstruction for $\tilde{f}$.
Denote as $M = \hat{f}(x, r^{(1)}, r^{(2)}) = R^{(1)} L(x) R^{(2)}$. Given M and x we wish to recover $r^{(1)}, r^{(2)}$. This boils down to solving a specific quadratic system of equations. To see why, take l = 4 as an example.

Plugging these values into the second diagonal (the one above the main diagonal) yields another system of linear equations with an equal number of unknowns. By repeating the process and solving all of these systems, all bits in $r^{(1)}$ and $r^{(2)}$ are recovered.

We now show that this strategy works for arbitrary l. Start by observing that $R^{(2)}_{i,j} = 1$ for $i = j$, and $R^{(2)}_{i,j} = 0$ for $(i > j) \lor (i < j < l - 2)$. The entries of M can then be expressed as:

Consider the entries on the main diagonal, excluding the last element:

with i < l − 2 and where $R^{(1)}_{i,i+1}$ are the elements of the second diagonal of $R^{(1)}$ and the $L_{i,i}$'s are already known (as they only involve entries of x). This gives us a simple linear system which we can solve to recover the $R^{(1)}_{i,i+1}$. From this we also recover $R^{(2)}_{l-3,l-2}$, i.e. the last entry in $r^{(2)}$. Note that the unknowns here consisted of the entries in the second diagonal of $R^{(1)}$ and the last element of $r^{(2)}$. This matches the number of equations and so all values could be recovered.
We now claim that the k'th diagonal of M is a linear system which depends only on the (k + 1)'th diagonal of $R^{(1)}$ and the k'th last element of $r^{(2)}$, given the solutions to the previous k − 1 diagonals of M. Writing out the elements, we have:

with j = k − 1. For i + j = l − 2:

where the first term is from the (k + 1)'th diagonal of $R^{(1)}$ and the remaining terms are known from solving the equations for the previous diagonals. Thus, we have a linear system, which we can solve, with unknowns comprising the elements of the (k + 1)'th diagonal of $R^{(1)}$.
The first term is a linear combination of the last k + 1 entries of R (2) , i.e. the last k elements of r (2) , and only the k'th element is unknown. The remaining terms are known from solving the systems corresponding to the previous diagonals. We can therefore proceed in this fashion, starting from the first diagonal of M and going upwards solving all systems of linear equations and thus recovering all values of r (1) and r (2) . This procedure is clearly efficient and we have shown that it is also correct. To conclude the proof, we also need to make sure that there is a unique solution to the system. This is guaranteed by the unique randomness property of the randomized encoding (Theorem 2.5).
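The uniqueness of the recovered randomness can be checked on a toy instance. The sketch below deliberately swaps the diagonal-by-diagonal linear solve described in the proof for an exhaustive search, which is only feasible for tiny l; it is an illustration and not the efficient procedure. It assumes l = 4, a hypothetical matrix `L` of the form that arises from a branching program (ones on the sub-diagonal, zeros below it, here with determinant 0), and made-up helper names.

```python
from itertools import product


def build_R1(r1, n):
    """Unit upper-triangular matrix whose entries above the diagonal are r1."""
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    bits = iter(r1)
    for i in range(n):
        for j in range(i + 1, n):
            R[i][j] = next(bits)
    return R


def build_R2(r2, n):
    """Identity matrix with r2 written into the last column above the diagonal."""
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        R[i][n - 1] = r2[i]
    return R


def matmul_gf2(X, Y):
    n = len(X)
    return [[sum(X[i][k] & Y[k][j] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]


def garble(L, r1, r2):
    n = len(L)
    M = matmul_gf2(matmul_gf2(build_R1(r1, n), L), build_R2(r2, n))
    return tuple(tuple(row) for row in M)


L = [[1, 0, 0], [1, 1, 0], [0, 1, 0]]  # hypothetical L(x), determinant 0

# Map every garbled matrix back to the randomness that produced it.
table = {}
for r1 in product((0, 1), repeat=3):
    for r2 in product((0, 1), repeat=2):
        table.setdefault(garble(L, r1, r2), []).append((r1, r2))


def reconstruct(M):
    """Return the unique (r1, r2) with garble(L, r1, r2) == M."""
    (solution,) = table[M]  # unpacking fails if the preimage is not unique
    return solution


print(len(table))  # prints 32
```

All 32 choices of randomness give distinct garbled matrices in this example, so the lookup always returns exactly one candidate, in line with the unique randomness property invoked at the end of the proof.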