Towards Quantum One-Time Memories from Stateless Hardware

A central tenet of theoretical cryptography is the study of the minimal assumptions required to implement a given cryptographic primitive. One such primitive is the one-time memory (OTM), introduced by Goldwasser, Kalai, and Rothblum [CRYPTO 2008], which is a classical functionality modeled after a non-interactive 1-out-of-2 oblivious transfer, and which is complete for one-time classical and quantum programs. It is known that secure OTMs do not exist in the standard model in either the classical or the quantum setting. Here, we propose a scheme for using quantum information, together with the assumption of stateless (i.e., reusable) hardware tokens, to build statistically secure OTMs. Via the semidefinite programming-based quantum games framework of Gutoski and Watrous [STOC 2007], we prove security for a malicious receiver making at most 0.114n adaptive queries to the token (for n the key size), in the quantum universal composability framework, but leave open the question of security against polynomially many queries. Compared to alternative schemes derived from the literature on quantum money, our scheme is technologically simple since it is of the “prepare-and-measure” type. We also give two impossibility results showing that certain assumptions in our scheme cannot be relaxed.


Introduction
Theoretical cryptography centers around building cryptographic primitives secure against adversarial attacks. In order to allow a broader set of such primitives to be implemented, one often considers restricting the power of the adversary. For example, one can limit the computing power of adversaries to be polynomially bounded [Yao82; BM82], restrict the storage of adversaries to be bounded or noisy [Mau92; CM97; Dam+05], or make trusted setups available to honest players [Kil88; BFM88; Can01; Can+02; IPS08; PR08; LPV09; MPR09; MPR10; MR11; KMQ11; Kra+14], to name a few. One well-known trusted setup is tamper-proof hardware [Kat07; GKR08], which is assumed to provide a specific input-output functionality, and which can only be accessed in a "black box" fashion. The hardware can maintain a state (i.e., is stateful) and possibly carry out complex functionality, but presumably may be difficult or expensive to implement or manufacture. This leads to an interesting research direction: building cryptographic primitives using the simplest (and hence easiest and cheapest to manufacture) hardware.
In this respect, two distinct simplified notions of hardware have captured considerable interest. The first is the notion of a one-time memory (OTM) [GKR08], which is arguably the simplest possible notion of stateful hardware. An OTM, modeled after a non-interactive 1-out-of-2 oblivious transfer, behaves as follows: first, a player (called the sender) embeds two values $s_0$ and $s_1$ into the OTM, and then gives the OTM to another player (called the receiver). The receiver can now read his choice of precisely one of $s_0$ or $s_1$; after this "use" of the OTM, however, the unread bit is lost forever. Interestingly, OTMs are complete for implementing one-time use programs (OTPs): given access to OTMs, one can [...] handling multiple challenges in a unified fashion: arbitrary quantum operations by the user, classical queries to the token, and the highly non-trivial assumption of quantum side information for the user (the "quantum key" state sent to the user).

Towards security against polynomially many queries. Regarding the prospects of proving security against polynomially many adaptive queries, we generally believe it requires a significant new insight into how to design a "good" feasible solution to the primal semidefinite program (SDP) obtained via GW. However, in addition to our proof for linear security (Theorem C.5), in Appendix D we attempt to give evidence towards our conjecture for polynomial security. Namely, Appendix D.1 first simplifies the SDPs obtained from GW, and derives the corresponding dual SDPs. We remark that these derivations apply for any instantiation of the GW framework, i.e. they are not specific to our setting, and hence may prove useful elsewhere. In Appendix D.2, we then give a feasible solution Y (Equation (128)) to the dual SDP. While Y is simple to state, it is somewhat involved to analyze. A heuristic analysis suggests that Y's dual objective function value has roughly the behavior needed to show security, i.e. the value scales as $m/\sqrt{2^n}$, for m queries and n key bits. If Y were the optimal solution to the dual SDP, this would strongly suggest the optimal cheating probability is essentially $m/\sqrt{2^n}$. However, we explicitly show Y is not optimal, and so $m/\sqrt{2^n}$ is only a lower bound on the optimal cheating probability. Nevertheless, we conjecture that while Y is not optimal, it is approximately optimal (see Conjecture D.2 for a precise statement); this would imply the desired polynomial security claim. Unfortunately, the only techniques we are aware of to show such approximate optimality involve deriving a better primal SDP solution, which appears challenging.
Further related work. Our work contributes to the growing list of functionalities achievable with quantum information, yet unachievable classically. This includes: unconditionally secure key expansion [BB84], physically uncloneable money [Wie83; MVW13; Pas+12], a reduction from oblivious transfer to bit commitment [Ben+92; Dam+09] and to other primitives such as "cut-and-choose" functionality [Feh+13], and revocable time-release quantum encryption [Unr14]. Importantly, these protocols all make use of the technique of conjugate coding [Wie83], which is also an important technique used in protocols for OT in the bounded quantum storage and noisy quantum storage models [Dam+05; WST08] (see [BS16] for a survey).
A number of proof techniques have been developed in the context of conjugate coding, including entropic uncertainty relations [WW10]. In the context of QKD, another technique is the use of de Finetti reductions [Ren08] (which exploit the symmetry of the scheme in order to simplify the analysis). Recently, semidefinite programming (SDP) approaches have been applied to analyze security of conjugate coding [MVW13] for quantum money, in the setting of one round of interaction with a "stateful" bank. SDPs are also the technical tool we adopt for our proof (Section 3.4 and Appendix C), though here we require the more advanced quantum games SDP framework of Gutoski and Watrous [GW07] to deal with multiple adaptive interactions with stateless tokens. Reference [Pas+12] has also made use of Gavinsky's [Gav12] quantum retrieval games framework.
Continuing with proof techniques, somewhat similar to [Pas+12], Aaronson and Christiano [AC12] have studied quantum money schemes in which one interacts with a verifier. They introduce an "inner product adversary method" to lower bound the number of queries required to break their scheme.
We remark that [Pas+12] and [MVW13] have studied schemes based on conjugate coding similar to ours, but in the context of quantum money. In contrast to our setting, the schemes of [Pas+12] and [MVW13] (for example) involve dynamically chosen random challenges from a verifier to the holder of a "quantum banknote", whereas in our work here the "challenges" are fixed (i.e., measure all qubits in the Z or X basis to obtain secret bit $s_0$ or $s_1$, respectively), and the verifier is replaced by a stateless token. Thus, [MVW13], for example, may be viewed as using a "stateful" verifier, whereas our focus here is on a "stateless" verifier (i.e., a token).
Also, we note that prior work has achieved oblivious transfer using quantum information, together with some assumption (e.g., bit commitment [Ben+92] or bounded quantum storage [Dam+05]). These protocols typically use an interaction phase similar to the "commit-and-open" protocol of [Ben+92]; because we are working in the non-interactive setting, these techniques appear to be inapplicable.
Finally, Liu [Liu14a; Liu14b; Liu15] has given stand-alone secure OTMs using quantum information in the isolated-qubit model. Liu's approach has the advantage of avoiding trusted setups. In return, however, Liu must use the isolated-qubit model, which restricts the adversary to perform only single-qubit operations (no entangling gates are permitted); this restriction is, in some sense, necessary if one wants to avoid trusted setups, as a secure OTM in the plain quantum model cannot exist (see Section 4). In contrast, in the current work we allow unbounded and unrestricted quantum adversaries, but as a result require a trusted setup. In addition, we remark that the security notion of OTMs of [Liu14a; Liu14b; Liu15] is weaker than the simulation-based notion studied in this paper, and it remains an interesting open question whether the type of OTM in [Liu14a; Liu14b; Liu15] is secure under composition (in the current work, the UC framework gives us security under composition for free).
Significance. Our results show a strong separation between the classical and quantum settings, since classically, stateless tokens cannot be used to securely implement OTMs. To the best of our knowledge, our work is the first to combine conjugate coding with stateless hardware tokens. Moreover, while our protocol shares similarities with prior work in the setting of quantum money, building OTMs appears to be a new focus here.
Our protocol has a simple implementation, fitting into the single-qubit prepare-and-measure paradigm, which is widely used as the "benchmark" for a "physically feasible" quantum protocol. In this model, one needs only the ability to prepare the single-qubit states $|0\rangle$, $|1\rangle$, $|+\rangle$, $|-\rangle$, and to perform single-qubit projective measurements. In particular, no entangled states are required, and in principle no quantum memory is required, since qubits can be measured one-by-one as they arrive. In addition, from a theoretical cryptographic perspective, our protocol is attractive in that its implementation requires only the assumption of a stateless hardware token, which is conceivably easier and cheaper to manufacture (e.g. analogous to an RFID tag) than a stateful token.
In terms of security guarantees, we allow arbitrary operations on behalf of a malicious quantum receiver in our protocol (i.e., all operations allowed by quantum mechanics), with the adversary restricted in that the stateless token is assumed only usable as a black box. The security we obtain is statistical, with the only computational assumption being on the number of queries made to the token (recall we show security for a linear number of queries, and conjecture security for polynomially many queries). Finally, our security analysis is in the quantum UC framework against a corrupted receiver; this means our protocol can be easily composed with many others; for example, combining our results with [BGS13]'s protocol immediately yields UC-secure quantum OTPs against a dishonest receiver.
We close by remarking that our scheme is "tight" with respect to two impossibility results, both of which assume the adversary has black-box access to both the token and its inverse operation. First, the assumption that the token be queried only in the computational basis cannot be relaxed: Section 4.1 shows that if the token can be queried in superposition, then an adversary in our setting can easily break any OTM scheme. Second, our scheme has the property that corresponding to each secret bit $s_i$ held by the token, there are exponentially many valid keys one can input to the token to extract $s_i$. In Section 4.2, we show that for any "measure-and-access" OTM (i.e., an OTM in which one measures a given quantum key and uses the classical measurement result to access a token to extract data, of which our protocol is an example), a polynomial number of keys implies the ability to break the scheme with inverse polynomial probability (more generally, $\Delta$ keys allow probability at least $1/\Delta^2$ of breaking the scheme).
Open Questions. While our work shows the fundamental advantage that quantum information yields in a stateful to stateless reduction, it does leave a number of open questions:
1. Security against polynomially many queries. Can our security proof be strengthened to show information theoretic security against a polynomial number of queries to the token? We conjecture this to be the case, but finding a formal proof has been elusive. (See discussion under "Towards security against polynomially many adaptive queries" above for details.)
2. Composable security against a malicious sender. While we show composable security against a malicious receiver, our protocol can achieve standalone security against a malicious sender. Could an adaptation of our protocol ensure composable security against a malicious sender as well?
3. Non-reversible token. Our impossibility result for quantum one-time memories with quantum queries (Section 4) assumes the adversary has access to reversible tokens; can a similar impossibility result be shown for non-reversible tokens? In Section 4, we briefly discuss why it may be difficult to extend the techniques of our impossibility results straightforwardly when the adversary does not have access to the inverse of the token.
4. Imperfect devices. While our prepare-and-measure scheme is technologically simple, it is still virtually unrealizable with current technology, due to the requirement of perfect quantum measurements. We leave open the question of tolerance to a small amount of noise.
Organization. We begin in Section 2 with preliminaries, including the ideal functionalities for an OTM and stateless token, background on quantum channels, semidefinite programming, and the Gutoski-Watrous framework for quantum games. In Section 3, we give our construction for an OTM based on a stateless hardware token; the proof ideas for security are also provided. In Section 4, we discuss "tightness" of our construction by showing two impossibility results for "relaxations" of our scheme. In the Appendix, we include the description of classical UC and quantum UC (Appendix A); Appendix B establishes notation required in the definition of stand-alone security against a malicious sender. Appendix C gives our formal security proof against a linear number of queries to the token; these results are used to finish the security proof in Section 3. Appendix D gives a simplification of the GW SDP, derives its dual, and gives a dual feasible solution which we conjecture to be approximately optimal (formally stated in Conjecture D.2). Finally, the security proof for a lemma in Section 4 can be found in Appendix E.
Quantum universal composition (UC) framework. We consider simulation-based security in this paper. In particular, we prove the security of our construction against a malicious receiver in the quantum universal composition (UC) framework [Unr10]. Please see Appendix A for a brief description of the classical UC [Can01] and the quantum UC [Unr10]. In the next two paragraphs, we introduce two relevant ideal functionalities of one-time memory and of stateless hardware token.
One-time memory (OTM). The one-time memory (OTM) functionality F OTM involves two parties, the sender and the receiver, and consists of two phases, "Create" and "Execute". Please see Functionality 1 below for details; for the sake of simplicity, we have omitted the session/party identifiers as they should be implicitly clear from the context. We sometimes refer to this functionality F OTM as an OTM token.
Functionality 1 Ideal functionality $F_{\mathrm{OTM}}$.
2. Execute: Upon input $b \in \{0, 1\}$ from the receiver, send $s_b$ to the receiver. Delete any trace of this instance.

Stateless hardware. The original work of Katz [Kat07] introduces the ideal functionality $F_{\mathrm{wrap}}$ to model stateful tokens in the UC framework. In the ideal model, a party that wants to create a token sends a Turing machine to $F_{\mathrm{wrap}}$. $F_{\mathrm{wrap}}$ then runs the machine (keeping the state) whenever the designated party asks for it. The same functionality can be adapted to model stateless tokens: it suffices that the functionality does not keep the state between two executions. A simplified version of the $F_{\mathrm{wrap}}$ functionality as shown in [CGS08] (which is very similar to the $F_{\mathrm{wrap}}$ of [Kat07]) is described below. Note that, again for the sake of simplicity, we have omitted the session/party identifiers as they should be implicitly clear from the context.

Functionality 2
Ideal functionality F wrap . The functionality is parameterized by a polynomial p(·), and an implicit security parameter n. Although the environment and adversary are unbounded, we specify that stateless hardware can be queried only a polynomial number of times. This is necessary; otherwise the hardware token model is vacuous (with unbounded queries, the entire input-output behavior of stateless hardware can be extracted).
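The role of the query bound can be illustrated with a small classical sketch. The class `StatelessToken`, the helper `extract_truth_table`, and the integer-encoded queries are hypothetical names chosen for illustration and are not part of the functionality's definition; the counter only models the bound p(n) and is not state visible to the embedded program.

```python
from typing import Callable


class StatelessToken:
    """Toy stand-in for a stateless token: the embedded program is a pure
    function of its input, so nothing is remembered between runs.
    `max_queries` models the polynomial bound p(n) (an assumption of this sketch)."""

    def __init__(self, program: Callable[[int], object], max_queries: int):
        self._program = program
        self._remaining = max_queries

    def run(self, query: int):
        if self._remaining <= 0:
            raise RuntimeError("query budget exhausted")
        self._remaining -= 1
        # Same input always gives the same output: no state is carried over.
        return self._program(query)


def extract_truth_table(token: StatelessToken, n_bits: int) -> dict:
    """With 2**n_bits queries, the full input-output behaviour is recovered."""
    return {q: token.run(q) for q in range(2 ** n_bits)}
```

With a budget of at least 2**n_bits the whole program is learned by brute force, which is why an unbounded number of queries would make the model vacuous; with a smaller budget the extraction is cut off.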
Quantum channels. We now review quantum channels. A basic background in quantum information is assumed, see e.g. [NC00] for a standard reference. A linear map $\Phi : L(\mathcal{X}) \to L(\mathcal{Y})$ is a quantum channel if $\Phi$ is trace-preserving and completely positive (TPCP). Such maps take density operators to density operators. A useful representation of linear maps (or "superoperators") $\Phi : L(\mathcal{X}) \to L(\mathcal{Y})$ is the Choi-Jamiołkowski representation, $J(\Phi) \in L(\mathcal{Y} \otimes \mathcal{X})$. The latter is defined (with respect to some choice of orthonormal basis $\{|i\rangle\}$ for $\mathcal{X}$) as $J(\Phi) = \sum_{i,j} \Phi(|i\rangle\langle j|) \otimes |i\rangle\langle j|$. The following properties of $J(\Phi)$ hold [Cho75; Jam72]: (1) $\Phi$ is completely positive if and only if $J(\Phi) \succeq 0$, and (2) $\Phi$ is trace-preserving if and only if $\mathrm{Tr}_{\mathcal{Y}}(J(\Phi)) = I_{\mathcal{X}}$. In a nutshell, the Gutoski-Watrous (GW) framework generalizes this definition to interacting strategies [GW07].
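As a numerical illustration of properties (1) and (2), the following sketch builds the Choi-Jamiołkowski matrix of a single-qubit dephasing channel and checks both conditions. The channel, its parameter `p`, and the helper names `choi` and `trace_out_first` are choices made for this example only.

```python
import numpy as np


def choi(kraus, d_in):
    """J(Phi) = sum_{i,j} Phi(|i><j|) (x) |i><j| for Phi(X) = sum_k K X K^dagger."""
    d_out = kraus[0].shape[0]
    J = np.zeros((d_out * d_in, d_out * d_in), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            E = np.zeros((d_in, d_in), dtype=complex)
            E[i, j] = 1.0
            out = sum(K @ E @ K.conj().T for K in kraus)
            J += np.kron(out, E)
    return J


def trace_out_first(J, d_out, d_in):
    """Partial trace over the output factor Y of an operator on Y (x) X."""
    return np.einsum('ajak->jk', J.reshape(d_out, d_in, d_out, d_in))


p = 0.3  # dephasing strength (example value)
I2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
dephasing = [np.sqrt(1 - p) * I2, np.sqrt(p) * Z]

J = choi(dephasing, 2)
is_cp = np.linalg.eigvalsh(J).min() >= -1e-12                # property (1): J >= 0
is_tp = np.allclose(trace_out_first(J, 2, 2), np.eye(2))     # property (2): Tr_Y J = I_X
```

Both flags evaluate to true for this channel; rescaling the Kraus operators so that they no longer sum to the identity breaks property (2) while leaving property (1) intact.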
Semidefinite programs. We give a brief overview of semidefinite programs (SDPs) from the perspective of quantum information, as done e.g., in the notes of Watrous [Wat11] or [MVW13]. For further details, a standard text on convex optimization is Boyd and Vandenberghe [BV04].
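For orientation, the standard primal/dual pair in the form used in Watrous's notes [Wat11] can be written as follows. This is a sketch under the assumption that the presentation here follows that form; the triple $(A, B, \Phi)$, with $A$ and $B$ Hermitian and $\Phi$ Hermiticity-preserving, is the notation of that standard form.

```latex
\begin{align*}
\text{Primal problem (P):}\quad & \sup\ \mathrm{Tr}(AX) && \text{s.t. } \Phi(X) = B,\ X \succeq 0,\\
\text{Dual problem (D):}\quad   & \inf\ \mathrm{Tr}(BY) && \text{s.t. } \Phi^*(Y) \succeq A,\ Y = Y^\dagger.
\end{align*}
```

In this form weak duality holds: the value of any primal feasible $X$ is at most the value of any dual feasible $Y$.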

Dual problem (D)
$\inf \mathrm{Tr}(BY)$
where $\Phi^*$ denotes the adjoint of $\Phi$, which is the unique map satisfying $\mathrm{Tr}(A^\dagger \Phi(B)) = \mathrm{Tr}((\Phi^*(A))^\dagger B)$ for all $A \in L(\mathcal{Y})$ and $B \in L(\mathcal{X})$. Not all SDPs have feasible solutions (i.e. a solution satisfying all constraints); in this case, we label the optimal values as $-\infty$ for P and $\infty$ for D, respectively. Note also that the SDP we derive in Equation (66) will for simplicity not be written in precisely the form above, but can without loss of generality be made so.

Figure 1: A general interaction between two quantum parties.

The Gutoski-Watrous framework for quantum games
We now recall the Gutoski-Watrous (GW) framework for quantum games [GW07], which can be used to model quantum interactions between spatially separated parties. The setup most relevant to our protocol here is depicted in Figure 1. Here, we imagine one party, A, prepares an initial state $\rho_0 \in D(\mathcal{X}_1 \otimes \mathcal{W}_0)$. Register $\mathcal{X}_1$ is then sent to the second party, B ($\mathcal{W}_0$ is kept by A as private memory), who applies some quantum channel $\Phi_1 : L(\mathcal{X}_1) \to L(\mathcal{Y}_1 \otimes \mathcal{Z}_1)$. B keeps register $\mathcal{Z}_1$ as private memory, and sends $\mathcal{Y}_1$ back to A, who applies channel $\Psi_1 : L(\mathcal{W}_0 \otimes \mathcal{Y}_1) \to L(\mathcal{X}_2 \otimes \mathcal{W}_1)$, and sends $\mathcal{X}_2$ to B. The protocol continues for m messages back and forth, until the final operation $\Psi_m : L(\mathcal{W}_m \otimes \mathcal{Y}_m) \to \mathbb{C}$, in which A performs a two-outcome measurement (specifically, a POVM $\Lambda = \{\Lambda_0, \Lambda_1\}$, meaning $\Lambda_0, \Lambda_1 \succeq 0$ and $\Lambda_0 + \Lambda_1 = I$) in order to decide whether to reject ($\Lambda_0$) or accept ($\Lambda_1$). As done in [GW07], we may assume without loss of generality that all channels are given by linear isometries $A_k$, i.e. $\Phi_k(X) = A_k X A_k^\dagger$. Reference [GW07] refers to $(\Phi_1, \ldots, \Phi_m)$ as a strategy and $(\rho_0, \Psi_1, \ldots, \Psi_m)$ as a co-strategy. In our setting, the former is "non-measuring", meaning it makes no final measurement after $\Phi_m$ is applied, whereas the latter is "measuring", since we will apply a final measurement on space $\mathcal{W}_m$ (not depicted in Figure 1).
Intuitively, since our protocol (Section 3.1) will begin with the token sending the user a quantum key $|x\rangle_\theta$, we will later model the token as a measuring co-strategy, and the user of the token as a strategy. The advantage to doing so is that the GW framework allows one to (recursively) characterize any such strategy (resp., co-strategy) via a set of linear (in)equalities and positive semi-definite constraints. (In this sense, the GW framework generalizes the Choi-Jamiołkowski representation for channels to a "Choi-Jamiołkowski" representation for strategies/co-strategies.) To state these constraints, we first write down the Choi-Jamiołkowski (CJ) representation of a strategy (resp., measuring co-strategy) from [GW07].
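CJ representation of (non-measuring) strategy. The CJ representation of a strategy $(A_1, \ldots, A_m)$ is given by the matrix [GW07]
$$\mathrm{Tr}_{\mathcal{Z}_m}\left(\mathrm{vec}(A)\,\mathrm{vec}(A)^\dagger\right),$$
where $A \in L(\mathcal{X}_1 \otimes \cdots \otimes \mathcal{X}_m, \mathcal{Y}_1 \otimes \cdots \otimes \mathcal{Y}_m \otimes \mathcal{Z}_m)$ is defined as the product of the isometries $A_i$, and the mapping $\mathrm{vec} : L(\mathcal{S}, \mathcal{T}) \to \mathcal{T} \otimes \mathcal{S}$ is the linear extension of the map $|i\rangle\langle j| \mapsto |i\rangle|j\rangle$ defined on all standard basis states $|i\rangle$, $|j\rangle$.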
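For a single round (m = 1) the expression above can be checked numerically. The sketch below draws a random isometry $A : \mathcal{X} \to \mathcal{Y} \otimes \mathcal{Z}$, computes $\mathrm{Tr}_{\mathcal{Z}}(\mathrm{vec}(A)\mathrm{vec}(A)^\dagger)$, and compares it with the Choi-Jamiołkowski matrix of the induced channel $X \mapsto \mathrm{Tr}_{\mathcal{Z}}(A X A^\dagger)$. The dimensions and the random seed are arbitrary example values.

```python
import numpy as np


def vec(A):
    """vec(|i><j|) = |i>|j>, i.e. row-major flattening of the matrix A."""
    return A.reshape(-1)


dX, dY, dZ = 2, 2, 3
rng = np.random.default_rng(0)
G = rng.normal(size=(dY * dZ, dX)) + 1j * rng.normal(size=(dY * dZ, dX))
A, _ = np.linalg.qr(G)  # reduced QR: orthonormal columns, so A is an isometry X -> Y (x) Z

# Tr_Z( vec(A) vec(A)^dagger ), with vec(A) indexed as (y, z, x)
v = vec(A).reshape(dY, dZ, dX)
Q = np.einsum('yzx,wzv->yxwv', v, v.conj()).reshape(dY * dX, dY * dX)

# Choi matrix of the induced channel X -> Tr_Z(A X A^dagger), for comparison
J = np.zeros_like(Q)
for i in range(dX):
    for j in range(dX):
        E = np.zeros((dX, dX), dtype=complex)
        E[i, j] = 1.0
        out = (A @ E @ A.conj().T).reshape(dY, dZ, dY, dZ)
        J += np.kron(np.einsum('yzwz->yw', out), E)

same = np.allclose(Q, J)
```

The two matrices agree, which is the single-round case of the statement that the GW representation generalizes the Choi-Jamiołkowski representation of a channel.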
CJ representation of (measuring) co-strategy. Let $\Lambda := \{\Lambda_0, \Lambda_1\}$ denote a POVM with reject and accept measurement operators $\Lambda_0$ and $\Lambda_1$, respectively. A measuring strategy which ends with a measurement with respect to POVM $\Lambda$ replaces, for $\Lambda_a \in \Lambda$, Equation (2) with [GW07] [...] To convert this to a co-strategy, one takes the transpose of the operators defined above (with respect to the standard basis). (Note: In our use of the GW framework in Section C.1, all operators we derive will be symmetric with respect to the standard basis, and hence taking this transpose will be unnecessary.)

Optimization characterization over strategies and co-strategies. With CJ representations for strategies and co-strategies in hand, one can formulate [GW07] the optimal probability with which a strategy can force a corresponding co-strategy to output a desired result as follows. Fix any $Q_a$ from a measuring co-strategy $\{Q_0, Q_1\}$, as in Equation (6). Then, Corollary 7 and Theorem 9 of [GW07] show that the maximum probability with which a (non-measuring) strategy can force the co-strategy to output result a is given by

min: $p$ (7)
subject to: $Q_a \preceq p R_m$ (8)

Intuition. The minimum p denotes the optimal "success" probability, meaning the optimal probability of forcing the co-strategy to output a (Theorem 9 of [GW07]). The variables above, in addition to p, are $\{R_i\}$ and $\{P_i\}$, where the optimization is happening over all m-round co-strategies $R_m$ satisfying Equation (8). How do we enforce that $R_m$ encodes such an m-round co-strategy? This is given by the (recursive) Equations (9)-(13). Specifically, Corollary 7 of [GW07] states that $R_m$ is a valid m-round co-strategy if and only if all of the following hold: (1) $R_m \succeq 0$, (2) $R_m = P_m \otimes I_{\mathcal{Y}_m}$ for $P_m \succeq 0$ and $\mathcal{Y}_m$ the last incoming message register to the co-strategy, (3) $\mathrm{Tr}_{\mathcal{X}_m}(P_m)$ is a valid (m − 1)-round co-strategy (this is the recursive part of the definition).
An intuitive sense as to why conditions (2) and (3) should hold is as follows: For any m-round co-strategy $R_m$, let $R_{m-1}$ denote $R_m$ restricted to the first m − 1 rounds. Then, to operationally obtain $R_{m-1}$ from $R_m$, the co-strategy first ignores the last incoming message in register $\mathcal{Y}_m$. This is formalized via a partial trace over $\mathcal{Y}_m$, which (once pushed through the CJ formalism) translates into the $\otimes I_{\mathcal{Y}_k}$ term in Equation (9). Since the co-strategy is now ignoring the last incoming message $\mathcal{Y}_m$, any measurement it makes after m − 1 rounds is independent of the last outgoing message $\mathcal{X}_m$. Thus, we can trace out $\mathcal{X}_m$ as well, obtaining a co-strategy $R_{m-1}$ on just the first m − 1 rounds; this is captured by Equation (10).
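Unrolling the recursion in conditions (1)-(3) gives a program of roughly the following shape. This is a sketch reconstructed from the prose and from the standard statement of Corollary 7 in [GW07], not a verbatim copy of Equations (7)-(13); in particular, the base case $\mathrm{Tr}(P_1) = 1$ and the index ranges are assumptions taken from that standard statement.

```latex
\begin{align*}
\min\ & p \\
\text{s.t.}\ & Q_a \preceq p\,R_m, \\
& R_k = P_k \otimes I_{\mathcal{Y}_k}, && 1 \le k \le m, \\
& \mathrm{Tr}_{\mathcal{X}_k}(P_k) = R_{k-1}, && 2 \le k \le m, \\
& \mathrm{Tr}(P_1) = 1, \\
& R_k \succeq 0,\ P_k \succeq 0, && 1 \le k \le m.
\end{align*}
```

Each pair of constraints on $(R_k, P_k)$ peels off one round: the tensor factor $I_{\mathcal{Y}_k}$ discards the last incoming message, and the partial trace over $\mathcal{X}_k$ discards the last outgoing one.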

Feasibility of Quantum OTMs using Stateless Hardware
In this section, we present a quantum construction for one-time memories by using stateless hardware (Section 3.1). We also state our main theorem (Theorem 3.1). In Section 3.3, we describe the Simulator and prove Theorem 3.1 using the technical results of Appendix C. The intuition and techniques behind the proofs in Appendix C are sketched in Section 3.4.

Construction
We now present the OTM protocol Π in the F wrap hybrid model, between a sender P s and a receiver P r .
Here the security parameter is n.
• The sender sends $|x\rangle_\theta$ to the receiver.
• The sender sends (create, M ) to functionality F wrap , and the functionality sends create to notify the receiver.

Program 1 Program for hardware token
Hardcoded values: $s_0, s_1 \in \{0, 1\}$, $x \in \{0, 1\}^n$, and $\theta \in \{+, \times\}^n$.
Inputs: $y \in \{0, 1\}^n$ and $b \in \{0, 1\}$, where y is a claimed measured value for the quantum register, and b is the evaluator's choice bit.
1. If b = 0, check that the θ = + positions return the correct bits in y according to x. If Accept, output $s_0$. Otherwise output ⊥.
2. If b = 1, check that the θ = × positions return the correct bits in y according to x. If Accept, output $s_1$. Otherwise output ⊥.
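The receiver $P_r$ operates as follows: Upon input b from the environment, $|x\rangle_\theta$ from the sender, and the create notification from $F_{\mathrm{wrap}}$,
• If b = 0, measure $|x\rangle_\theta$ in the computational basis to get string y. Input (run, (y, b)) into $F_{\mathrm{wrap}}$.
• If b = 1, apply $H^{\otimes n}$ to $|x\rangle_\theta$, then measure in the computational basis to get string y. Input (run, (y, b)) into $F_{\mathrm{wrap}}$.
Return the output of $F_{\mathrm{wrap}}$ to the environment. It is easy to see that the output of $F_{\mathrm{wrap}}$ is $s_b$ for both b = 0 and b = 1: measuring every qubit in the basis matching b reproduces the bits of x at exactly the positions that Program 1 checks.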
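The honest execution can be traced with a purely classical simulation, a minimal sketch assuming that measuring a conjugate-coding qubit in its preparation basis returns the encoded bit and measuring it in the other basis returns a fair coin. The function names, the use of `None` for ⊥, the character `'x'` for the × basis, and the example parameters are all illustrative choices.

```python
import random


def make_token(s0, s1, x, theta):
    """Program 1 as a pure function of (y, b); returns None in place of ⊥."""
    def token(y, b):
        basis = '+' if b == 0 else 'x'
        accept = all(y[i] == x[i] for i in range(len(x)) if theta[i] == basis)
        if not accept:
            return None
        return s0 if b == 0 else s1
    return token


def measure_all(x, theta, basis, rng):
    """Classical stand-in for measuring every qubit of |x>_theta in `basis`:
    matching positions return x[i], mismatched positions return a fair coin."""
    return [x[i] if theta[i] == basis else rng.randrange(2) for i in range(len(x))]


def honest_receiver(token, x, theta, b, rng):
    # x and theta are passed only to simulate the quantum state; the real
    # receiver never sees them. b = 1 corresponds to applying H to each
    # qubit and then measuring in the computational basis.
    basis = '+' if b == 0 else 'x'
    return token(measure_all(x, theta, basis, rng), b)


rng = random.Random(1)
n = 16
x = [rng.randrange(2) for _ in range(n)]
theta = [rng.choice('+x') for _ in range(n)]
token = make_token(0, 1, x, theta)
```

Calling `honest_receiver(token, x, theta, b, rng)` returns the secret for the chosen b, since the checked positions are exactly those measured in their preparation basis. The sketch does not capture general quantum attacks, which is what the security analysis addresses.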
Note again that the hardware token, as defined in Program 1, accepts only classical input (i.e., it cannot be queried in superposition). As mentioned earlier, relaxing this assumption yields impossibility of a secure OTM implementation (assuming the receiver also has access to the token's inverse operation), as shown in Section 4.

Stand-Alone Security Against a Malicious Sender
We note that in protocol Π of Section 3.1, once the sender prepares and sends the token, she is no longer involved (and in particular, the sender does not receive any further communication from the receiver). We call such a protocol a one-way protocol. Because of this simple structure, and because the ideal functionality F wrap also does not return any message to the sender, we can easily establish stand-alone security against a malicious sender (see details in Appendix B).

UC-Security against a corrupt receiver
Our main theorem, which establishes security against a corrupt receiver, is now stated as follows.
Theorem 3.1. Construction Π above quantum-UC-realizes $F_{\mathrm{OTM}}$ in the $F_{\mathrm{wrap}}$ hybrid model with statistical security against an actively-corrupted receiver making at most cn adaptive queries to the token, for any fixed constant c < 0.114.
To prove Theorem 3.1, we must construct and analyze an appropriate simulator, which we now do.

The simulator
In order to prove Theorem 3.1, for an adversary A that corrupts the receiver, we build a simulator S (having access to the OTM functionality $F_{\mathrm{OTM}}$), such that for any unbounded environment Z, the executions in the real model and in the simulation are statistically indistinguishable. Our simulator S is given below.
The simulator emulates an internal copy of the adversary A who corrupts the receiver. The simulator emulates the communication between A and the external environment Z by forwarding the communication messages between A and Z.
The simulator S needs to emulate the whole view for the adversary A. First, the simulator picks dummy inputs $\tilde{s}_0 = 0$ and $\tilde{s}_1 = 0$, randomly chooses $x \in \{0, 1\}^n$ and $\theta \in \{+, \times\}^n$, and generates program $\tilde{M}$. Then the simulator plays the role of the sender to send $|x\rangle_\theta$ to the adversary A (who controls the corrupted receiver). The simulator also emulates $F_{\mathrm{wrap}}$ to notify A by sending create to indicate that the hardware is ready for queries.
For each query (run, (y, b)) to $F_{\mathrm{wrap}}$ from the adversary A, the simulator evaluates program $\tilde{M}$ (that is created based on $\tilde{s}_0$, $\tilde{s}_1$, x, θ) as in the construction, and then acts as follows:
1. If this is a rejecting input, output ⊥.
2. If this is the first accepting input, call the external $F_{\mathrm{OTM}}$ with input b, learn the output $s_b$, and output $s_b$.
3. If this is a subsequent accepting input, output $s_b$ (as above).
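The simulator's query handling can be sketched classically as follows. The classes `OneTimeMemory`, `Simulator`, and `SimulationFailure` are hypothetical names; raising an exception when accepting inputs arrive for both choice bits is an illustrative way to flag the situation the simulator cannot handle, not behaviour specified by the construction.

```python
class OneTimeMemory:
    """Ideal OTM: answers exactly one Execute query, then forgets everything."""

    def __init__(self, s0, s1):
        self._secrets = (s0, s1)

    def execute(self, b):
        if self._secrets is None:
            raise RuntimeError("OTM already used")
        out = self._secrets[b]
        self._secrets = None  # delete any trace of this instance
        return out


class SimulationFailure(Exception):
    """Flags accepting inputs for both choice bits (illustrative assumption)."""


class Simulator:
    def __init__(self, otm, x, theta):
        self._otm, self._x, self._theta = otm, x, theta
        self._learned = None  # (b, s_b) once the first accepting query arrives

    def _accepts(self, y, b):
        # Accept/reject depends only on x and theta, so dummy secrets suffice here.
        basis = '+' if b == 0 else 'x'
        return all(y[i] == self._x[i]
                   for i in range(len(self._x)) if self._theta[i] == basis)

    def query(self, y, b):
        if not self._accepts(y, b):
            return None                                   # step 1: rejecting input
        if self._learned is None:
            self._learned = (b, self._otm.execute(b))     # step 2: first accepting input
        if self._learned[0] != b:
            raise SimulationFailure("accepting inputs for both choice bits")
        return self._learned[1]                           # step 3: repeat the learned s_b
```

The single call to `execute` mirrors the fact that the simulator can retrieve only one of the two secrets from the ideal functionality.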

Analysis
We now show that the simulation and the real model execution are statistically indistinguishable. There are two cases in an execution of the simulation which we must consider:
Case 1: In all its queries to $F_{\mathrm{wrap}}$, the accepting inputs of A have the same choice bit b. In this case, the simulation is perfectly indistinguishable.
Case 2: In its queries to $F_{\mathrm{wrap}}$, A produces accepting inputs for both b = 0 and b = 1. In this case, it is possible that the simulation fails (the environment can distinguish the real model from the ideal model), since the simulator is only able to retrieve a single bit from the external OTM functionality $F_{\mathrm{OTM}}$ (either corresponding to b = 0 or b = 1).
Thus, whereas in Case 1 the simulator behaves perfectly, in Case 2 it is in trouble. Fortunately, in Theorem 3.2 we show that the probability that Case 2 occurs is exponentially small in n, the number of qubits comprising $|x\rangle_\theta$, provided the number of queries to the token is at most cn for any c < 0.114. Specifically, we show that for an arbitrary m-query strategy (i.e., any quantum strategy allowed by quantum mechanics, whether efficiently implementable or not, which queries the token at most m times), the probability of Case 2 occurring is at most $O(2^{2m - 0.228n})$. This concludes the proof.
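The threshold c < 0.114 can be read off directly from the bound by substituting m = cn. The numerical instance below uses example values (c = 0.1, n = 1000) and ignores the constant hidden in the O(·) notation.

```latex
\[
2^{2m - 0.228n}\Big|_{m = cn} = 2^{-(0.228 - 2c)\,n},
\qquad 0.228 - 2c > 0 \iff c < 0.114 .
\]
\[
c = 0.1,\ n = 1000:\qquad 2^{200 - 228} = 2^{-28} \approx 3.7 \times 10^{-9}.
\]
```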

Security analysis for the token: Intuition
Our simulation proof showing statistical security of our Quantum OTM construction of Section 3.1 relies crucially on Theorem 3.2, stated below. As the proof of this theorem uses quantum information theoretic and semidefinite programming techniques (as opposed to cryptographic techniques), let us introduce notation in line with the formal analysis of Appendix C.
With respect to the construction of Section 3.1, let us replace each two-tuple $(x, \theta) \in \{0, 1\}^n \times \{+, \times\}^n$ by a single string $z \in \{0, 1\}^{2n}$, which we denote the secret key. Bits $2i$ and $2i + 1$ of z specify the basis and value of conjugate coding qubit i for $i \in \{1, \ldots, n\}$ (i.e., $z_{2i} = \theta_i$ and $z_{2i+1} = x_i$). Also, rename the "quantum key" (or conjugate coding key) $|\psi_z\rangle := |x\rangle_\theta \in (\mathbb{C}^2)^{\otimes n}$. Thus, the protocol begins by having the sender pick a secret key $z \in \{0, 1\}^{2n}$ uniformly at random, and preparing a joint state [...] The first register, R, is sent to the receiver, while the second register, T, is kept by the token. (Thus, the token knows the secret key z, and hence also which $|\psi_z\rangle$ the receiver possesses.) The mixed state describing the receiver's state of knowledge at this point is given by [...]

Theorem 3.2. Given a single copy of $\rho_R$, and the ability to make m (adaptive) queries to the hardware token, the probability that an unbounded quantum adversary can force the token to output both bits $s_0$ and $s_1$ scales as $O(2^{2m - 0.228n})$.
Thus, the probability of an unbounded adversary (i.e., with the ability to apply the most general maps allowed in quantum mechanics, trace-preserving completely positive (TPCP) maps, which are not necessarily efficiently implementable) to successfully cheat using m = cn for c < 0.114 queries is exponentially small in the quantum key size, n. The proof of Theorem 3.2 is in Appendix C. Its intuition can be sketched as follows.
Proof intuition. The challenge in analyzing security of the protocol is the fact that the receiver (a.k.a. the user) is not only given adaptive query access to the token, but also a copy of the quantum "resource state" $\rho_R$, which it may arbitrarily tamper with (in any manner allowed by quantum mechanics) while making queries. Luckily, the GW framework [GW07] (Section 2.1) is general enough to model such "queries with quantum side information". The framework outputs an SDP, Γ (Equation (17)), the optimal value of which will encode the optimal cheating probability for a cheating user of our protocol. Giving a feasible solution for Γ will hence suffice to upper bound this cheating probability, yielding Theorem 3.2.
Coherently modeling quantum queries to the token. To model the interaction between the token and user, we first recall that all queries to the token must be classical by assumption. To model this process coherently in the GW framework, we hence imagine (solely for the purposes of the security analysis) that the token behaves as follows:
1. It first sends state $\rho_R$ to the user.
2. When it receives as $i$th query a quantum state $\rho_i$ from the user, it sends response string $r_i$ to the user, and "copies" $\rho_i$ via transversal CNOT gates to a private memory register $W_i$, along with $r_i$. It does not access $\rho_i$ again throughout the protocol, and only accesses $r_i$ again in Step 3. For clarity, the token runs a classical circuit, and in the formal setup of Appendix C (see Remark C.2), the token conditions each response $r_i$ solely on the current incoming message, $\rho_i$.
3. After all rounds of communication, the token "measures" its stored responses $(r_1, \ldots, r_m)$ in the Z-basis to decide whether to accept (user successfully cheated) or reject (user failed to cheat).
The "copying" phase of Step 2 accomplishes two tasks: First, since the token will never read the "copies" of ρ i again, the principle of deferred measurement [NC00] implies the transversal CNOT gates effectively simulate measuring ρ i in the standard basis. In other words, without loss of generality the user is reduced to feeding a classical string y to the token. Second, we would like the entire security analysis to be done in a unified fashion in a single framework, the GW framework. To this end, we want the token itself to "decide" at the end of the protocol whether the user has successfully cheated (i.e. extracted both secret bits). Storing all responses r i in Step 2 allows us to simulate such a final measurement in Step 3. We reiterate that, crucially, once the token "copies" ρ i and r i to W i , it (1) never accesses (i.e. reads or writes to) ρ i again and (2) only accesses r i again in the final standard basis measurement of Step 3.
Together, these ensure all responses r i are independent, as required for a stateless token. A more formal justification is in Remark C.2 of Appendix C.
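To make the deferred-measurement step concrete, the following minimal Python sketch (an illustration only, not part of the formal analysis) takes a single query qubit, CNOT-copies it into a fresh register that is never read again, and checks that the resulting state of the query qubit equals the state left by a standard-basis measurement whose outcome is forgotten. All function names are hypothetical and chosen for this example.

```python
import math

def cnot_copy(state):
    """Joint amplitudes of (query, copy) after a CNOT from the query qubit
    onto a copy register initialised to |0>. Index = 2*query + copy."""
    joint = [0j] * 4
    for q, amp in enumerate(state):
        joint[2 * q + q] = amp  # the copy register ends up equal to q
    return joint

def reduced_query_state(joint):
    """Trace out the copy register (it is never read again)."""
    return [[sum(joint[2 * q + a] * joint[2 * qp + a].conjugate()
                 for a in range(2))
             for qp in range(2)]
            for q in range(2)]

def measured_state(state):
    """Density matrix after a standard-basis measurement, outcome forgotten."""
    return [[abs(state[q]) ** 2 if q == qp else 0.0 for qp in range(2)]
            for q in range(2)]

# A query in superposition: |+> = (|0> + |1>)/sqrt(2).
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
rho_copy = reduced_query_state(cnot_copy(plus))
rho_meas = measured_state(plus)

# The coherences (off-diagonal entries) vanish in both cases.
same = all(abs(rho_copy[i][j] - rho_meas[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print(same)
```

In this toy case the superposition query is indistinguishable, from the user's side, from a classical mixture of the queries 0 and 1, which is the sense in which the user is reduced to sending a classical string.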
Formalization in GW framework. To place the discussion thus far into the formal GW framework, we return to Figure 1. The bottom "row" of Figure 1 depicts the token's actions, and the top row the user's actions. As outlined above, the protocol begins by imagining the token sends initial state $\rho_0 = \rho_R$ to the user via register $X_1$. The user then applies an arbitrary sequence of TPCP maps $\Phi_i$ to its private memory (modeled by register $Z_i$ in round $i$), each time sending a query $\tilde y_i$ (which is, as discussed above, a classical string without loss of generality) to the token via register $Y_i$. Given any such query $\tilde y_i$ in round $i$, the token applies its own TPCP map $\Psi_i$ to determine how to respond to the query. In our protocol, the $\Psi_i$ correspond to coherently applying a classical circuit, i.e. a sequence of unitary gates mapping the standard basis to itself. Specifically, their action is fully determined by Program 1, and in principle all $\Psi_i$ are identical since the token is stateless (i.e., the action of the token in round $i$ is unaffected by previous rounds $\{1, \ldots, i-1\}$). (We say "in principle" because, as recalled above, in the security analysis we model each $\Psi_i$ as classically copying $(\tilde y_i, r_i)$ to a distinct private register $W_i$.)

Finally, after receiving the $m$th query $\tilde y_m$ in register $Y_m$, we imagine the token makes a measurement (not depicted in Fig. 1) based on the query responses $(r_1, \ldots, r_m)$ it returned; if the user managed to extract both $s_0$ and $s_1$ via queries, then the token "accepts"; otherwise it "rejects". (Again, we are using the fact that in our security analysis, the token keeps a history of all its responses $r_i$, solely for the sake of this final measurement.)

With this high-level setup in place, the output of the GW framework is a semidefinite program, denoted Γ (Equation (17); see Appendix C for further details). In Γ, $Q_1$ encodes the actions of the token, i.e. the co-strategy in the bottom row of Figure 1.
The variable p denotes an upper bound on the optimal cheating probability (i.e., the probability with which both s 0 and s 1 are extracted), subject to linear constraints (Equations (19)-(23)) which enforce that operator R m+1 encodes a valid co-strategy (see Section 2.1). Theorem 9 of [GW07] now says that the minimum p above encodes precisely the optimal cheating probability for a user which is constrained only by the laws of quantum mechanics. Since Γ is a minimization problem, to upper bound the cheating probability it hence suffices to give a feasible solution (p, R 1 , . . . , R m+1 , P 1 , . . . , P m+1 ) for Γ, which will be our approach.
Intuition for $Q_1$ and an upper bound on $p$. It remains to give intuition as to how one derives $Q_1$ in Γ, and how an upper bound on the optimal $p$ is obtained. Without loss of generality, one may assume that each of the token's TPCP maps $\Psi_i$ is given by an isometry $A_i$. (We omit the first isometry, which prepares state $\rho_0$, in our discussion here for simplicity.) Let us denote their sequential application by a single operator $A := A_m \cdots A_1$ (note: to make the product well-defined, in Equation (3) of Appendix C, one uses tensor products with identity matrices appropriately). Then, the Choi-Jamiołkowski representation of $A$ is given by [GW07] (see Section 2.1)

$$\mathrm{Tr}_{Z_m}\left(\mathrm{vec}(A)\,\mathrm{vec}(A)^\dagger\right),$$

where we trace out the token's private memory register $Z_m$. (The operator vec(·) reshapes matrix $A$ into a vector; its precise definition is given in Section 2.1.) However, since in our security analysis we imagine the token also makes a final measurement via some POVM $\Lambda = \{\Lambda_0, \Lambda_1\}$, where the token "accepts" upon outcome $\Lambda_1$ and rejects upon outcome $\Lambda_0$, we require a slightly more complicated setup. Letting $B_1 := \Lambda_1 A$, we define $Q_1$ as [GW07]

$$Q_1 = \mathrm{Tr}_{Z_m}\left(\mathrm{vec}(B_1)\,\mathrm{vec}(B_1)^\dagger\right).$$
The full derivation of $Q_1$ in our setting takes a few steps (App. C). Here, we describe a slightly simplified version of $Q_1$ for exposition. Recall each string $r_i$ denotes the response of the token given the $i$th query $\tilde y_i$ from the user; hence, the corresponding projectors in $Q_1$ act on spaces $X_2$ through $X_{m+1}$. We say $r$ is "successful" if it encodes the user successfully extracting both secret bits from the token. Each string $\tilde y_i \in \{0,1\}^{n+1}$ denotes the $i$th query sent from the user to the token, where $\tilde y_i = b_i \circ y_i$ in the notation of Program 1, i.e. $b_i \in \{0,1\}$ is the choice bit for query $i$. Each such message is passed via register $Y_i$. The states $|\psi_z\rangle$ and strings $z$ are defined as in the beginning of Section 3.4; recall $z \in \{0,1\}^{2n}$ and $|\psi_z\rangle \in (\mathbb{C}^2)^{\otimes n}$ denote the secret key and corresponding quantum key, respectively. The inner summation in $Q_1$ is over all messages $\tilde y$ and keys $z$ such that the token correctly returns response $r_i$ given both $\tilde y_i$ and $z$.

Upper bounding $p$. To now upper bound $p$, we give a feasible solution $R_{m+1}$ satisfying the constraints of Γ. Note that giving even a solution which attains $p = 1$ for all $n$ and $m$ is non-trivial; such a solution is given in Lemma C.3 of Appendix C.1. Here, we give a solution which attains $p \in O(2^{2m-0.228n})$, as claimed in Theorem 3.2 (and formally proven in Theorem C.5 of Appendix C.1). Namely, we set

$$R_{m+1} = \frac{1}{N 2^n}\; I_{Y_1 \otimes \cdots \otimes Y_m} \otimes I_{X_1} \otimes \sum_{\text{successful } r} |r\rangle\langle r|_{X_2 \otimes \cdots \otimes X_{m+1}},$$

where intuitively $N$ is the total number of strings $r$ corresponding to successful cheating, and recall $n$ is the key size. This satisfies constraint (19) of Γ due to the identity term $I_{Y_1 \otimes \cdots \otimes Y_m}$. The renormalization factor of $(N 2^n)^{-1}$ above ensures that tracing out all $X_i$ registers yields $R_0 = 1$ in constraint (21) of Γ. We are thus reduced to choosing the minimum $p$ such that constraint (18) is satisfied. Note that setting $p = 1$ will not work for large enough $m$ for this choice of $R_{m+1}$.
To see why, observe we have chosen $R_{m+1}$ to align with the block-diagonal structure of $Q_1$ on registers $X_2, \ldots, X_m$. Since registers $Y_1 \otimes \cdots \otimes Y_m$ and $X_1$ of $R_{m+1}$ are proportional to the identity matrix, it thus suffices to characterize the largest eigenvalue of $Q_1$, $\lambda_{\max}(Q_1)$. This is done by Lemma C.4 of Appendix C.1. Combining its bound on $\lambda_{\max}(Q_1)$ with the parameters of $R_{m+1}$ above now yields the desired claim that $p \in O(2^{2m-0.228n})$. For (say) $m \geq n$ this bound is vacuous, and thus does not suffice to show even the trivial bound $p \leq 1$ for all $m$, as stated. (See Lemma C.3 for a feasible solution attaining $p \leq 1$ for all $m$.) However, for $m < 0.114n$ queries, the bound is fruitful: the probability that a user of the token successfully cheats, and thus that the simulation fails, is exponentially small in the key size, $n$.
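As a quick numerical illustration of how the bound $O(2^{2m-0.228n})$ behaves, the following Python sketch evaluates the exponent $2m - 0.228n$ and the largest number of queries for which it is still negative. The function names are hypothetical, and the constant hidden in the $O(\cdot)$ is ignored, so the numbers indicate only the scaling, not an actual cheating probability.

```python
def bound_exponent(n, m):
    """Exponent e such that the bound of Theorem 3.2 reads O(2**e)."""
    return 2 * m - 0.228 * n

def max_queries(n):
    """Largest integer m with 2m - 0.228n < 0, i.e. m < 0.114n.
    Uses exact integer arithmetic: 2m < 0.228n  <=>  500m < 57n."""
    return (57 * n - 1) // 500

for n in (100, 1000, 10000):
    m = max_queries(n)
    print(n, m, bound_exponent(n, m))
```

For example, with $n = 1000$ and $m = 100$ queries the exponent is about $-28$, whereas at $m = 114$ it reaches zero and the bound becomes vacuous.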
Simplifications of the GW SDP, the derivation of its dual SDP, and a conjectured approximately optimal dual feasible solution are given in Appendix D.

Impossibility Results
We now discuss "tightness" of our protocol with respect to impossibility results. To begin, it is easy to argue that OTMs cannot exist in the plain model (i.e., without additional assumptions) in either the classical or the quantum setting: in the classical setting, impossibility holds since software can always be copied. Quantumly, this follows by a simple rewinding argument [BGS13]. Here, we give two simple no-go results for the quantum setting which support the idea that our scheme is "tight" in terms of the minimality of the assumptions it uses. Both results assume the token is reversible, meaning the receiver can run both the token and its inverse operation. The results can be stated as:
1. A stateless token which can be queried in superposition cannot be used to securely construct an OTM (Section 4.1).
2. For measure-and-access schemes such as ours, in order for a stateless token to allow statistical security, it must have an exponential number of keys per secret bit (Section 4.2).
Note that if, on the other hand, the receiver is not given access to the token's inverse operation, it is unlikely for a straightforward adaption of our no-go techniques to go through. This is because, in the most general case where the token is an arbitrary unitary U , which the receiver may apply as a black box, simulating U −1 = U † appears difficult. For example, Theorem 3 of Quintino, Dong, Shimbo, Soeda, and Murao [Qui+19] shows that any exact implementation of U † (even with an adaptive protocol) which (1) succeeds with probability p > 0 and (2) where p is independent of the choice of U , requires k ≥ d − 1 uses of U . In our setting, d is exponential in the number of qubits, and thus so is k. Indeed, inverting arbitrary U would entail, as a special case, inverting arbitrary classical permutations, which appears difficult. For example, Fefferman and Kimmel [FK18] use precisely this idea (i.e. an in-place permutation oracle, to which one does not have access to the inverse) to prove an oracle separation between two quantum generalizations of NP, Quantum-Classical Merlin Arthur and Quantum Merlin Arthur. We stress, however, that the works of [Qui+19; FK18] are for rather general unitaries U , whereas here we have a very specific choice of U (i.e. the token's implementation). For such a specialized U , it remains possible that a no-go theorem could still hold, even without black-box access to U † .

Impossibility: Tokens which can be queried in superposition
In our construction, we require that all queries to the token be classical strings, i.e., no querying in superposition is allowed. It is easy to argue via a standard rewinding argument that relaxing this requirement yields impossibility of a secure OTM, as long as access to the token's adjoint (inverse) operation is given, as we now show. Specifically, let $M$ be a quantum OTM implemented using a hardware token. Since the token access is assumed to be reversible, we may model it as an oracle $O_f$ realizing a function $f : \{0,1\}^n \to \{0,1\}^m$ in the standard way, i.e., for all $y \in \{0,1\}^n$ and $b \in \{0,1\}^m$, $O_f |y\rangle|b\rangle = |y\rangle|b \oplus f(y)\rangle$. Now, suppose our OTM stores two secret bits $s_0$ and $s_1$, and provides the receiver with an initial state $|\psi\rangle \in A \otimes B \otimes C$, where $A$, $B$, and $C$ are the algorithm's workspace, query (i.e., input to $O_f$), and answer (i.e., $O_f$'s answers) registers, respectively. By definition, an honest receiver must be able to access precisely one of $s_0$ or $s_1$ with certainty, given $|\psi\rangle$. Thus, for any $i \in \{0,1\}$, there exists a quantum query algorithm $A_i$ such that $A_i |\psi\rangle_{ABC} = |\psi'\rangle_{AB} |s_i\rangle_C$. For any choice of $i$, however, this implies a malicious receiver can now classically copy $s_i$ to an external register, and then "rewind" by applying $A_i^\dagger$ to $|\psi'\rangle_{AB} |s_i\rangle_C$ to recover $|\psi\rangle$. Applying $A_{i'}$ for $i' \neq i$ to $|\psi\rangle$ now yields the second bit $s_{i'}$ with certainty as well. We conclude that a quantum OTM which allows superposition queries to a reversible stateless token is insecure.
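The rewinding argument above can be traced on a toy instance. The Python sketch below is an illustration under simplifying assumptions, not the construction of the paper: it uses a two-qubit conjugate coding key (first qubit in the standard basis, second in the Hadamard basis), a hypothetical token rule in which choice bit $b$ checks only key position $b$, and a two-bit answer register. All names (`token`, `run_A`, `undo_A`, `rewinding_attack`) are invented for this example.

```python
import math
from collections import defaultdict

S = (1, 0)  # hypothetical secret bits (s0, s1)
X = (0, 1)  # hypothetical key values: qubit 0 in Z basis, qubit 1 in X basis

def token(b, y):
    """Toy stateless token: choice bit b checks key position b only.
    Returns 2 + s_b on success and 0 on rejection (a 2-bit answer)."""
    return 2 + S[b] if y[b] == X[b] else 0

def hadamard(state, qubit):
    """Apply H to one key qubit; state maps (y0, y1, ans) -> amplitude."""
    out = defaultdict(float)
    for (y0, y1, ans), amp in state.items():
        y = [y0, y1]
        bit = y[qubit]
        for new in (0, 1):
            y[qubit] = new
            sign = -1.0 if (bit and new) else 1.0
            out[(y[0], y[1], ans)] += sign * amp / math.sqrt(2)
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def query(state, b):
    """Superposition query |y>|ans> -> |y>|ans XOR f(b, y)>; self-inverse."""
    return {(y0, y1, ans ^ token(b, (y0, y1))): amp
            for (y0, y1, ans), amp in state.items()}

def run_A(state, i):
    """Honest algorithm A_i: rotate to basis i, then query with choice bit i."""
    if i == 1:
        state = hadamard(hadamard(state, 0), 1)
    return query(state, i)

def undo_A(state, i):
    """Inverse of A_i (the oracle and H are each their own inverse)."""
    state = query(state, i)
    if i == 1:
        state = hadamard(hadamard(state, 0), 1)
    return state

def read_answer(state):
    """Answer register value if identical in every branch, else None."""
    values = {ans for (_, _, ans) in state}
    return values.pop() if len(values) == 1 else None

def rewinding_attack():
    psi = hadamard({(X[0], X[1], 0): 1.0}, 1)   # quantum key |x0> (H|x1>)
    after = run_A(psi, 0)
    s0 = read_answer(after) & 1                 # classically copy s0
    psi_again = undo_A(after, 0)                # rewind to recover |psi>
    s1 = read_answer(run_A(psi_again, 1)) & 1   # now extract s1 as well
    return s0, s1

print(rewinding_attack())
```

In this toy instance the superposition query for $s_0$ never collapses the Hadamard-basis qubit, so after uncomputing the answer the receiver still holds the intact key and can obtain $s_1$ with certainty; with classical queries the second qubit would have been measured first.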

Remark 4.1. Above, we assumed the OTM outputs $s_i$ with certainty. The argument generalizes to OTMs that output $s_i$ with probability at least $1 - \epsilon$ for small $\epsilon > 0$; for this, the Gentle Measurement Lemma [Win99] can be used to show that both bits can be recovered with non-negligible probability.

Remark 4.2. Our argument crucially relies on the fact that the receiver has superposition access to the $A_i^\dagger$ operation. In certain models (e.g., software), such access is unavoidable. However, we do not rule out the possibility that non-reversible superposition access to a token would allow for quantum OTMs.

Impossibility: Tokens with a bounded number of keys
We observed superposition queries to the token prevent an OTM from being secure. One can also ask how simple a hardware token with classical queries can be, while still allowing a secure OTM. Below, we consider such a strengthening in which the token is forced to have a bounded number of keys.
To formalize this, we define the notion of a "measure-and-access (MA)" OTM, i.e., an OTM in which, given an initial state $|\psi\rangle$, an honest receiver applies a prescribed measurement to $|\psi\rangle$ and feeds the resulting classical string (i.e., key) $y$ into the token $O_f$ to obtain $s_i$. Our construction is an example of an MA memory in which each bit $s_i$ has an exponential number of valid keys $y$ such that $f(y) = s_i$. Can the construction be strengthened such that each $s_i$ has a bounded number (e.g., a polynomial number) of keys? We now show that such a strengthening would preclude security, assuming the token is reversible.
For clarity, implicitly in our proof below, we model the oracle $O_f$ as having three possible outputs: 0, 1, or 2, where 2 is output whenever $O_f$ is fed an invalid key $y$. This is required for the notion of having "few" keys to make sense (i.e., there are $2^n$ candidate keys, and only two secret bits, each of which is supposed to have a bounded number of keys). Note that our construction indeed fits into this framework.

We conclude that if a secret bit $b_i$ has (say) at most polynomially many keys, then any measure-and-access OTM can be broken with at least inverse polynomial probability. The proof is given in Appendix E. In this sense, at least in the paradigm of measure-and-access memories, our construction is essentially tight: in order to bound the adversary's success probability of obtaining both secret bits by an inverse exponential, we require each secret bit to have exponentially many valid keys. Note that, as in the setting of superposition queries, the above proof can be generalized to the setting in which the OTM returns the correct bit $s_i$ with probability at least $1 - \epsilon$ for small $\epsilon > 0$. Finally, the question of whether a similar statement to Lemma 4.3 holds for a non-reversible token remains open.

A.1 Classical UC Model ([Can01])
Machines. The basic entities involved in the UC model are players $P_1, \ldots, P_k$, where $k$ is polynomial in the security parameter $n$, an adversary A, and an environment Z. Each entity is modeled as an interactive Turing machine (ITM), where Z could have an additional non-uniform string as advice. Each $P_i$ has identity $i$ assigned to it, while A and Z have special identities $id_A := adv$ and $id_Z := env$.

Protocol Execution.
A protocol specifies the programs for each $P_i$, which we denote as $\pi = (\pi_1, \ldots, \pi_k)$. The execution of a protocol is coordinated by the environment Z. It starts by preparing inputs to all players, who then run their respective programs on the inputs and exchange messages of the form $(id_{sender}, id_{receiver}, msg)$. A can corrupt an arbitrary set of players and control them later on. In particular, A can instruct a corrupted player to send messages to another player, and can also read messages that are sent to the corrupted players. During the course of execution, the environment Z also interacts with A in an arbitrary way. In the end, Z receives outputs from all the other players and generates a one-bit output. We use EXEC[Z, A, π] to denote the distribution of the environment Z's (single-bit) output when executing protocol π with A and the $P_i$'s.

Ideal Functionality and Dummy Protocol.
An ideal functionality F is a trusted party, again modeled by an ITM, that perfectly implements the desired multi-party computational task. We consider a "dummy protocol", denoted $P_F$, where each party has direct communication with F, which accomplishes the desired task according to the messages received from the players. The execution of $P_F$ with environment Z and an adversary, usually called the simulator S, is defined analogously to the above; in particular, S monitors the communication between corrupted parties and the ideal functionality F. Similarly, we denote Z's output distribution as EXEC[Z, S, $P_F$].

EXEC[Z, A, π] ≈ EXEC[Z, S, π ] (30)
Here we take A and Z to be computationally unbounded, and call the resulting notion statistical UC-security. We require that the running time of S be polynomial in that of A; we call this property Polynomial Simulation.
Let F be a well-formed two-party functionality. We say π (classically) UC-realizes F if for every adversary A, there exists a simulator S such that for all environments Z, EXEC[Z, A, π] ≈ EXEC[Z, S, $P_F$]. We also write EXEC[Z, A, π] ≈ EXEC[Z, S, F] if the context is clear.
UC-secure protocols admit a general composition property, demonstrated in the following universal composition theorem.

A.2 Quantum UC Model ([Unr10])
Now, we give a high-level description of the quantum UC model by Unruh [Unr10].

Quantum Machine.
In the quantum UC model, all players are modeled as quantum machines. A quantum machine is a sequence of quantum circuits $\{M_n\}_{n \in \mathbb{N}}$, one for each security parameter $n$. $M_n$ is a completely positive trace preserving operator on the space $H_{state} \otimes H_{class} \otimes H_{quant}$, where $H_{state}$ represents the internal workspace of $M_n$, and $H_{class}$ and $H_{quant}$ represent the spaces for communication, where for convenience we divide the messages into classical and quantum parts. We allow non-uniform quantum advice to the machine of the environment Z, while all other machines are uniformly generated.

Protocol Execution. In contrast to the communication policy in the classical UC model, we consider a network N which contains the space $H_N := H_{class} \otimes H_{quant} \otimes \bigotimes_i H_{state_i}$. Namely, each machine maintains an individual internal state space, but the communication space is shared among all machines. We assume $H_{class}$ contains the message $(id_{sender}, id_{receiver}, msg)$, which specifies the sender and receiver of the current message, and the receiver then processes the quantum state on $H_{quant}$. Note that this communication model implicitly ensures authentication. In a protocol execution, Z is activated first, and at each round, one player applies the operation defined by its machine $M_n$ on $H_{class} \otimes H_{quant} \otimes H_{state}$. In the end, Z generates a one-bit output. We denote by EXEC[Z, A, Π] the output distribution of Z.
Ideal Functionality. All functionalities we consider in this work are classical, i.e., the inputs and outputs are classical, and its program can be implemented by an efficient classical Turing machine. Here in the quantum UC model, the ideal functionality F is still modeled as a quantum machine for consistency, but it only applies classical operations. Namely, it measures any input message in the computational basis to get a classical bit-string, and implements the operations specified by the classical computational task.
We consider a "dummy protocol", denoted $P_F$, where each party has direct communication with F, which accomplishes the desired task according to the messages received from the players. The execution of $P_F$ with environment Z and an adversary, usually called the simulator S, is defined analogously to the above; in particular, S monitors the communication between corrupted parties and the ideal functionality F.

B Stand-Alone Security in the case of a Malicious Sender
In order to define stand-alone security against a malicious sender (Definition B.2), in our context, we closely follow definitions given in prior work [DNS10], which we now recall. (Note that, instead of considering the approximate case for security, we are able to use the exact one.) 3. An n-tuple of quantum operations A 1 , . . . , A n and B 1 , . . . , B n can be written as

Memory spaces
If Π O = (A , B, O, n) is an n-turn two-party protocol, then the final state of the interaction upon input ρ in ∈ D(A 0 ⊗ B 0 ⊗ R) where R is a system of dimension dim A 0 dim B 0 , is: As in [DNS10], we specify that an oracle O can be a communication oracle or an ideal functionality oracle.
An adversary Ã for an honest party A in $\Pi^O = (A, B, O, n)$ is an $n$-tuple of quantum operations matching the input and output spaces of A. A simulator for Ã is a sequence of quantum operations where $S_i$ has the same input-output spaces as the maps of Ã at step $i$. In addition, S has access to the ideal functionality for the protocol Π.
We note that Definition B.2 is weaker than some other definitions for active security used in the literature, e.g., [DNS12], because we ask only that the local view of the adversary be simulated. Given the simple structure of our protocol and ideal functionality, the construction and proof of the simulator is straightforward as shown below. Theorem B.3. Protocol Π is statistically stand-alone secure against a corrupt sender.
Proof. Since Π consists in a single message from the sender to the receiver (together with a call to the ideal functionality for the token), we have that A = (A 1 ). Furthermore, since the ideal functionality F wrap does not return anything to the sender, there is no need for our simulator S to call an ideal functionality.
We thus build S that runs A on the input in register A 0 . When A calls the F wrap ideal functionality, the simulator does nothing. Since Π is a one-way protocol, and since the ideal functionality also does not allow communication from the receiver to the sender, This concludes the proof.

C Security Analysis for the Token
We now provide the technical result (Theorem 3.2) that is used to prove security of our Quantum OTM construction of Section 3.1 against a linear number of queries. The statement below is informal; as outlined in Section 3.4, to make it formal, in Section C.1 we model the user's interaction with the token via the Gutoski-Watrous (GW) framework for quantum games [GW07]. The resulting formal statement we desire, which immediately yields the informal claim below, is given in Theorem C.5. Thus, we are able to prove that if the user makes at most m = cn queries with c < 0.114, then the user's probability of cheating successfully is exponentially small in n.

C.1 Security against a linear number of token queries: Primal SDP
To show security of our hardware token implementation (Program 1) against a linear number of queries, we now model a user's interaction with the token as an interactive game between two parties using the GW framework of Section 2.1. As outlined in Section 3.4, we shall treat the token as the co-strategy and the user as the strategy. An overview of how all operators introduced below fit together is given in Figure 3, which may be periodically referred to as the reader progresses through this section.
Basics of our model. We proceed as follows. As depicted in Figure 1, the token (co-strategy) begins by preparing state $\rho_0 \in L(X_1 \otimes W_0)$, and sending message $X_1$ (which contains $\rho_R$ from Equation (15)) to the user. The user then makes $m$ queries, each via a distinct register $Y_i$ for $i \in \{1, \ldots, m\}$. For each query made, we model the token as returning two strings: (1) a symbol in the set $\Sigma = \{0, 1, \bar 0, \bar 1\}$, where 0 and 1 denote successful 0- and 1-queries, respectively, and $\bar 0$ and $\bar 1$ denote unsuccessful 0- and 1-queries, respectively, and (2) a bit $b$ which is set to 0 for a failed query, or secret bit $b_i$ for a successful $i$th query. Formally, the size of each register $X_i$ for $i \geq 2$ is hence three qubits. We will deviate from Figure 1 in one respect: we assume the token also returns the response to the final query, $m$, via a register $X_{m+1}$; this does not affect the success or failure of the user (as the latter makes no further queries at this point), but helps streamline the analysis. After this last response is sent out, the token measures the string $s \in \Sigma^m$ of responses it sent back to the user, and "accepts" if and only if $s$ contains at least one 0 and one 1. This will be spelled out formally below once we have defined the isometries $A_i$ for the protocol.

Before doing so, let us introduce the terminology used in this section for discussing the secret key held by the token. Namely, recall in Program 1 that the token keeps secret key data $x \in \{0,1\}^n$ and $\theta \in \{+, \times\}^n$. Here, we shall replace these by a single string $z \in \{0,1\}^{2n}$, such that bits $2i$ and $2i+1$ of $z$ specify the basis and value of conjugate coding qubit $i$, for $i \in \{1, \ldots, n\}$ (i.e. $z_{2i} = \theta_i$ and $z_{2i+1} = x_i$). We shall call $z$ the secret key. For consistency, we shall rename the quantum key $|x\rangle_\theta$ from Program 1 to $|\psi_z\rangle \in (\mathbb{C}^2)^{\otimes n}$, i.e. $|x\rangle_\theta = |\psi_z\rangle$. Next, in Program 1 the token takes inputs $b \in \{0,1\}$ and $y \in \{0,1\}^n$, for $b$ the choice bit and $y$ the claimed measured value.
Figure 2: Contents of the token's private memory register $W$; for example, one part holds $\tilde y_1$ and $r(\tilde y_1, z)$. Here, $r(\tilde y_k, z) \in \Sigma$ denotes whether the token accepted or rejected query string $\tilde y_k$ assuming secret key $z$. Note the secret key $z$ is passed along from round to round (otherwise, the token cannot correctly decide its response in a round $k$ given query string $\tilde y_k$).

In this section, we shall simply concatenate these as one string $\tilde y = b \circ y \in \{0,1\}^{n+1}$ (we henceforth write $\tilde y = by$ for brevity), the first bit of which is the choice bit. We shall refer to $\tilde y$ as a query string. With these definitions in hand, for each secret key $z \in \{0,1\}^{2n}$, we define a partition $A_0(z), A_1(z), A_{\bar 0}(z), A_{\bar 1}(z)$ of $\{0,1\}^{n+1}$, which correspond to the sets of query strings $\tilde y$ which cause the token to return response $0$, $1$, $\bar 0$, or $\bar 1$, respectively (where a bar marks an unsuccessful query).
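The partition of query strings can be enumerated directly for small $n$. The Python sketch below assumes a particular token rule, since Program 1 is not reproduced in this section: for choice bit $b$, the token compares the claimed values with $x$ only on the positions whose basis equals $b$ (0 standing for $+$ and 1 for $\times$). This rule, the labels `"0bar"` and `"1bar"` for unsuccessful queries, and the function names are assumptions made for illustration.

```python
from itertools import product

def response(query, theta, x):
    """Assumed token rule. query = (b, y_1, ..., y_n); theta and x are
    tuples of bits (theta[i] = 0 for '+', 1 for 'x'). The check for
    choice bit b only inspects positions i with theta[i] == b."""
    b, y = query[0], query[1:]
    ok = all(y[i] == x[i] for i in range(len(x)) if theta[i] == b)
    return str(b) if ok else f"{b}bar"

def partition(theta, x):
    """Split all 2**(n+1) query strings by the token's response."""
    cells = {"0": [], "1": [], "0bar": [], "1bar": []}
    for q in product((0, 1), repeat=len(x) + 1):
        cells[response(q, theta, x)].append(q)
    return cells

# Toy key with n = 3: two '+' positions and one 'x' position.
cells = partition(theta=(0, 0, 1), x=(1, 0, 1))
sizes = {label: len(qs) for label, qs in cells.items()}
print(sizes)
```

Under this assumed rule, a key with $k$ positions in the $+$ basis has $2^{n-k}$ successful 0-queries and $2^{k}$ successful 1-queries, so each secret bit has many valid keys whenever both bases appear on a constant fraction of positions.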
Defining the isometries $A_k$. Recall from Section 2.1 that the GW model begins by capturing the actions of a co-strategy as a sequence of linear isometries, $A_k$. To define these $A_k$, we first construct operators $\Delta_k(z) : Y_k \to X_{k+1} \otimes W_{k,k+1}$ (i.e. which map an incoming message in $Y_k$ to the token to an outgoing message in $X_{k+1}$ and private data to store in $W_{k,k+1}$) for $k \in \{1, \ldots, m\}$ (Equation (35)). The intuition is as follows. In round $k$, we model the token as (coherently) making the following classical computation: Upon input $|\tilde y\rangle_{Y_k}$ from the user (which consists of a choice bit $b$ and candidate key $y$), the token sends its response in $X_{k+1}$ to the user (the first symbol of which denotes accept/reject via a symbol from Σ, and the second symbol of which is the corresponding secret bit $s$, which is set to 0 by default for failed queries), and classically copies (i.e. via transversal CNOT gates) both the input $\tilde y$ and response from Σ into $W_k$ (the private memory of the token). Recall from Section 3.4 that coherently keeping this local copy of $\tilde y$, which is never accessed again, simulates a measurement of $Y_k$ in the standard basis (by the principle of deferred measurement [NC00]). The response from Σ is also locally stored in $W_k$, solely for the token to be able to decide at the end of the protocol whether the user successfully extracted both secret bits. (More details on this below after the isometries $A_k$ are defined.)

Before finally defining isometries $A_k$, let us further elaborate on how the token's private memory spaces $W_k$ are modelled (illustrated in Figure 2).
• W 0 contains the secret key z ∈ {0, 1} 2n of the token (i.e. the token knows what the secret key is).
• Each $W_k$ register for $k > 0$ is split into $k + 1$ parts:
  - $W_{k,1}$ contains a copy of $z$ (this allows us to pass forward $z$ from one round of interaction to the next, i.e. the token should know the secret key in all rounds), and
  - $W_{k,r}$ for $r \geq 2$ contains a copy in the standard basis of the user's $(r-1)$st query string (string $\tilde y$), as well as the token's response from Σ for said query.

Figure 3: Overview of the operators introduced in this section.
Operator: Description
$\Delta_k(z)$: (1) Sends back the token's response to the user's $k$th query $\tilde y_k$, conditioned on key $z$. (2) Copies all data sent above to the token's private register. (3) Forwards the token's existing private memory contents to the next round.
$A_k$: "Bootstraps" $\Delta_k(z)$ by summing over all possible keys $z$. Note $A_0$ also has the special role of sending quantum key $|\psi_z\rangle$ to the user.
$B_1$: The operator $A$ projected down onto the space of all "accepting" strategies, i.e. where the token's private memory in the last round, $W_m$, contains $|0\rangle$ and $|1\rangle$ in some $W_i$ and $W_j$ for $i \neq j$, respectively.
$Q_1$: The operator $B_1$ is reshaped into a column vector via the vec(·) mapping, then the token's private memory in the last round, $W_m$, is traced out.
Note that an additional technical reason for storing y above is that it ensures ∆ k (z) † ∆ k (z) = I, so that each A i defined shortly is an isometry. We remark that while the size of W grows with m in our security analysis here, the actual token does not have growing memory requirements, since it stores nothing other than the secret key z in its private memory (i.e. registers W k,r for r ≥ 2 exist only for our security analysis, not the actual implementation of the token).
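The role of the stored copy of $\tilde y$ in making $\Delta_k(z)$ an isometry can be checked on a toy key. A linear map that sends each standard basis state to a standard basis state satisfies $V^\dagger V = I$ exactly when distinct inputs have distinct images, so the Python sketch below simply tests injectivity. The token rule, the toy key and secret bits, and all names are assumptions made for this illustration.

```python
from itertools import product

# Hypothetical toy key (n = 2) and secret bits; theta[i] = 0 for '+', 1 for 'x'.
THETA, X, SECRETS = (0, 1), (1, 0), (1, 1)

def respond(q):
    """Assumed token rule: returns ((choice bit, success flag), secret bit)."""
    b, y = q[0], q[1:]
    ok = all(y[i] == X[i] for i in range(len(X)) if THETA[i] == b)
    return (b, ok), (SECRETS[b] if ok else 0)

def is_isometry(image_of):
    """For a map |q> -> |image_of(q)> between standard basis states,
    V^dagger V = I holds iff the images are pairwise distinct."""
    queries = list(product((0, 1), repeat=len(X) + 1))
    return len({image_of(q) for q in queries}) == len(queries)

# With the copy: response in X_{k+1}, plus (query, response symbol) in W_{k,k+1}.
with_copy = lambda q: (respond(q), q, respond(q)[0])
# Without the copy: only the response is produced.
without_copy = lambda q: respond(q)

print(is_isometry(with_copy), is_isometry(without_copy))
```

Without the copy, many different queries collapse onto the same response, so the map is not norm-preserving on superpositions; storing $\tilde y$ restores injectivity.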
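We are now ready to define isometries $A_k$ for round $k$ of the token's actions, where $0 \leq k \leq m$:

$$A_0 = \frac{1}{2^n} \sum_{z \in \{0,1\}^{2n}} |\psi_z\rangle_{X_1} \otimes |z\rangle_{W_0}, \qquad A_k = \sum_{z \in \{0,1\}^{2n}} \Delta_k(z) \otimes |z\rangle\langle z| \otimes I \quad (1 \leq k \leq m).$$

Here, $A_0 : \mathbb{C} \to X_1 \otimes W_0$, and $A_k : Y_k \otimes W_{k-1} \to X_{k+1} \otimes W_k$, where $|z\rangle\langle z|$ maps $W_{k-1,1}$ to $W_{k,1}$ and the identity $I$ forwards the remaining parts of $W_{k-1}$. The intuition is as follows:
• $A_0$ captures the token choosing an initial secret key $z$ uniformly at random and preparing corresponding quantum key $|\psi_z\rangle$, which it sends to the user in register $X_1$.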
• Each $A_k$ for $1 \leq k \leq m$ consists of terms $\Delta_k(z)$ and $|z\rangle\langle z|$. The latter simply copies forward the secret key $z$ from round $k-1$ to $k$, from private register $W_{k-1,1}$ to $W_{k,1}$, ensuring the token always knows $z$. Recall the term $\Delta_k(z)$, defined in Equation (35), captures the token reading a message $\tilde y$ from the user in $Y_k$, measuring it in the standard basis (simulated by copying string $\tilde y$ to a private register $W_{k,k+1}$), returning an appropriate response to the user in register $X_{k+1}$, and storing a copy of the $k$th response from Σ to the user in the private register $W_{k,k+1}$.

Remark C.2. It is in the definition of the operators $\Delta_k(z)$ that the stateless nature of the token is visible: the only "bra" vector appearing there is $\langle \tilde y|_{Y_k}$, meaning the corresponding response in $X_{k+1}$ depends only on $\tilde y$ and the string $z$ (since the response symbol is fixed by the summation criterion, e.g. $\tilde y \in A_0(z)$, which depends on $z$). Thus, the effective interactive behavior of the token is indeed stateless, as desired.
Combining isometries $A_i$ to get $A$. Having defined isometries $A_i$, their product now yields operator $A$ from Equation (3) (where we reorder the $X$ and $W$ registers to clarify that incoming message $Y_k$ results in outgoing message $X_{k+1}$). In that expression, $r(\tilde y, z) \in \Sigma$ denotes whether the token accepted or rejected query string $\tilde y$ assuming secret key $z$, and $s_{r(\tilde y, z)} \in \{0,1\}$ is the secret bit returned by the token corresponding to response $r(\tilde y, z) \in \Sigma$ (recall we set $s_{r(\tilde y, z)} = 0$ if the query is unsuccessful, i.e. if $r(\tilde y, z) \in \{\bar 0, \bar 1\}$).
Defining operator Q 1 . In order to next define operator Q 1 from Equation (6), we model what it means for a cheating user of the token to "succeed". As mentioned earlier, this is formalized by having the token make a final measurement on its private memory after the protocol concludes, in order to determine whether the user has successfully extracted both secret bits via queries. Formally, for convenience, let W denote the tensor product of the registers in W m,k for 2 ≤ k ≤ m + 1, which hold the values from Σ (i.e., the responses r( y k−1 , z)). Then, a successful user makes at least one correct 0-query and at least one correct 1-query (where a j-query refers to a query for choice bit j). We define the "accepting" measurement operator Λ 1 , corresponding to a successful user, as follows. Λ 1 maps W to itself, and is a projector onto the set of strings with some i = j such that W i is set to |0 and W j is set to |1 . In other words, Λ 1 projects onto set T := {t ∈ Σ m | t contains at least one 0 and one 1}. (45) To use this definition of Λ 1 to write down B 1 , we require further terminology. Define for any t ∈ T and fixed key z ∈ {0, 1} 2n , the set of all consistent sequences of query strings y i ∈ {0, 1} n+1 as: 2n | r( y i , z) = t i for y i the ith block of (n + 1) bits in y .
(For example, the second block of $(n+1)$ bits of $0^{n+1}1^{n+1}$ is $1^{n+1}$.) In words, $Y_t$ is the set of all strings $y_1 y_2 \cdots y_m z$ such that the response of the token on query $i$, $r(y_i, z) \in \Sigma$, is consistent with $t_i$. Using this, define $R$ as the set of triples $(t, y, z)$ with $t \in T$ and $yz \in Y_t$. In words, a triple $(t, y, z) \in R$ if, for a secret key $z$ and query string $y$, $t \in T \subseteq \Sigma^m$ encodes the correct set of $m$ responses from the token (where, recall, $T$ is the set of all "successful" response strings).
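As a small illustration of the set $T$ from Equation (45), the following sketch enumerates all successful response strings for a few values of $m$. It assumes a four-symbol response alphabet: the symbols `"0"` and `"1"` stand for accepted 0- and 1-queries, while `"a"` and `"b"` are hypothetical placeholders for the two rejecting responses. The names `SIGMA`, `is_successful` and `enumerate_T` are introduced here for illustration only.

```python
from itertools import product

# Assumed response alphabet: "0"/"1" are accepted 0-/1-queries;
# "a"/"b" are hypothetical stand-ins for the two rejecting responses.
SIGMA = ("0", "1", "a", "b")


def is_successful(t):
    """Return True iff response string t contains at least one 0 and one 1."""
    return "0" in t and "1" in t


def enumerate_T(m):
    """List all length-m response strings over SIGMA that lie in T."""
    return ["".join(t) for t in product(SIGMA, repeat=m) if is_successful(t)]


if __name__ == "__main__":
    for m in range(1, 5):
        print(m, len(enumerate_T(m)))
```

For $m = 2$ the only successful strings are `01` and `10`, reflecting that a cheating user needs one accepted query of each type.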
Recall from Equation (6) that to define $Q_1$, we required $B_1$, which in turn required $\Lambda_1$ and $A$. With the latter two in hand, we can now define

$B_1 = ( |t_m s_{t_m}\rangle_{X_{m+1}} \langle y_m|_{Y_m} \otimes |t_{m-1} s_{t_{m-1}}\rangle_{X_m} \langle y_{m-1}|_{Y_{m-1}} \otimes \cdots \otimes |t_1 s_{t_1}\rangle_{X_2} \langle y_1|_{Y_1} \otimes$ (49)

By definition of the vec mapping (Section 2.1),

$\mathrm{vec}(B_1) = \frac{1}{2^n} \sum_{(t,y,z) \in R} |y_1 t_1\rangle_{W_{m,2}} \otimes \cdots \otimes |y_m t_m\rangle_{W_{m,m+1}} \otimes$ (51)

Finally, $Q_1 = \mathrm{Tr}_{W_m}(\mathrm{vec}(B_1)\,\mathrm{vec}(B_1)^*)$. Note that we have crucially used the fact that queries to the token are classical strings. Namely, since the token implicitly measures its input in the standard basis (modelled by copying each string $y_i$ to register $W_i$), the partial trace over $W_m$ annihilates all cross terms in $\mathrm{vec}(B_1)\,\mathrm{vec}(B_1)^*$. Thus, $Q_1$ is conveniently simplified to a mixture over $(t, y, z) \in R$, which is further block-diagonal with respect to all registers other than $X_1$. For convenience, we permute subsystems to rewrite $Q_1$ in the form of Equation (57).

The SDP. Having set up all required operators for the GW framework, Equation (7) of Section 2.1 now yields the optimal probability with which a cheating user can succeed; we reproduce Equation (7) below for convenience. Note the subsystem ordering of $Q_1$ below is not that of Equation (57), but rather $Q_1 \in \mathrm{Pos}(Y_{1,\ldots,m} \otimes X_{1,\ldots,m+1})$ below; we have omitted explicitly including the permutation effecting this reordering to avoid clutter. Also, to account for the slight asymmetry in our protocol (the token sends out $m+1$ messages $X_i$, whereas the user only sends $m$ messages $Y_i$), we add a dummy space $Y_{m+1} = \mathbb{C}$ which models an empty $(m+1)$st message from the user to the token.

min: $p$ (58)
subject to: $Q_1 \preceq p R_{m+1}$ (59)

In the analysis below, we shall sometimes analyze the optimization above, which we shall denote $\Gamma'$. However, note that technically it is not yet an SDP due to the quadratic constraint $Q_1 \preceq p R_{m+1}$.
It is, however, easily seen to be equivalent to the following bona fide SDP $\Gamma$:

$R_k \in \mathrm{Pos}(Y_{1,\ldots,k} \otimes X_{1,\ldots,k})$ for $1 \le k \le m+1$ (71)
$P_k \in \mathrm{Pos}(Y_{1,\ldots,k-1} \otimes X_{1,\ldots,k})$ for $1 \le k \le m+1$ (72)

Above and henceforth, we use the notation $T_{1 \cdots k}$ to denote the space $T_1 \otimes \cdots \otimes T_k$.
Warmup: A "trivial" solution. We mentioned in Section 3.4 that obtaining a solution to Γ which obtains the trivial bound p ≤ 1 is not trivial. (Sometimes with SDPs, a scaled identity operator gives a feasible solution obtaining the desired trivial bound on the objective value; this unfortunately does not work here.) Let us hence warm up by demonstrating a solution attaining the trivial bound p ≤ 1.
Lemma C.3. The SDP Γ has a feasible solution with p = 1.
Proof. Recall the expression for $Q_1$ from Equation (57), in which $t \in T \subseteq \Sigma^m$ and $s \in \{0,1\}^m$ are the resulting query responses and secret bits, respectively. (Formally, we should write $s_t$, as each $s$ depends on $t$; to save clutter and space below, however, we drop the subscript.) Observe that any fixed $y \in \{0,1\}^{m(n+1)}$ and $z \in \{0,1\}^{2n}$ determine a unique query response string $t \in \Sigma^m$ (which may or may not be in $T$); denote this as $t(y, z)$. Therefore, $Q_1$ can be written as a sum over the pairs $(y, z)$ satisfying $t(y, z) \in T$, for $T \subseteq \Sigma^m$ defined as in Equation (45). Let us drop the constraint that $t(y, z) \in T$, i.e., choose $R_{m+1}$ to be the same sum taken over all pairs $(y, z)$. Clearly, $Q_1 \preceq p \cdot R_{m+1}$ for $p = 1$, since we added positive semidefinite terms to $Q_1$ to get $R_{m+1}$. Thus, if $R_{m+1}$ satisfies the remaining primal constraints, then it has objective function value $p = 1$.

To see that $R_{m+1}$ satisfies the constraints, clearly $R_{m+1}$ has $I$ in register $Y_{m+1}$ (recall $Y_{m+1} = \mathbb{C}$, so this just means $Y_{m+1}$ is trivially set to 1). Let us now trace out $X_{m+1}$; we require that register $Y_m$ now also contains the identity. In $\mathrm{Tr}_{X_{m+1}}(R_{m+1})$, only the responses and secret bits of the first $m-1$ queries remain. Since we discarded the $m$th symbol of $t(y, z)$, registers $Y_m$ and $X_1$ are now independent; thus, bringing in the sum over $y_m$ yields the identity on $Y_m$. In a similar fashion, tracing out registers $X_m, \ldots, X_2$ yields an operator with the identity on $Y_m, \ldots, Y_1$. Finally, tracing out $X_1$ yields $I_{Y_m \cdots Y_1}$, since there are $4^n$ possible quantum key states $|\psi_z\rangle$. Hence, $R_{m+1}$ is a feasible solution.
An upper bound on the cheating probability. We now give a feasible solution to SDP Γ which yields the claimed security against a linear number of queries. Its proof of correctness relies on Lemma C.4, which we state and prove first.
Lemma C.4. For $Q_1$ in Equation (57), $\lambda_{\max}(Q_1) = \frac{2}{4^n}\left(1 + \frac{1}{\sqrt{2}}\right)^n$.

Proof. The factor of $4^{-n}$ in the claimed value for $\lambda_{\max}(Q_1)$ comes from the $4^{-n}$ appearing in Equation (57); we henceforth ignore this $4^{-n}$ term in this proof by redefining $Q_1$ as $4^n Q_1$. We shall also ignore the $b_i$ terms in $Q_1$, as they shall play no role in the analysis. Now, since $Q_1$ is block-diagonal (with respect to the standard basis) on registers $X_2, \ldots, X_{m+1}, Y_1, \ldots, Y_m$, it suffices to characterize the largest eigenvalue of any block. We shall say that any fixed $t \in T$ and $y \in \{0,1\}^{m(n+1)}$ defines the $(t, y)$-block of $Q_1$. (Formally, the $(t, y)$-block of $Q_1$ is given by $\Pi_{t,y} Q_1 \Pi_{t,y}$, where $\Pi_{t,y} = |t\rangle\langle t|_{X_{m+1} \cdots X_2} \otimes |y\rangle\langle y|_Y$.)

Lower bound. We first show the lower bound $\lambda_{\max}(Q_1) \ge \frac{2}{4^n}\left(1 + \frac{1}{\sqrt{2}}\right)^n$. To do so, we demonstrate an explicit $t, y$ such that the $(t, y)$-block has eigenvalue $\frac{2}{4^n}\left(1 + \frac{1}{\sqrt{2}}\right)^n$. Set $t = 0^{m-1}1$ (note $t \in T$) and $y = y_1 \ldots y_m$ for $y_1 = y_2 = \cdots = y_{m-1}$ and $y_{m-1} \ne y_m$ (note $y_i \in \{0,1\}^{n+1}$), where the first bit of each of $y_1, \ldots, y_{m-1}$ is 0, and the first bit of $y_m$ is 1. In words, we are modelling $m-1$ successful (and identical) 0-queries in the Z-basis, followed by a single successful 1-query in the X-basis. The question now is: Given $t$ and $y$, how many $|\psi_z\rangle \in (\mathbb{C}^2)^{\otimes n}$ exist such that $(t, y, z) \in R$?
Upper bound. We next show a matching upper bound of $\lambda_{\max}(Q_1) \le \frac{2}{4^n}\left(1 + \frac{1}{\sqrt{2}}\right)^n$ among all $(t, y)$-blocks. For any $t \in T$, there exist indices $i \ne j$ such that $y_i$ and $y_j$ are a successful 0- and 1-query, respectively. Without loss of generality, assume $i = 1$ and $j = m$. Then, as in the previous case, Rules 1 and 2 imply Equation (81). Consider now any $y_i$ for $1 < i < m$, and suppose without loss of generality that $y_i$ is a 0-query, i.e., its first bit is set to 0. There are two cases to analyze:
• (Case 1: $t_i = 0$) In this case, both query 1 and query $i$ are successful 0-queries; thus, they must agree on all secret key bits which were encoded in the Z basis. It follows from Rule 1 that for any bit $k$ on which $y_1$ and $y_i$ disagree, the secret key must have encoded bit $k$ in the X-basis. In other words, $|\psi_z(k)\rangle = H|y_m(k)\rangle$ in Equation (81) (i.e., one of the two possibilities is eliminated). (If $y_1 = y_i$, on the other hand, no such additional constraint exists.)
• (Case 2: $t_i \ne 0$) In this case, query $i$ is an unsuccessful 0-query. By Rule 3, there exists a bit $k$ on which $y_1$ and $y_i$ disagree, and whose corresponding secret key bit was encoded in the Z basis. In other words, $|\psi_z(k)\rangle = |y_1(k)\rangle$ in Equation (81) (i.e., one of the two possibilities is eliminated).
The analysis for $y_i$ being a 1-query is analogous. We conclude that for any $(t, y)$-block of $Q_1$, the operator in register $X_1$ is of the form of $\sigma$ from Equation (80), except that some of the indices $k$ may contain an operator consisting of only one summand (e.g., $|y_1(k)\rangle\langle y_1(k)|$ instead of $|y_1(k)\rangle\langle y_1(k)| + H|y_m(k)\rangle\langle y_m(k)|H$).
Since the omitted summands are all positive semidefinite, however, we conclude that the eigenvalue of any $(t, y)$-block is at most the eigenvalue of $\sigma$ from Equation (80), i.e., $\lambda_{\max}(Q_1) \le \frac{2}{4^n}\left(1 + \frac{1}{\sqrt{2}}\right)^n$, as claimed.
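The per-qubit factor $1 + 1/\sqrt{2}$ in Lemma C.4 can be checked numerically. The sketch below assumes that each index $k$ of the operator on $X_1$ carries a single-qubit operator of the form $|a\rangle\langle a| + H|b\rangle\langle b|H$ for bits $a, b$, as in the example quoted in the upper-bound argument; the function names are hypothetical. Because the largest eigenvalue of a tensor product of positive semidefinite operators is the product of the largest eigenvalues, $n$ such factors give $(1 + 1/\sqrt{2})^n$. The prefactor $2/4^n$ is taken from the lemma statement as is and is not derived here.

```python
import math

# Single-qubit Hadamard matrix.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]


def projector(v):
    """Rank-one projector |v><v| for a real unit vector v of length 2."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]


def apply_h(v):
    """Apply the Hadamard matrix to a length-2 vector."""
    return [sum(H[i][j] * v[j] for j in range(2)) for i in range(2)]


def lambda_max_2x2(mat):
    """Largest eigenvalue of a real symmetric 2x2 matrix (closed form)."""
    a, b, d = mat[0][0], mat[0][1], mat[1][1]
    return (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)


def summand_lambda_max(bit_z, bit_x):
    """Largest eigenvalue of |bit_z><bit_z| + H|bit_x><bit_x|H."""
    ket_z = [1.0, 0.0] if bit_z == 0 else [0.0, 1.0]
    ket_x = apply_h([1.0, 0.0] if bit_x == 0 else [0.0, 1.0])
    pz, px = projector(ket_z), projector(ket_x)
    total = [[pz[i][j] + px[i][j] for j in range(2)] for i in range(2)]
    return lambda_max_2x2(total)


def block_bound(n):
    """Value (2 / 4^n) * (1 + 1/sqrt(2))^n claimed in Lemma C.4."""
    return 2 * (1 + 1 / math.sqrt(2)) ** n / 4 ** n


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, summand_lambda_max(a, b))
    print(block_bound(10))
```

All four choices of $(a, b)$ give the same largest eigenvalue, since a standard-basis state and a Hadamard-basis state always have overlap $1/\sqrt{2}$ in absolute value.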
We can now prove the main result of this section.
Proof. As $Q_1$ in Equation (57) is block-diagonal in registers $X_2, \ldots, X_{m+1}$, consider the following solution $R_{m+1}$ (for $T$ from Equation (45)). (Aside: Recall that $X_1$ is an $n$-qubit register above, hence the $2^n$ renormalization factor.) Note that $|\{t \in \Sigma^m \mid t \text{ contains neither a } 0 \text{ nor a } 1\}| = 2^m$.
Thus, by the inclusion-exclusion principle, $|T| = 4^m - 2 \cdot 3^m + 2^m$. In order for $R_{m+1}$ to be feasible, we must pick $p$ such that $Q_1 \preceq p R_{m+1}$. Since $Q_1$ is block-diagonal on registers $X_2 \cdots X_{m+1}$, it suffices to identify its block with the largest eigenvalue. In fact, each corresponding block for $R_{m+1}$ has eigenvalue $(|T| 2^n)^{-1}$. Thus, we must choose $p$ such that $p \ge |T|\, 2^n\, \lambda_{\max}(Q_1)$, or equivalently, due to the $4^{-n}$ factor in $Q_1$, $p \ge \frac{|T|}{2^n} \lambda_{\max}(4^n Q_1)$.
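To get a feel for the numbers, the following sketch evaluates the resulting bound. It combines the constraint $p \ge \frac{|T|}{2^n}\lambda_{\max}(4^n Q_1)$ with the value $\lambda_{\max}(4^n Q_1) = 2(1 + 1/\sqrt{2})^n$ from Lemma C.4, which gives $p = 2|T|\left(\frac{1}{2} + \frac{1}{2\sqrt{2}}\right)^n$; this closed form is a direct substitution made here for illustration, and the function names are hypothetical. Using $|T| \le 4^m$, the bound stays below 1 roughly when $m < \frac{1}{2}\log_2\frac{1}{1/2 + 1/(2\sqrt{2})} \cdot n \approx 0.114n$.

```python
import math

# Per-qubit decay constant 1/2 + 1/(2*sqrt(2)), roughly 0.8536.
C = 0.5 + 1 / (2 * math.sqrt(2))

# Fraction of n below which 4^m * C^n decays: about 0.114.
QUERY_RATE = math.log2(1 / C) / 2


def size_T(m):
    """|T| = 4^m - 2*3^m + 2^m, by inclusion-exclusion."""
    return 4 ** m - 2 * 3 ** m + 2 ** m


def cheating_bound(m, n):
    """Evaluate 2 * |T| * C^n (intended for moderate m and n)."""
    return 2 * size_T(m) * C ** n


def max_queries(n):
    """Largest m for which cheating_bound(m, n) is still below 1."""
    m = 0
    while cheating_bound(m + 1, n) < 1:
        m += 1
    return m


if __name__ == "__main__":
    print(QUERY_RATE)
    for n in (50, 100, 200):
        print(n, max_queries(n), cheating_bound(2, n))
```

The exact threshold returned by `max_queries` sits slightly below `QUERY_RATE * n` because of the leading factor of 2 in the bound.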
Guiding example: m = 3. We explicitly run through the construction for the first non-trivial case, $m = 3$ queries. The construction then generalizes straightforwardly to all $m \ge 2$. To begin, using the fact that $R_4 = P_4$ (since $Y_4 = \mathbb{C}$, because we assumed message $m+1$ from the user to the token is empty), $\Gamma$ can be written:

min: $\mathrm{Tr}(P_1)$ (100)
subject to: $Q_1 - P_4 \preceq 0$ (101)
$-P_3 \otimes I_{Y_3} + \mathrm{Tr}_{X_4}(P_4) \preceq 0$ (102)
$-P_2 \otimes I_{Y_2} + \mathrm{Tr}_{X_3}(P_3) \preceq 0$ (103)
$-P_1 \otimes I_{Y_1} + \mathrm{Tr}_{X_2}(P_2) \preceq 0$ (104)

Above, we relaxed the equalities to inequalities (see footnote 14), which intuitively makes it easier to guess feasible solutions to $\Gamma$. We also omitted the positive semidefinite constraints on all $P_i$, since $P_4 \succeq Q_1 \succeq 0$ implies $P_i \succeq 0$ for all $i$ (see footnote 15). We now follow the standard Lagrange approach for deriving the dual SDP (see, e.g., [BV04]). Labelling Equations (101), (102), (103), (104) with dual variables $Y_1, \ldots, Y_4$, respectively, the primal variables in the Lagrange dual function can be isolated as follows:

Primal variable    Factor
$P_4$    $-Y_1 + Y_2 \otimes I_{X_4}$
$P_3$    $-\mathrm{Tr}_{Y_3}(Y_2) + Y_3 \otimes I_{X_3}$
$P_2$    $-\mathrm{Tr}_{Y_2}(Y_3) + Y_4 \otimes I_{X_2}$
$P_1$    $I_{X_1} - \mathrm{Tr}_{Y_1}(Y_4)$

14 This is without loss of generality, as we briefly justify. Clearly, any feasible solution for the equality constraints is also feasible for the inequality constraints. For the converse direction, suppose a feasible solution for the inequality constraints satisfies $P_i \otimes I_{Y_i} - \mathrm{Tr}_{X_{i+1}}(P_{i+1}) = \Lambda_i \succeq 0$ for non-zero $\Lambda_i$; pick the smallest such $i$ satisfying this condition. Then, redefining $P_{i+1} := P_{i+1} + |\phi\rangle\langle\phi|_{X_{i+1}} \otimes \Lambda_i \succeq 0$ for an arbitrary unit vector $|\phi\rangle$ yields $P_i \otimes I_{Y_i} - \mathrm{Tr}_{X_{i+1}}(P_{i+1}) = 0$, as desired. Note we can now recurse this trick from constraint $i$ to $i+1$, since if $P_{i+1} \otimes I_{Y_{i+1}} - \mathrm{Tr}_{X_{i+2}}(P_{i+2}) \succeq 0$ held before the redefinition, then it also holds for the redefined $P_{i+1}$ (similarly for the constraint $Q_1 \preceq P_{m+1}$). Thus, we obtain a new feasible solution for which all inequality constraints (except $Q_1 \preceq P_{m+1}$) hold with equality. Moreover, this process does not alter the assignment for $P_1$ (i.e., we never redefine $P_1$); thus the objective function value remains unchanged.

15 In our particular setting, it is clear that $Q_1 \succeq 0$. However, more generally in the GW framework, the operators $\{Q_a\}$ defining a measuring co-strategy all satisfy $Q_a \succeq 0$.
For clarity and as an example, this says the term $P_4(-Y_1 + Y_2 \otimes I_{X_4})$ appears in the dual function. This yields the dual SDP:

max: $\mathrm{Tr}(Y_1 Q_1)$ (105)
subject to: $-Y_1 + Y_2 \otimes I_{X_4} = 0$ (106)
$-\mathrm{Tr}_{Y_3}(Y_2) + Y_3 \otimes I_{X_3} = 0$ (107)
$-\mathrm{Tr}_{Y_2}(Y_3) + Y_4 \otimes I_{X_2} = 0$ (108)

Now we make the following simplifications: (1) replace $Y_1$ with $Y_2 \otimes I_{X_4}$ (which follows from Equation (106)), (2) drop the constraints $Y_3, Y_4 \succeq 0$ (since they are implied by $Y_2 \succeq 0$), and (3) relax the equalities to inequalities (which follows by an argument similar to that for the primal, except here we also require that we are maximizing with respect to $Y_2$ below). Hence, we obtain:

max: $\mathrm{Tr}(Y_2\, \mathrm{Tr}_{X_4}(Q_1))$ (111)

Note the constraint $Y_2 \succeq 0$ cannot be removed. Intuitively, this is because the constraint $I_{X_1} - \mathrm{Tr}_{Y_1}(Y_4) \succeq 0$ alone does not imply $Y_4 \succeq 0$. Rather, it is the constraint $Y_2 \succeq 0$ which forces $Y_4 \succeq 0$ here. Indeed, a sanity check in CVX for Matlab reveals that removing $Y_2 \succeq 0$ incorrectly yields an unbounded SDP.
We now repeat the process by taking the dual of the dual to arrive at a simplified primal as follows. (Note that the inequalities above now go in the other direction, since we are starting from the dual SDP.) Labelling the constraints above R 1 , . . . , R 4 , we have factor table:

Dual variable
Factor

This yields the primal SDP (after omitting the redundant constraints $R_1, R_2, R_3, R_4 \succeq 0$):

subject to: $\mathrm{Tr}_{X_4}(Q_1) - R_1 \otimes I_{Y_3} \preceq 0$ (117)
$\mathrm{Tr}_{X_2}(R_2) - R_3 \otimes I_{Y_1} \preceq 0$ (119)

Taking the dual of the primal now yields the previous dual; so it seems we are done. Relabelling variables for the primal and dual, we obtain the final $m = 3$ primal and dual SDPs, respectively:

min: $\mathrm{Tr}(P_1)$
subject to: $\mathrm{Tr}_{X_4}(Q_1) - P_3 \otimes I_{Y_3} \preceq 0$
$\mathrm{Tr}_{X_3}(P_3) - P_2 \otimes I_{Y_2} \preceq 0$
$\mathrm{Tr}_{X_2}(P_2) - P_1 \otimes I_{Y_1} \preceq 0$

max: $\langle Y_1, \mathrm{Tr}_{X_4}(Q_1) \rangle$

General case. The derivation above straightforwardly extends to the case of arbitrary $m \ge 2$, yielding the corresponding primal and dual SDPs.

Conjecture D.2. The optimal values for the primal and dual SDPs of Section D.1 are, up to multiplicative scaling by some function $f(m, n) \in O(m^c 2^{(1/2 - \epsilon)n})$ for constants $c > 0$ and $0 < \epsilon < 1/2$, equal to $\beta = \frac{|R|}{4^n d^m}$.

If Conjecture D.2 holds, then our protocol would be secure in the sense that the optimal cheating probability would scale as $\mathrm{poly}(m)/2^{\Theta(n)}$.

E Proof of Lemma 4.3
Proof. Observe first that an honest receiver Alice wishing to extract $s_i$ acts as follows. She applies a unitary $U_i \in U(A \otimes B)$ to get state $|\phi_1\rangle := U_i |\psi\rangle_{AB} |0\rangle_C$.
She then measures $B$ in the computational basis and postselects on result $y \in \{0,1\}^n$, obtaining state $|\phi_2\rangle := |\phi_y\rangle_A |y\rangle_B |0\rangle_C$. She now treats $y$ as a "key" for $s_i$, i.e., she applies $O_f$ to $B \otimes C$ to obtain her desired bit $s_i$, i.e., $|\phi_3\rangle := |\phi_y\rangle_A |y\rangle_B |s_i\rangle_C$.
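A malicious receiver Bob wishing to extract $s_0$ and $s_1$ now acts similarly to the rewinding strategy for superposition queries. Suppose without loss of generality that $s_0$ has at most $\Delta$ keys. Then, Bob first applies $U_0$ to prepare $|\phi_1\rangle$ from Equation (149), which we can express as $|\phi_1\rangle = \sum_y \alpha_y |\phi_y\rangle_A |y\rangle_B |0\rangle_C$ for $\sum_y |\alpha_y|^2 = 1$. Since measuring $B$ next would allow us to retrieve $s_0$ in register $C$ with certainty, we have that all $y$ appearing in the expansion above satisfy $f(y) = s_0$. Moreover, since $s_0$ has at most $\Delta$ keys, there exists a key $y'$ such that $|\alpha_{y'}|^2 \ge 1/\Delta$. Bob now measures $B$ in the computational basis to obtain $|\phi_2\rangle$ from Equation (150), obtaining $y'$ with probability at least $1/\Delta$. Feeding $y'$ into $O_f$ yields $s_0$. Having obtained $y'$, we have that $|\langle\phi_1|\phi_2\rangle|^2 \ge 1/\Delta$, implying $|\langle\psi| U_0^\dagger |\phi_{y'}\rangle|y'\rangle|^2 \ge 1/\Delta$, i.e., Bob now applies $U_0^\dagger$ to recover a state with "large" overlap with the initial state $|\psi\rangle$.

To next recover $s_1$, define $|\psi_{\mathrm{good}}\rangle := U_1|\psi\rangle$ and $|\psi_{\mathrm{approx}}\rangle := U_1 U_0^\dagger |\phi_{y'}\rangle|y'\rangle$. Bob applies $U_1$ to obtain $|\psi_{\mathrm{approx}}\rangle = \beta_1 |\psi_{\mathrm{good}}\rangle + \beta_2 |\psi_{\mathrm{good}}^\perp\rangle$, where $\sum_i |\beta_i|^2 = 1$, $\langle\psi_{\mathrm{good}}|\psi_{\mathrm{good}}^\perp\rangle = 0$, and $|\beta_1|^2 \ge 1/\Delta$. Define $\Pi_{\mathrm{good}} := \sum_{y \in \{0,1\}^n \text{ s.t. } f(y) = s_1} |y\rangle\langle y|$. Then, the probability that measuring $B$ in the computational basis now yields a valid key for $s_1$ is $\langle\psi_{\mathrm{approx}}|\Pi_{\mathrm{good}}|\psi_{\mathrm{approx}}\rangle \ge |\beta_1|^2 \ge 1/\Delta$, where we have used the fact that $\Pi_{\mathrm{good}}|\psi_{\mathrm{good}}\rangle = |\psi_{\mathrm{good}}\rangle$ (since an honest receiver can extract $s_1$ with certainty). We conclude that Bob can extract both $s_0$ and $s_1$ with probability at least $1/\Delta^2$.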
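The last step of the argument, that a state with overlap $\beta_1$ onto $|\psi_{\mathrm{good}}\rangle$ passes the $\Pi_{\mathrm{good}}$ measurement with probability at least $|\beta_1|^2$, can be sanity-checked numerically. The sketch below is a simplified toy model and not the protocol itself: it keeps only register $B$, uses real amplitudes, and draws the states at random. All names (`success_probability`, `good_keys`, and so on) are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a real vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def success_probability(dim, good_keys, beta1, seed=0):
    """Probability that measuring psi_approx yields an index in good_keys.

    psi_good is a random unit vector supported only on good_keys, so the
    projector onto good_keys leaves it unchanged. psi_perp is a random unit
    vector orthogonal to psi_good, and
    psi_approx = beta1 * psi_good + sqrt(1 - beta1^2) * psi_perp.
    """
    rng = random.Random(seed)
    psi_good = normalize(
        [rng.gauss(0, 1) if y in good_keys else 0.0 for y in range(dim)]
    )
    # Gram-Schmidt step: remove the component along psi_good.
    raw = [rng.gauss(0, 1) for _ in range(dim)]
    dot = sum(a * b for a, b in zip(raw, psi_good))
    psi_perp = normalize([a - dot * b for a, b in zip(raw, psi_good)])
    beta2 = math.sqrt(1 - beta1 ** 2)
    psi_approx = [beta1 * g + beta2 * p for g, p in zip(psi_good, psi_perp)]
    return sum(psi_approx[y] ** 2 for y in good_keys)


if __name__ == "__main__":
    delta = 4
    beta1 = math.sqrt(1 / delta)
    for seed in range(5):
        prob = success_probability(16, {1, 5, 9}, beta1, seed)
        print(seed, prob, prob >= beta1 ** 2 - 1e-12)
```

The cross term between $|\psi_{\mathrm{good}}\rangle$ and $|\psi_{\mathrm{good}}^\perp\rangle$ vanishes because $\Pi_{\mathrm{good}}$ fixes $|\psi_{\mathrm{good}}\rangle$, which is why the measured probability never drops below $|\beta_1|^2$ in this model.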