On the Entanglement Cost of One-Shot Compression

We revisit the task of visible compression of an ensemble of quantum states with entanglement assistance in the one-shot setting. The protocols achieving the best compression use many more qubits of shared entanglement than the number of qubits in the states in the ensemble. Other compression protocols, with potentially larger communication cost, have entanglement cost bounded by the number of qubits in the given states. This motivates the question as to whether entanglement is truly necessary for compression, and if so, how much of it is needed. Motivated by questions in communication complexity, we lift certain restrictions that are placed on compression protocols in tasks such as state-splitting and channel simulation. We show that an ensemble of the form designed by Jain, Radhakrishnan, and Sen (ICALP'03) saturates the known bounds on the sum of communication and entanglement costs, even with the relaxed compression protocols we study. The ensemble and the associated one-way communication protocol have several remarkable properties. The ensemble is incompressible by more than a constant number of qubits without shared entanglement, even when constant error is allowed. Moreover, in the presence of shared entanglement, the communication cost of compression can be arbitrarily smaller than the entanglement cost. The quantum information cost of the protocol can thus be arbitrarily smaller than the cost of compression without shared entanglement. The ensemble can also be used to show the impossibility of reducing, via compression, the shared entanglement used in two-party protocols for computing Boolean functions.


Introduction

Visible compression
Compression of quantum states is a fundamental task in information processing. In the simplest setting, we have two spatially separated parties, commonly called Alice and Bob, and an ensemble of quantum states

$$ \big( (p_x, \rho_x) : x \in S,\ \rho_x \in \mathsf{D}(\mathbb{C}^m) \big), \tag{1.1} $$

where $S$ is some non-empty finite set, and $p$ is a probability distribution over $S$. Alice gets an input $x \in S$ with probability $p_x$, and would like to send a message, i.e., a quantum state $\sigma_x \in \mathsf{D}(\mathbb{C}^d)$, to Bob so that he can recover the state $\rho_x$, or even an approximation to it. Since the input $x$ completely specifies the corresponding state $\rho_x$, this variant of the task is called visible compression. The communication cost of the protocol is $\log d$, the length of the message in qubits. The goal of the two parties is to accomplish this with as short a message as possible, i.e., to minimize the dimension $d$. A central question in quantum information theory is whether there is a simple characterization of the optimal communication cost in terms of the "information content" of the ensemble.

Shima Bab Hadiashar: sbabhadi@uwaterloo.ca. Ashwin Nayak: ashwin.nayak@uwaterloo.ca.
An additional resource that Alice and Bob may use in compression is a shared entangled state. In other words, the two parties may start with their qubits initialized to a fixed pure quantum state independent of the input received by Alice. The local quantum operations performed for compression and decompression then also involve the respective parts of the shared state. This is depicted in Figure 1, and the protocol (or channel) is said to be with shared entanglement or entanglement assisted. As we may expect, the communication cost may decrease due to the availability of this additional resource. The entanglement cost of a protocol is the minimal dimension of the support of either party's share of the initial state (measured in qubits) required to achieve some communication cost. (We discuss the notion of entanglement cost in detail in Section 4.) We would also like to characterize the entanglement cost in this setting, in addition to the communication cost.
[Figure 1: A one-way protocol in which Alice gets a classical input and Bob does not have any input.]

Compression problems similar to the one above have been studied extensively in quantum information theory, both in the one-shot setting (the one we described above), and in the asymptotic setting (where the sender's input consists of multiple samples picked independently from the same distribution). The problem has been studied in early works such as Ref. [5] in the setting of quantum communication without shared entanglement. It is known as remote state preparation when one-way communication over a classical channel with shared entanglement is allowed. We refer the reader to Ref. [4, Table I] for a summary of the work on remote state preparation; we describe the most relevant results, in the one-shot setting, below.
Other tasks in the literature that come close to the one above are state splitting (see, e.g., Ref. [7]), and that of channel simulation in the context of the Quantum Reverse Shannon Theorem [6,7]. State splitting is the time reversal [9] of state merging [15,16], and was called the "fully quantum reverse Shannon protocol" in Ref. [9]. We explain the connection to state splitting in detail in Section 2.3.
In both state splitting and channel simulation, the protocol is required to be "coherent" in specific ways. In particular, in compressing an ensemble of states as in Eq. (1.1), at the end of the protocol, Bob would be required to hold an approximation to the state $\rho_x$ and Alice a purification of this state. In contrast to these tasks, we do not require that the compression protocol maintain such coherence. More precisely, the registers containing a purification of the output state may be shared by Alice and Bob. Such compression protocols are more relevant in the context of two-party communication protocols studied in complexity theory, especially in the context of direct sum and direct product results (see, e.g., Refs. [17,19,28] and the references therein). In communication complexity, a typical goal is to compute a bivariate Boolean function when the inputs are distributed between two parties. The parties communicate with each other, alternating messages with local computation, and at the end, one party produces the output of the protocol from the part of the final state in her possession. As a result, the output of the protocol does not depend on the part of the state held by the other party (i.e., on the purification of her part of the final joint state). A compression scheme for the final state then need only focus on the part being measured for the output.

Entanglement cost of compression
Jain, Radhakrishnan, and Sen [18,19] gave a one-shot protocol for compressing an ensemble of states as in Eq. (1.1), and bounded its communication cost by $O(I(A:B)_\tau/\epsilon^3)$, where $I(A:B)_\tau$ is the mutual information between registers $A$ and $B$ in the state $\tau^{AB} := \sum_x p_x\, |x\rangle\langle x|^A \otimes \rho_x^B$, and $\epsilon$ is the average approximation error (cf. Section 2.3 for a precise definition of average error). Using a more refined application of their technique, Bab Hadiashar, Nayak, and Renner [4] tightly characterized the communication cost of the task in terms of the smooth max-information, a one-shot entropic analogue of mutual information. Their results are stated for entanglement-assisted classical channels and use purified distance to quantify the approximation, but translate immediately to the setting here through the use of superdense coding [29, Section 6.3.1] and the Fuchs and van de Graaf Inequalities (Proposition 2.4). The upper bound so obtained is slightly better, in terms of the approximation error, than that derived from protocols for state splitting: it has an additive term of $O(\log\log\frac{1}{\epsilon})$ for average error $\epsilon$, versus the additive term of $O(\log\frac{1}{\epsilon})$ in Ref. [1, Corollary 5]. However, both these protocols use shared entanglement that may be much longer than the message itself, namely $O(k (\log\frac{1}{\epsilon}) \log m)$ qubits and $O((1 + 1/\epsilon^2) \log_2(m/\epsilon))$ qubits, respectively, where $\log_2 k = I(A:B)_\tau$, and $m$ is the dimension of the states in the ensemble. On the other hand, earlier protocols for state splitting [7, Lemma 3.5], with potentially larger communication cost, have entanglement cost bounded by $\log m$. Since sharing entanglement also entails some communication, in addition to the preparation and storage of a potentially delicate high dimensional state, this motivates the question as to whether shared entanglement is truly necessary for compression, and if so, how much of it is needed.
For the more restrictive task of state splitting, it follows from the proof of the converse bound for one-shot entanglement consumption due to Berta, Christandl, and Touchette [8, Proposition 10] that the sum of the communication and entanglement costs is at least the min-entropy $S_{\min}(\rho)$ of the ensemble average state $\rho := \sum_x p_x \rho_x$. (Although the proof is written assuming that the shared state consists of EPR pairs and some ancilla and an auxiliary error parameter, it may be modified to give a bound when an arbitrary state is shared and the auxiliary error is 0.) In this article, we show that there are ensembles for which the min-entropy bound equals the number of qubits in the states, and the bound holds up to an additive constant even with the more general compression protocols we allow.
Theorem 1.1. There exist universal constants $c_1, c_2 > 0$ such that for any $\epsilon \in (0,1)$, and any $k, m \in \mathbb{N}$ with $k \ge 6/(1-\epsilon)$ and $m \ge c_1 (\ln k)/(1-\epsilon)^2$ such that $k$ divides $m$, there exists an ensemble of $n$ states of dimension $m$, where $n$ depends on $k$, $m$, and $\epsilon$, such that (ii) there is a one-way protocol with shared entanglement for the visible compression of the ensemble with average error $\epsilon/2$ and with communication cost $\frac{1}{2}\log k + O(\log\log\frac{1}{\epsilon})$; and (iii) the sum of communication and entanglement costs of any one-way protocol with shared entanglement for visible compression of the ensemble, with average error at most $\epsilon/2$, is at least $\log m$ minus a term that is a universal constant for constant $\epsilon$.

In particular, the theorem implies that in the absence of shared entanglement, the ensemble may only be compressed by a constant number of qubits (independent of $m$), even if constant average error $\epsilon/2 < 1/2$ is allowed. Note also that the straightforward protocol that prepares and sends the state $\rho_x$ on input $x$ has sum of entanglement and communication costs equal to $\log m$. So the lower bound in the theorem is optimal up to an additive universal constant term for constant $\epsilon \in (0,1)$.
Proposition 3.4 and Corollary 3.5 in Section 3 contain more precise statements of the results stated in the theorem. As we explain in that section, $I(A:B)_\tau$ may be interpreted as the "information content" of the ensemble; it is the quantum information cost [28] of the protocol in which Alice simply prepares the state $\rho_x$ on input $x$ and sends the state to Bob.
The compression task we study is a relaxation of oblivious (or blind) compression, in which the input to Alice is the state $\rho_x$, rather than $x$. It is also a relaxation of state splitting (more generally, of state redistribution [10,23,30]), and channel simulation. So the lower bound in Theorem 1.1(iii) holds for these tasks as well.
The ensemble mentioned in Theorem 1.1 is obtained via the probabilistic method, and is of a form devised by Jain, Radhakrishnan, and Sen [17]. They showed the incompressibility of such an ensemble when the decompression operation is unitary (i.e., via protocols as in Figure 1 in which the register $B_1$ is trivial). We adapt their proof method to protocols which allow a general quantum channel for decompression. A key step here is a technical lemma (Lemma 3.2 in Section 3) which allows us to reason about general quantum channels, and also yields a tighter lower bound on the sum of communication and entanglement costs.

Implications and related work
Jain et al. [18,19] also used the same kind of ensemble as in Theorem 1.1 to design a two-party one-way communication protocol with shared entanglement for the Equality function. They showed that the initial shared state in the protocol cannot be replaced by one with polynomially smaller dimension in a "black-box fashion" (i.e., when the local operations of the two parties are not modified). Theorem 1.1 implies a similar impossibility result for protocols in which the sender and receiver can deviate from the original protocol arbitrarily, but they try to approximate the receiver's state in the original protocol after the message is sent. The impossibility holds even when the dimension of the initial shared entangled state is reduced only by a constant factor.
A remarkable property of the ensemble posited by Theorem 1.1 is that the communication cost of compression (with shared entanglement) may be arbitrarily smaller than the entanglement cost. For constant error, the communication cost is within an additive constant of the quantum information cost [28] of the protocol that simply prepares and sends the state. As a consequence, we infer that the quantum information cost of a protocol may be arbitrarily smaller than the communication cost of any protocol without shared entanglement for compressing its messages. Anshu, Touchette, Yao, and Yu [3] had previously proven a similar separation when the compression protocol is allowed to use shared entanglement. However, their separation is exponential: they exhibited an interactive protocol for a Boolean function with quantum information cost that is exponentially smaller than the communication cost of any interactive quantum protocol that computes the function. (Observe that a protocol for compressing the final state of the original protocol may also be used to compute the function.) In contrast to that protocol, the one we present is compressible to its quantum information cost, but requires an arbitrarily larger amount of shared entanglement to do so.
In another related work, Liu, Perry, Zhu, Koh, and Aaronson [22] show that one-way protocols cannot be compressed to their quantum information cost without using shared entanglement. They consider a certain one-way protocol in which Alice gets an $n$-bit input, Bob gets an $m$-bit input, with $m \in o(n)$. The protocol has quantum information cost $O(n m^{-2} \log m)$. They show that the protocol cannot be compressed by a one-way protocol without shared entanglement into a message of length $o(\log n)$ with error at most $(n+1)^{-m}$. Thus the separation is limited, and only holds for exponentially small error (in the length of the inputs).
It is believed that the communication in any interactive quantum protocol which has a constant number of rounds and computes a function of classical inputs may be compressed, with constant error, to an amount proportional to the quantum information cost of the protocol. For one-way protocols, such a result was shown by Jain, Radhakrishnan, and Sen [18,19]. This was later re-proven by Anshu, Jain, Mukhopadhyay, Shayeghi, and Yao [2] using different techniques. A similar result for protocols with a larger constant number of rounds of communication was claimed by Touchette [28], but the proof has an error. The compression protocols achieving quantum information cost all rely on the presence of shared entanglement. Theorem 1.1 shows that even for the simplest protocols, such compression is not possible in the absence of shared entanglement. Moreover, it shows that the entanglement cost may be necessarily within an additive constant of the length of the message to be compressed, even when the quantum information cost is arbitrarily smaller than the message length.
In a recent independent work, Khanian and Winter [20] analyse the communication and entanglement costs of a variant of compression in the asymptotic setting. They study pure state ensembles with quantum side information in the form of pure states. In the case of visible compression with shared entanglement, they show that the asymptotic (per-instance) communication cost is at least $\frac{1}{2} S(\rho)$, i.e., half the entropy of the ensemble average state $\rho$. So this cost may be at most a factor of 1/2 smaller than that of compression without shared entanglement. Moreover, the asymptotic sum of communication and entanglement costs is at least the entropy $S(\rho)$. Thus the kind of separation we show does not hold for pure states even in the asymptotic setting.
Organization. The rest of this article is organized as follows. In Section 2, we review basic concepts and notation from quantum information and communication. In Section 3, we prove the main result and discuss its implications.
Acknowledgements. We thank Milán Mosonyi for extensive, thoughtful feedback on earlier versions of this article. This research is supported in part by NSERC Canada. SBH is also supported by an Ontario Graduate Scholarship.

Mathematical notation and background
We refer the reader to the book by Watrous [29] for a thorough introduction to the basics of quantum information. We briefly review the notation and some results that we use in the article.
For the sake of brevity, we denote the set $\{1, 2, \dots, k\}$ by $[k]$. We denote physical quantum systems ("registers") with capital letters, like $X$, $Y$ and $Z$. The state space corresponding to a register is a finite-dimensional Hilbert space. We denote (finite dimensional) Hilbert spaces either by capital script letters like $\mathcal{H}$ and $\mathcal{K}$, or as $\mathbb{C}^m$ where $m$ is the dimension. We denote the dimension of a Hilbert space corresponding to a register $X$ as $|X|$. We use the Dirac notation, i.e., "ket" and "bra", for unit vectors and their adjoints, respectively. We denote the set of all unit vectors in a Hilbert space $\mathcal{H}$ by $\mathrm{Sphere}(\mathcal{H})$. For a Hilbert space $\mathcal{H} := \mathbb{C}^S$ for some non-empty finite set $S$, we call $\{|x\rangle : x \in S\}$ its canonical basis.
A subset $N$ of $\mathrm{Sphere}(\mathcal{H})$ is called $\epsilon$-dense if for every vector $|u\rangle \in \mathrm{Sphere}(\mathcal{H})$, there exists a vector in the set $N$ at Euclidean distance at most $\epsilon$ from $|u\rangle$. Such a set is also called an "$\epsilon$-net" in the literature. Proposition 2.1 states that every finite dimensional Hilbert space has a relatively small $\epsilon$-dense set.

We denote the set of all linear operators on Hilbert space $\mathcal{H}$ by $\mathsf{L}(\mathcal{H})$, the set of all positive semi-definite operators by $\mathsf{Pos}(\mathcal{H})$, the set of all unitary operators by $\mathsf{U}(\mathcal{H})$, and the set of all quantum states (or "density operators") over $\mathcal{H}$ by $\mathsf{D}(\mathcal{H})$. The identity operator on $\mathcal{H}$ is denoted by $\mathbb{1}_{\mathcal{H}}$. We denote quantum states or sub-normalized states (positive semi-definite operators with trace at most 1) by lowercase Greek letters like $\rho$, $\sigma$. We use notation such as $\rho^X$ to indicate that register $X$ is in state $\rho$, and may omit the superscript when the register is clear from the context. An operator $M \in \mathsf{Pos}(\mathcal{H})$ is called a measurement operator if $M \le \mathbb{1}$. We usually denote quantum channels, i.e., completely positive trace-preserving linear maps from the space of linear operators on a Hilbert space to another such space, by capital Greek letters like $\Psi$. The partial trace over a Hilbert space $\mathcal{K}$ is denoted as $\mathrm{Tr}_{\mathcal{K}}$.
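We consider random unitary operators chosen according to the Haar measure $\eta$ on $\mathsf{U}(\mathcal{H})$, where $\mathcal{H}$ is a finite dimensional Hilbert space. The Haar measure is the unique unitarily invariant probability measure over $\mathsf{U}(\mathcal{H})$.
We consider functions $f : \mathsf{U}(\mathcal{H}) \to \mathbb{R}$ that are $\kappa$-Lipschitz with respect to the Frobenius norm, i.e., that satisfy $|f(U) - f(W)| \le \kappa\, \|U - W\|_{\mathrm{F}}$ for all $U, W \in \mathsf{U}(\mathcal{H})$, for some $\kappa \ge 0$. If $\kappa$ is small enough as compared to the dimension of $\mathcal{H}$, with high probability, the random variable $f(U)$ is close to its expectation, where $U \in \mathsf{U}(\mathcal{H})$ is a Haar-random unitary operator. This concentration of measure property is formalized by the following theorem, which is a special case of Theorem 5.17 in Ref. [25].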

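Theorem 2.2 ([25], Theorem 5.17, page 159). Let $\eta$ be the Haar measure on $\mathsf{U}(\mathcal{H})$, where $\mathcal{H}$ is a Hilbert space with finite dimension $m$, and let $U \in \mathsf{U}(\mathcal{H})$ be a random unitary operator chosen according to $\eta$. For every function $f : \mathsf{U}(\mathcal{H}) \to \mathbb{R}$ that is $\kappa$-Lipschitz with respect to the Frobenius norm (with $\kappa > 0$), and every positive real number $t$, we have

$$ \Pr\big[ f(U) \ge \mathbb{E}[f(U)] + t \big] \;\le\; \exp\!\left( - \frac{(m-2)\, t^2}{24\, \kappa^2} \right). $$

The fidelity between two sub-normalized states $\rho$ and $\sigma$ is defined as

$$ \mathrm{F}(\rho, \sigma) := \mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}} + \sqrt{(1 - \mathrm{Tr}\,\rho)(1 - \mathrm{Tr}\,\sigma)}. $$

Fidelity can be used to define a useful metric called the purified distance [12,27] between sub-normalized states:

$$ \mathrm{P}(\rho, \sigma) := \sqrt{1 - \mathrm{F}(\rho, \sigma)^2}. $$

For a quantum state $\rho \in \mathsf{D}(\mathcal{H})$ and $\epsilon \in [0,1]$, we define

$$ \mathsf{B}^{\epsilon}(\rho) := \{ \rho' \in \mathsf{Pos}(\mathcal{H}) : \mathrm{Tr}\,\rho' \le 1,\ \mathrm{P}(\rho, \rho') \le \epsilon \} $$

as the ball of sub-normalized states that are within purified distance $\epsilon$ of $\rho$.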
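The trace distance between quantum states is induced by the trace norm $\|\cdot\|_{\mathrm{tr}}$.

Proposition 2.4 (Fuchs and van de Graaf Inequalities [11]). For any pair of quantum states $\rho, \sigma \in \mathsf{D}(\mathcal{H})$,

$$ 1 - \mathrm{F}(\rho, \sigma) \;\le\; \frac{1}{2} \|\rho - \sigma\|_{\mathrm{tr}} \;\le\; \sqrt{1 - \mathrm{F}(\rho, \sigma)^2}. $$

Unless specified, we take the base of the logarithm function to be 2.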
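Let $\mathcal{H}$, $\mathcal{K}$, and $\mathcal{M}$ be the state spaces corresponding to registers $X$, $Y$, and $M$, respectively. For a register $X$ in quantum state $\rho \in \mathsf{D}(\mathcal{H})$, the von Neumann entropy of $X$ is defined as

$$ S(X)_\rho := -\mathrm{Tr}(\rho \log \rho). $$

This coincides with the Shannon entropy of the spectrum of $\rho$. The relative entropy of two quantum states $\rho, \sigma \in \mathsf{D}(\mathcal{H})$ is defined as

$$ S(\rho \| \sigma) := \mathrm{Tr}(\rho \log \rho - \rho \log \sigma) $$

when $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, and is $\infty$ otherwise. The max-relative entropy of $\rho$ with respect to $\sigma$ is defined as

$$ S_{\max}(\rho \| \sigma) := \min\{ \lambda : \rho \le 2^{\lambda} \sigma \} $$

when $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, and is $\infty$ otherwise. The min-entropy of $\rho$ is defined as

$$ S_{\min}(\rho) := -\log \|\rho\|, $$

where $\|\rho\|$ is the largest eigenvalue of $\rho$.

Suppose that the registers $X$, $Y$ are in joint state $\rho^{XY} \in \mathsf{D}(\mathcal{H} \otimes \mathcal{K})$. The mutual information of $X$ and $Y$ is defined as

$$ I(X : Y)_\rho := S(X)_\rho + S(Y)_\rho - S(XY)_\rho = S(\rho^{XY} \| \rho^X \otimes \rho^Y). $$

When the state is clear from the context, the subscript $\rho$ may be omitted from the notation. When $\rho$ is a classical-quantum state, i.e., $\rho^{XY} = \sum_x p_x\, |x\rangle\langle x|^X \otimes \rho_x^Y$ with $p$ being a probability distribution, $\{|x\rangle\}$ the canonical orthonormal basis for $\mathcal{H}$, and $\rho_x \in \mathsf{D}(\mathcal{K})$, we have

$$ I(X : Y) = \sum_x p_x\, S(\rho_x \| \rho^Y). $$

When $\rho^{XYM}$ is a tensor product of the states $\rho^{XM}$ and $\rho^Y$, we have

$$ I(X : MY) = I(X : M). $$

For any state $\rho^{XY} \in \mathsf{D}(\mathcal{H} \otimes \mathcal{K})$, the max-information register $Y$ has about register $X$ [7] is defined as

$$ I_{\max}(X : Y)_\rho := \min_{\sigma \in \mathsf{D}(\mathcal{K})} S_{\max}(\rho^{XY} \| \rho^X \otimes \sigma^Y). $$

For a parameter $\epsilon \in [0,1]$, the smooth max-information register $Y$ has about register $X$ is defined as

$$ I^{\epsilon}_{\max}(X : Y)_\rho := \min_{\rho' \in \mathsf{B}^{\epsilon}(\rho^{XY})} I_{\max}(X : Y)_{\rho'}. $$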
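The entropic quantities in this section are easy to evaluate when the two states commute, since each state is then described by its list of eigenvalues in a common eigenbasis. The following is a minimal sketch under that assumption only (it does not handle general non-commuting states). The function names are illustrative and not from the article, and the min-entropy is taken to be the negative logarithm of the largest eigenvalue, which is the standard definition.

```python
import math

def von_neumann_entropy(p):
    """S(rho) in bits, for a state with eigenvalue list p."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def support_contained(p, q):
    """True if supp(rho) is contained in supp(sigma), for commuting states."""
    return all(not (x > 0 and y == 0) for x, y in zip(p, q))

def relative_entropy(p, q):
    """S(rho || sigma) for commuting states with eigenvalue lists p and q."""
    if not support_contained(p, q):
        return math.inf
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def max_relative_entropy(p, q):
    """S_max(rho || sigma) = min{lambda : rho <= 2^lambda sigma}.
    For commuting states this is log2 of the largest eigenvalue ratio."""
    if not support_contained(p, q):
        return math.inf
    return math.log2(max(x / y for x, y in zip(p, q) if x > 0))

def min_entropy(p):
    """S_min(rho) = -log2 of the largest eigenvalue (assumed definition)."""
    return -math.log2(max(p))

# Illustrative states on C^4: rho is uniform on a 2-dimensional subspace,
# sigma is the completely mixed state.
rho = [0.5, 0.5, 0.0, 0.0]
sigma = [0.25, 0.25, 0.25, 0.25]

print(von_neumann_entropy(rho))
print(relative_entropy(rho, sigma))
print(max_relative_entropy(rho, sigma))
print(min_entropy(sigma))
```

For a state that is uniform on a subspace and a completely mixed reference state, as here, the relative entropy and the max-relative entropy coincide; this is the situation that arises for the ensemble studied in Section 3.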

Quantum communication protocols
We first describe a two-party quantum communication protocol informally and then give a formal definition for the special case of interest to us. We refer the reader to, e.g., Ref. [28] for a formal definition of the general case.
In a two-party quantum communication protocol, there are two parties, Alice and Bob, each of whom may get some input in registers designated for this purpose. Alice and Bob's inputs may be entangled with each other, and also with a "reference" system, which purifies it. Alice and Bob's goal is to accomplish an information processing task by communicating with each other.
Each party possesses some "work" (or "private") qubits (or registers) in addition to the input registers. The work qubits are initialized to a fixed pure state in tensor product with the input state. This fixed state may be entangled across the work registers of Alice and Bob, and may be used as a computational resource. In this case, we say the protocol or the channel is with shared entanglement or with entanglement assistance. If the fixed state is a tensor product state across Alice and Bob's registers, we say it is a protocol or channel without shared entanglement or simply unassisted.
The protocol proceeds in some number of "rounds". In each round, the sender applies an isometry to the qubits in her possession, and sends a sub-register (the message) to the other party. The length of the message (in qubits) is the base 2 logarithm of the dimension of the message register. After the last round, the recipient of the last message applies an isometry to his registers. The output of the protocol is the state of a pair of designated registers of the two parties at the end.
We are often interested in minimizing the total length of the messages over all the rounds, i.e., the communication cost (or complexity) of the protocol. The idea is to accomplish the task at hand with minimum communication. In protocols with shared entanglement, we are also interested in the amount of shared entanglement needed in the protocol, i.e., the minimum dimension of the support of the initial state of either party's work space. This latter quantity, measured in number of qubits, is called the entanglement cost of the protocol.
In this article, we study only one-way protocols, i.e., protocols with one round, and therefore one message, (say) from Alice to Bob. We describe these more formally here. We say that the input is "classical" when there are non-empty finite sets $S_A$, $S_B$ (the sets of classical inputs) such that the Hilbert spaces corresponding to the input registers are $\mathbb{C}^{S_A}$, $\mathbb{C}^{S_B}$, respectively, and the initial joint quantum state in the input registers $A_{\mathrm{in}} B_{\mathrm{in}}$ is diagonal in the canonical basis $\{|x\rangle|y\rangle : x \in S_A, y \in S_B\}$. In the case that the inputs to Alice and Bob are classical, we assume without loss of generality that the input registers $A_{\mathrm{in}}$ and $B_{\mathrm{in}}$ are "read-only", i.e., the isometries $U$ and $V$ are of the form

$$ U = \sum_{x \in S_A} |x\rangle\langle x|^{A_{\mathrm{in}}} \otimes U_x, \qquad V = \sum_{y \in S_B} |y\rangle\langle y|^{B_{\mathrm{in}}} \otimes V_y, $$

where $S_A$, $S_B$ are sets as above. A one-way protocol in which Alice gets a classical input and Bob does not have any input is depicted in Figure 1.
Let $\Pi$ be a one-way quantum protocol (with or without shared entanglement) with a single message from Alice to Bob, in which Alice gets a classical input and Bob does not have any input. The register $R$ with the referee purifies Alice's input so that $|\rho\rangle^{R A_{\mathrm{in}}} := \sum_{x \in S_A} \sqrt{p_x}\, |xx\rangle^{R A_{\mathrm{in}}}$, where $p$ is a probability distribution over the input set $S_A$. Let $M$ be the quantum register corresponding to the message in $\Pi$. The quantum information cost (or quantum information complexity) of the protocol $\Pi$ is defined as

$$ \mathrm{QIC}(\Pi) := \frac{1}{2}\, I(R : M \mid E_B), $$

where the registers are in the state immediately after Alice sends the message register $M$ to Bob. This expression simplifies to $\frac{1}{2} I(R : M E_B)$ as the registers $R$, $E_B$ are in a tensor product state at this point. It is intended to measure the information Bob gains about Alice's input from the message. This notion requires a nuanced definition for protocols with more general inputs and with multiple rounds of communication. As it is not central to our work, we refer the reader to Ref. [28] for the definition for general protocols.

Compression of quantum states
We study one-way protocols for non-oblivious or visible compression of quantum states, which is typical for tasks of this nature (see, e.g., Ref. [1]). The protocol may be with or without shared entanglement. Suppose we wish to compress states chosen from an ensemble $((p_x, \rho_x) : x \in S)$ for some finite set $S$, where $p$ is a probability distribution over $S$ and $\rho_x \in \mathsf{D}(\mathcal{H})$. The ensemble is known to both parties. The sender, say Alice, is given a classical input $x \in S$ chosen according to the distribution $p$. Alice and Bob execute a one-way protocol with a message from Alice to Bob in order to prepare an approximation of $\rho_x$ on Bob's side. Following the notation from Section 2.2, we interpret the state of the message register $M$ of this protocol as a compression of $\rho_x$. Suppose the state of the output register $B_{\mathrm{out}}$ is $\tilde{\rho}_x$. We say that the average error of the compression protocol is $\epsilon \in [0,2]$ if the output state $\tilde{\rho}_x$ is $\epsilon$-close in trace distance to the ideal state $\rho_x$ on average over the inputs $x$:

$$ \sum_x p_x\, \|\rho_x - \tilde{\rho}_x\|_{\mathrm{tr}} \le \epsilon. $$

It is sometimes desirable to express the error in terms of the purified distance. For simplicity, we state error bounds in terms of trace distance; we may express the bounds in terms of purified distance via Proposition 2.4.
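As a small numerical illustration of the average error defined above, the sketch below evaluates it for states that are all diagonal in one fixed basis, so that the trace norm of a difference reduces to the sum of absolute differences of the diagonal entries. This restriction, the function names, and the toy numbers are assumptions made for illustration; they are not taken from the article.

```python
import math

def trace_norm_distance(p, q):
    """||rho - sigma||_tr for states diagonal in a common basis.
    No factor of 1/2 is applied, so the value lies in [0, 2]."""
    return sum(abs(x - y) for x, y in zip(p, q))

def average_error(probs, ideal, actual):
    """sum_x p_x ||rho_x - rho~_x||_tr over the inputs x."""
    return sum(px * trace_norm_distance(r, s)
               for px, r, s in zip(probs, ideal, actual))

# Hypothetical two-state ensemble on C^2: the first output is slightly
# perturbed, the second is reproduced exactly.
probs = [0.5, 0.5]
ideal = [[1.0, 0.0], [0.0, 1.0]]
actual = [[0.9, 0.1], [0.0, 1.0]]

err = average_error(probs, ideal, actual)
print(err)
```

The unnormalized trace norm is used so that the error ranges over $[0,2]$, matching the range stated in the definition of average error.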
Note that a protocol for visible compression without shared entanglement may be characterized by a sequence of quantum states $(\sigma_x : x \in S)$ and a quantum channel $\Psi$. We let $\sigma_x$ be the state of the message register $M$ sent by Alice to Bob on input $x$. We define $\Psi$ as the channel resulting from the application of the isometry $V$ followed by the tracing out of the register $B_1$. The average error of the protocol is then $\sum_x p_x \|\rho_x - \Psi(\sigma_x)\|_{\mathrm{tr}}$. Conversely, any choice of states $(\sigma_x : x \in S, \sigma_x \in \mathsf{D}(\mathcal{K}))$ and quantum channel $\Psi : \mathsf{L}(\mathcal{K}) \to \mathsf{L}(\mathcal{H})$ for some Hilbert space $\mathcal{K}$ defines a valid visible compression protocol.
An essentially equivalent formulation of the task of visible compression is the following (with the notation from Section 2.2). Consider the state $\tau$ over the registers $R X A_1 C$:

$$ |\tau\rangle^{R X A_1 C} := \sum_x \sqrt{p_x}\; |x\rangle^R\, |x\rangle^X\, |\rho_x\rangle^{A_1 C}, $$

where $|\rho_x\rangle^{A_1 C}$ is a purification of $\rho_x$, register $R$ is held by the referee, and registers $X A_1 C$ together constitute Alice's input register $A_{\mathrm{in}}$. Alice and Bob both know the full description of $\tau$. Their goal is to run a one-way quantum communication protocol with a message from Alice to Bob, with or without shared entanglement, such that at the end, the state $\tilde{\tau}$ of registers $R B_{\mathrm{out}}$ is close to $\tau^{RC}$:

$$ \| \tilde{\tau}^{R B_{\mathrm{out}}} - \tau^{RC} \|_{\mathrm{tr}} \le \epsilon. $$

The difference from state splitting is that for a fixed state $|x\rangle$ of register $R$, the purification of the state in register $B_{\mathrm{out}}$ may be shared arbitrarily between Alice and Bob (while in state splitting, it is required to be held by Alice, in register $A_1$). A protocol for state splitting can thus be used for this task, and conversely lower bounds on communication or entanglement costs derived for the above task apply to state splitting as well.

The main result
In this section, we prove the main result of this article.

Two useful lemmas
We begin with two lemmas that we need for the result. The first allows us to focus on a finite number of subspaces of a finite dimensional Hilbert space, in the context of measurements. For an operator $M \in \mathsf{L}(\mathcal{H})$, and a subspace $A$ of $\mathcal{H}$, define the semi-norm

$$ \|M\|_A := \max_{|w\rangle \in \mathrm{Sphere}(A)} |\langle w| M |w\rangle|. $$
Lemma 3.1 ([17], Lemma 6). Let $d$ and $q$ be positive integers with $q \ge d$, $\delta > 0$ be a real number, and $\mathcal{H}$ be a $q$-dimensional Hilbert space. There exists a set $T$ of subspaces of $\mathcal{H}$ of dimension at most $d$ such that
1. $|T| \le (8\sqrt{d}/\delta)^{2dq}$, and
2. for every $d$-dimensional subspace $A \subseteq \mathcal{H}$, there is a subspace $B \in T$ such that for every measurement operator $M \in \mathsf{Pos}(\mathcal{H})$, $\big|\, \|M\|_A - \|M\|_B \,\big| \le \delta$.

The set $T$ in the lemma is obtained as follows. We fix an $\epsilon$-dense subset $S$ of $\mathrm{Sphere}(\mathcal{H})$ for a suitably small value of $\epsilon$, as given by Proposition 2.1. For any $d$-dimensional subspace $A$, we consider an orthonormal basis, and the $d$ vectors in $S$ closest to the respective elements in the basis. We include in $T$ the subspace $B$ spanned by the $d$ vectors from $S$ so obtained.
By a uniformly random subspace of dimension $\ell$ of an $m$-dimensional Hilbert space $\mathcal{H}$, with $\ell \le m$, we mean the image of a fixed $\ell$-dimensional subspace under a Haar-random unitary operator on $\mathcal{H}$. The next lemma is similar to Lemma 7 from Ref. [17], and is stronger in several respects. It enables the generalization of the incompressibility result in Ref. [17] that we prove, and helps us derive tighter bounds for compression. Informally, the lemma states that every state in a "small enough" subspace of a bipartite space has, with high probability, a small projection onto a "small enough" random subspace of one part.
Lemma 3.2. Let $m$, $d$, $\ell$, and $p$ be positive integers such that $\ell \le m$. Let $W$ be a fixed $d$-dimensional subspace of $\mathbb{C}^m \otimes \mathbb{C}^p$. Let $Z$ be a uniformly random subspace of $\mathbb{C}^m$ of dimension $\ell$, and $M$ be the orthogonal projection operator onto $Z$. Then for any real number $\alpha > 2$, there is a real number $\alpha_1 > 0$ that depends only on $\alpha$ such that

We may take $\alpha_1 := (\alpha - 2)^2/768$ in the above statement.
that is a α 2m -dense set of Sphere(W).
Note that for any two vectors |u , |v ∈ Sphere(C m ⊗ C p ), we have . By the Union Bound, we get Consider any fixed vector |v ∈ N and let P ∈ Pos(C m ) be a fixed orthogonal projection of rank .Consider the function f : U(C m ) → R defined as For any U, W ∈ U(C m ), we have Let U ∈ U(C m ) be a Haar-random unitary operation.The expectation of f (U ) is: Since U P U * and M have the same distribution, by Theorem 2.2 we get By Eq. (3.1), we get provided the m, , d, α satisfy the stated condition.

The ensemble and its compressibility
We study an ensemble of the same form as in Ref. [17]. For positive integers $n$, $m$, $k$ such that $k$ divides $m$ and $n$, let $B_i = (|b_{i1}\rangle, |b_{i2}\rangle, \dots, |b_{im}\rangle)$ be a suitably chosen orthonormal basis for $\mathbb{C}^m$, for each $i \in [n/k]$. Let $(B_{ij} : j \in [k])$ be a partition of $B_i$ into $k$ equal size sets. Define $\rho_{ij} := \frac{k}{m} \sum_{|v\rangle \in B_{ij}} |v\rangle\langle v|$. We show that there is a choice of bases such that the ensemble cannot be compressed significantly in the absence of shared entanglement. The following theorem, which we prove along the same lines as Theorem 5 in Ref. [17], contains the crux of the argument.
Proof: We use the Probabilistic Method to show the existence of an ensemble with the claimed property. We first derive a simpler property that suffices.
For i ∈ n k and j ∈ [k], let τ ij ∈ D(C m ) be m-dimensional quantum states and M ij be the orthogonal projection onto the support of τ ij .By Proposition 2.3, the condition

So we have
and Eq.(3.4) is equivalent to For a fixed unitary operator U , for any i, j, the state where X := U (A⊗| 0 ) is a fixed d-dimensional subspace of A⊗B ⊗C.Thus, the expression on the left in Eq. (3.5) is bounded by M ij ⊗ 1 A⊗B X for every i, j.So it suffices to exhibit an ensemble such that for all d-dimensional subspaces W ⊆ A ⊗ B ⊗ C, By Lemma 3.1, for any ν > 0, there is a collection T of subspaces of A ⊗ B ⊗ C of dimension at most d, such that size |T| ≤ (8 √ d/ν) 2d 2 m 2 , and for all subspaces W as above, there is a subspace Y ∈ T such that for all i, j, Taking ν < 1 − 2 , it suffices to produce an ensemble such that for all subspaces Y ∈ T, We pick bases B i independently and uniformly at random, i.e., for each i, independently pick a Haar-random unitary operator on C m , and let B i be the basis defined by its columns.Partition B i into k sets (B ij : j ∈ [k]) of equal size.We then define an ensemble of the form in Eq. (3.2) with ρ ij := k m |v ∈B ij |v v|, and the corresponding projection operators M ij := |v ∈B ij |v v|.We show that with non-zero probability, the operators M ij satisfy Eq. (3.6) for all Y ∈ T, by bounding the probability of the complementary event.
Suppose the operators $M_{ij}$ do not satisfy Eq. (3.6) for some subspace $Y \in T$. Then

Equivalently, there are at least

In particular, there are at least $(1 - \beta)n/k$ indices $i$ such that there is at least one

For convenience, by $E_i(Y)$ we denote the event that there is some $j \in$

and by $I(Y)$, we denote the subset of indices $i \in [n/k]$ such that $E_i(Y)$ occurs.
Let $q := (1 - \beta)\frac{n}{k}$. By the above reasoning, it suffices to bound the probability that for some subspace $Y \in T$, the subset $I(Y)$ has at least $q$ indices. By Lemma 3.2, for a fixed subspace $Y$ and pair $i, j$,

So by the Union Bound

and by the Union Bound and the independence of $M_{ij}$ for distinct indices $i$,

Finally, we get

when $m > \max\left\{ \frac{3}{\gamma} \ln \frac{e}{1 - \beta}, \; \frac{3}{\gamma} \ln k \right\}$, and

This proves the theorem.
Note that the above proof considers an arbitrary choice of states $\sigma_{ij}$ and quantum channel $\Psi$ after the ensemble is chosen randomly. Together, the sequence $(\sigma_{ij})$ and the channel $\Psi$ constitute a compression protocol. The proof shows that no matter how $(\sigma_{ij})$ and $\Psi$ are chosen, the error due to the corresponding compression protocol is large if the dimension $d$ is much smaller than $m$ (provided $n$ is chosen properly).
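The random construction used in the proof can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes a real orthonormal basis obtained by Gram-Schmidt on Gaussian vectors as a stand-in for the columns of a Haar-random unitary, and the helper names (`random_orthonormal_basis`, `build_ensemble`, `average_state`) are hypothetical. It builds the states $\rho_{ij} = \frac{k}{m} M_{ij}$ and the ensemble average.

```python
import math
import random


def random_orthonormal_basis(m, rng):
    """Gram-Schmidt on Gaussian vectors: a random real orthonormal basis of R^m.

    Assumption: a real orthogonal basis replaces the Haar-random unitary of
    the proof, only to keep this sketch free of external dependencies.
    """
    basis = []
    while len(basis) < m:
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:  # skip the (measure-zero) degenerate draw
            basis.append([x / norm for x in v])
    return basis


def projector(vectors, m):
    """Sum of |v><v| over the given vectors, as an m x m matrix."""
    return [[sum(v[r] * v[c] for v in vectors) for c in range(m)]
            for r in range(m)]


def build_ensemble(n, m, k, seed=0):
    """States rho_ij = (k/m) * M_ij for i in [n/k], j in [k]."""
    assert m % k == 0 and n % k == 0
    rng = random.Random(seed)
    block_size = m // k
    states = {}
    for i in range(n // k):
        basis = random_orthonormal_basis(m, rng)  # the basis B_i
        for j in range(k):
            block = basis[j * block_size:(j + 1) * block_size]  # the set B_ij
            m_ij = projector(block, m)
            states[(i, j)] = [[(k / m) * x for x in row] for row in m_ij]
    return states


def average_state(states, m):
    """Uniform average of the ensemble states."""
    n = len(states)
    return [[sum(rho[r][c] for rho in states.values()) / n for c in range(m)]
            for r in range(m)]


states = build_ensemble(n=6, m=4, k=2)
rho_avg = average_state(states, 4)
```

Each state has unit trace, and since the projections $M_{ij}$ for fixed $i$ sum to the identity, the average is the completely mixed state, whichever bases are drawn.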

Application to entanglement cost
Consider a one-way protocol $\Pi$ in which, with probability $1/n$, Alice gets input $(i, j)$, prepares the state $\rho_{ij}$ as in an ensemble given by Theorem 3.3, and sends it to Bob. The ensemble average $\rho$ is the completely mixed state $\frac{\mathbb{1}}{m}$ over $\mathbb{C}^m$. By construction, we have $S(\rho_{ij} \| \rho) = \log k$, and therefore $\mathrm{QIC}(\Pi) = \frac{1}{2} \log k$. In fact, we have $S_{\max}(\rho_{ij} \| \rho) = \log k$. Theorem I.1(1) of Ref. [4] gives us a protocol for the visible compression of any such ensemble of states using classical communication and shared entanglement, with error $\epsilon$. The communication cost of this protocol is

This bound is an additive term of $O(\log\log\frac{1}{\epsilon})$ more than $\mathrm{QIC}(\Pi)$. Theorem I.1(1) in Ref. [4] also gives a lower bound of $\frac{1}{2} I^{\sqrt{\epsilon}}_{\max}(A : B)_\tau$ on the communication cost, which is at least $\frac{1}{2} \log k - 2$ for $\epsilon \le 1/81$ (see Proposition A.1 in the appendix). So for constant $\epsilon$, the upper bound in Proposition 3.4 is close to optimal as a function of $k$. It is slightly better than those obtained from protocols for state splitting (see, e.g., Ref. [1, Corollary 5]), which have an additive term of order $\log\frac{1}{\epsilon}$. However, the protocol from Ref. [4] has entanglement cost of order $k (\log\frac{1}{\epsilon}) \log m$, which is exponential in the communication cost, while the protocol for state splitting with the least known communication cost [1, Corollary 5] has entanglement cost of order $(1 + 1/\epsilon^2) \log(m/\epsilon)$.
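The two relative-entropy values quoted above can be checked directly from the form of the ensemble. The derivation below is a sketch that assumes the standard definitions $S(\omega \| \rho) = \mathrm{Tr}\,\omega(\log\omega - \log\rho)$ and $S_{\max}(\omega \| \rho) = \log\min\{\lambda : \omega \le \lambda\rho\}$, which are not restated in this section, and writes $M_{ij}$ for the projection onto the span of $B_{ij}$.

```latex
\begin{align*}
\rho_{ij} &= \tfrac{k}{m}\, M_{ij}, \qquad
\rho = \tfrac{1}{m}\,\mathbb{1}, \qquad
\operatorname{Tr} M_{ij} = \tfrac{m}{k}, \\
S(\rho_{ij} \,\|\, \rho)
  &= \operatorname{Tr}\rho_{ij}\log\rho_{ij} - \operatorname{Tr}\rho_{ij}\log\rho
   = \log\tfrac{k}{m} - \log\tfrac{1}{m} = \log k, \\
\rho_{ij} \le \lambda\rho
  &\iff \tfrac{k}{m}\, M_{ij} \le \tfrac{\lambda}{m}\,\mathbb{1}
   \iff \lambda \ge k,
  \quad\text{so}\quad S_{\max}(\rho_{ij} \,\|\, \rho) = \log k .
\end{align*}
```

Both quantities coincide here because each $\rho_{ij}$ is uniform on its support and $\rho$ is completely mixed.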
Next we consider how small the entanglement cost of the visible compression of an ensemble $(\rho_{ij})$ given by Theorem 3.3 may be. By choosing the parameters in the statement of Theorem 3.3 appropriately, we get the following lower bound on the sum of communication and entanglement costs of any compression protocol.

Corollary 3.5.
There exist universal constants $c_1, c_2, c_3 > 0$ such that for any $\epsilon \in (0, 1)$ and any positive integers $k, m, n$ with $m$ and $n$ divisible by $k$, there is an ensemble of $n$ equally likely quantum states in $D(\mathbb{C}^m)$ of the form in Eq. (3.2) such that for any (one-shot) one-way protocol for compressing the states with average error at most $\epsilon^2$, the sum of the communication and entanglement costs is at least

In particular, the entanglement cost of any such protocol with optimal communication cost is at least

and the communication cost of any such protocol without entanglement is at least the bound given in Eq. (3.8).
We defer the proof of this corollary to the appendix.
Note that the parameter $m$ may be chosen arbitrarily larger than $k$, provided the number of states $n$ in the ensemble is chosen large enough. Thus, we see that there are ensembles with $m$-dimensional states for which communication-optimal compression protocols with shared entanglement and with constant average error, say $1/4$, have entanglement cost almost as large as $\log m$. In particular, the number of qubits of shared entanglement needed may be arbitrarily larger than the quantum information cost of the original protocol. We also see that in the absence of shared entanglement, there are ensembles with $m$-dimensional states that cannot be compressed to states with dimension smaller than $cm$ with average error less than $1/4$, where $c$ is a universal positive constant. In particular, the optimally compressed message may be arbitrarily longer than the quantum information cost of the protocol $\Pi$.
Corollary 3.5 shows that the number of qubits of shared entanglement used by the protocol with the smallest known communication cost, due to Anshu and Jain [1, Corollary 5], is optimal up to a constant multiplicative factor and an additive $\log k$ term (for constant error in compression). The lower bound on entanglement cost given in the corollary may be achieved by protocols derived from those for state splitting, up to an additive term of $\frac{1}{2} \log k + O(1)$, again for constant error (see, e.g., Ref. [7, Lemma 3.3]). However, the communication cost of these protocols may not be optimal.
The probabilistic construction in the results above gives us ensembles with a number of states $n$ that is polynomial in $m$ and $k$. Note that in the compression protocol $\Pi$, Alice may send the input $(i, j)$ as her message, in which case the message register has dimension $n$. Similarly, she may send the state $\rho_{ij}$ itself, and this has dimension $m$. So in order to study how much compression is truly possible (i.e., how much smaller the dimension of the message register may be as compared with $m$), we have to study ensembles with $n \ge m$ states, and compression protocols with message registers of dimension at most $m$. Further, consider any protocol $\Gamma$ (similar to $\Pi$) in which Alice receives a random input $x$ out of $n$ possibilities according to some distribution, prepares a state $\omega_x$, and sends it to Bob. The quantum information cost of such a protocol $\Gamma$ is at most $\frac{1}{2} \log n$. So the polynomial dependence of $n$ on the dimension of the states in the ensemble ($m$ in the construction above) and the exponential dependence of $n$ on the quantum information cost of the corresponding protocol ($\frac{1}{2} \log k$ in the construction) are inevitable.

Concluding remarks
In this article, we revisited one-shot compression of an ensemble of quantum states. We proved that there are ensembles which cannot be compressed by more than a few qubits in the absence of shared entanglement, when allowed constant error. In the presence of shared entanglement, the ensemble can be compressed to many fewer qubits. However, the entanglement cost may not be smaller than the number of qubits being compressed by more than a constant, for constant error. Since we study compression protocols that are allowed to make some error, the bounds we establish are robust to perturbations to the shared entangled state that are sufficiently small relative to the error.
Entanglement and quantum communication are distinct resources in the context of information processing. Sharing entanglement involves the generation, distribution, and storage of a state that is independent of the input for the task at hand. Communication also involves the same steps, but may be dynamic, i.e., may depend on the input and the prior history of the communication protocol. Consequently, any physical implementation of these resources is likely to incur different costs for these steps. In this work, we focused on the cost of distributing quantum states, and as a first stab, assumed that the cost of distribution for shared entanglement or for communication is proportional to the number of qubits involved. Formally, this corresponds to the notion of smooth 0-Rényi entropy. The motivation for this focus comes largely from the area of communication complexity [21], in which the interaction between multiple processors takes centre stage, but shared entanglement is often taken for granted. Our result shows that entanglement plays a crucial role in important communication tasks and highlights the need for considering entanglement cost in addition to communication cost.
A question of interest, from a theoretical perspective, is the degree or strength of entanglement required for different information processing tasks. Several different measures of entanglement have been studied in the literature, depending on the context. Smooth 0-Rényi entropy is a very coarse measure in this respect, as it may be the same for states that are regarded as having widely different degrees of entanglement. A natural question is whether results such as the ones we derived also hold for other definitions of entanglement cost that capture the degree of entanglement more satisfactorily. We conjecture that analogous results hold also for other measures, and leave this to future work.
Many other questions surrounding compression remain open. For instance, we do not have tight characterizations for the communication and entanglement costs of one-shot state re-distribution. Even less is known for the one-shot compression of interactive quantum protocols. Progress on these questions might hold the key to resolving important questions in communication complexity as well.

If $q_{ij} > 3/(2n)$ or $q_{ij} < 1/(2n)$, we have $|q_{ij} - p_{ij}| > 1/(2n)$. So for at least $(1 - 2\xi)n$ pairs $(i, j)$, we have $1/(2n) \le q_{ij} \le 3/(2n)$, and we call such pairs $(i, j)$ typical.
Eq. (A.1) may be written as $\sum_{ij} \| q_{ij} \rho_{ij} - p_{ij} \rho_{ij} \|_{\mathrm{tr}} \le \xi$, so, by monotonicity of trace distance,

where $B_{ij}$ is as in the definition of the ensemble $(\rho_{ij})$. In particular,

There are at least $(1 - 2\xi)n/k$ indices $i \in [n/k]$ such that there is a typical pair $(i, j)$ for some $j \in [k]$. Let $S$ be the set of such indices $i$. Let $\eta \in (0, 1)$. If for all indices $i \in S$, there are fewer than $(1 - \eta)m$ pairs $(j, v)$ with $(i, j)$ typical, $|v\rangle \in B_{ij}$, and

then we would have

Taking $\eta := 2\xi/(1 - 2\xi)$, we see that this is in contradiction with Eq. (A.2). So there is an index $i \in S$ such that there are at least $(1 - \eta)m$ pairs $(j, v)$ with $j \in [k]$ and $|v\rangle \in B_{ij}$ such that $(i, j)$ is typical, and $(i, j, v)$ satisfy Eq. (A.3). Denote such an index $i$ by $i_0$, and let $T := \{ (j, v) : j \in [k],\ |v\rangle \in B_{i_0 j},\ (i_0, j) \text{ typical},\ (i_0, j, v) \text{ satisfy Eq. (A.3)} \}$.


Proposition 2.1 ([24], Lemma 13.1.1, Chapter 13). Let $\epsilon \in (0, 1]$, and $m$ be a positive integer. The Hilbert space $\mathbb{C}^m$ has an $\epsilon$-dense set $N$ of size $|N| \le (4/\epsilon)^{2m}$. A slightly better bound $(1 + 2/\epsilon)^{2m}$ on the size of an $\epsilon$-dense set is given in Ref. [26, Lemma 2.6].

We denote the operator norm (Schatten $\infty$ norm) of an operator $M \in L(H)$ by $\|M\|$, the Frobenius norm (Schatten 2 norm) by $\|M\|_F$, and the trace norm (Schatten 1 norm) by $\|M\|_{\mathrm{tr}}$. Recall that $\|M\|_{\mathrm{tr}} := \mathrm{Tr}\sqrt{M^* M}$ is the sum of the singular values of $M$, $\|M\|$ is the largest singular value, and $\|M\|_F := \sqrt{\mathrm{Tr}(M^* M)}$ is the $\ell_2$-norm of the singular values with multiplicity. All of these norms are invariant under composition with a unitary operator.
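The three norms above are all functions of the singular values, which a small example makes concrete. This is a minimal sketch restricted to real $2 \times 2$ matrices so that the singular values have a closed form; the function names are hypothetical, and the restriction to $2 \times 2$ is a simplifying assumption, not part of the text.

```python
import math


def singular_values_2x2(M):
    """Singular values (largest first) of a real 2x2 matrix M."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d   # Tr(M^T M) = s1^2 + s2^2
    det = a * d - b * c                 # |det M| = s1 * s2
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    s1 = math.sqrt((t + disc) / 2.0)
    s2 = math.sqrt(max((t - disc) / 2.0, 0.0))
    return s1, s2


def schatten_norms(M):
    """Operator, Frobenius and trace norms computed from the singular values."""
    s1, s2 = singular_values_2x2(M)
    return {
        "operator": s1,                              # largest singular value
        "frobenius": math.sqrt(s1 * s1 + s2 * s2),   # l2-norm of singular values
        "trace": s1 + s2,                            # sum of singular values
    }


def rotate(M, theta):
    """Left-multiply M by a rotation, i.e. compose with a (real) unitary."""
    (a, b), (c, d) = M
    co, si = math.cos(theta), math.sin(theta)
    return [[co * a - si * c, co * b - si * d],
            [si * a + co * c, si * b + co * d]]


norms = schatten_norms([[3.0, 0.0], [0.0, 4.0]])
rotated_norms = schatten_norms(rotate([[3.0, 0.0], [0.0, 4.0]], 0.7))
```

For the diagonal matrix with entries 3 and 4 the norms are 4, 5 and 7, and they are unchanged after composing with the rotation, in line with unitary invariance.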

$\sum_x p_x\, S(\rho_x \| \rho)$, where $\rho = \sum_x p_x \rho_x$. Suppose the registers $X, Y, M$ are in joint (tripartite) state $\rho^{XYM} \in D(H \otimes K \otimes M)$. The conditional mutual information of $X$ and $M$ given $Y$ is defined as $I(X : M \mid Y) := I(XY : M) - I(Y : M)$.
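The definition of conditional mutual information can be illustrated in the classical special case, where the joint state is diagonal and all entropies reduce to Shannon entropies. The sketch below is an assumption-laden illustration (classical registers only, hypothetical function names), not a general quantum computation.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {outcome: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, keep):
    """Marginal of a joint distribution on the register positions in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def mutual_information(joint, a, b):
    """I(A:B) = H(A) + H(B) - H(AB); a and b are tuples of register positions."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))


def conditional_mutual_information(joint, x, m, y):
    """I(X:M|Y) := I(XY:M) - I(Y:M), as in the definition above."""
    return mutual_information(joint, x + y, m) - mutual_information(joint, y, m)


# X and Y are independent uniform bits and M = X xor Y; outcomes are (x, y, m).
joint = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
cmi = conditional_mutual_information(joint, x=(0,), m=(2,), y=(1,))
unconditional = mutual_information(joint, (0,), (2,))
```

In this example $M$ alone reveals nothing about $X$, but given $Y$ it determines $X$ completely, so the conditional mutual information is one bit while the unconditional one is zero.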

Consider the following Stinespring representation [29, Corollary 2.27, Sec. 2.2] of the quantum channel $\Psi : L(\mathbb{C}^d) \to L(\mathbb{C}^m)$ in terms of a unitary operation $U \in U(A \otimes B \otimes C)$ and a fixed pure state $|0\rangle \in B \otimes C$, with $A = \mathbb{C}^d$, $B = C = \mathbb{C}^m$:

$I^{\epsilon/\sqrt{2}}_{\max}(A : B)_\tau + O(\log\log(1/\epsilon))$, where $\tau^{AB} := \frac{1}{n} \sum_{ij} |ij\rangle\langle ij|^A \otimes \rho^B_{ij}$ and we have used Proposition 2.4 to translate between purified and trace distance. This expression is bounded from above by $\log k + O(\log\log\frac{1}{\epsilon})$, since $S_{\max}(\rho_{ij} \| \rho)$ (and therefore $I_{\max}(A : B)_\tau$) equals $\log k$. Using superdense coding [29, Section 6.3.1], we get a bound on the quantum communication cost of compressing the ensemble with entanglement assistance.

Proposition 3.4. For any positive integers $k, m, n$ such that $k$ divides $m$ and $n$, and error parameter $\epsilon > 0$, for any ensemble of $n$ equally likely quantum states in $D(\mathbb{C}^m)$ of the form in Eq. (3.2), there is a one-shot one-way protocol with shared entanglement for compressing the states with quantum communication at most $\frac{1}{2} \log k + O(\log\log\frac{1}{\epsilon})$, with average error at most $\epsilon$ in trace distance.
Figure 1: A one-message protocol for compression of quantum states, with shared entanglement. The register $A_{\mathrm{in}}$ holds the input given to Alice, and $E_A$ contains Alice's workspace and her part of the initial shared state (the shared entanglement). The register $E_B$ contains Bob's workspace and his part of the initial shared state. The compression is implemented by the isometry $U$, and the register $M$ contains the compressed state and is sent as the message. The decompression is implemented by the isometry $V$. Bob's output is contained in the register $B_{\mathrm{out}}$.
Proposition 2.3 ([14]). For any pair of quantum states $\rho, \sigma \in D(H)$, $\|\rho - \sigma\|_{\mathrm{tr}} = 2 \max \{ |\mathrm{Tr}(M\rho) - \mathrm{Tr}(M\sigma)| : M \text{ is a measurement operator on } H \}$.

Purified distance and trace distance are related to each other as follows (see, e.g., Ref. [29, Theorem 3.33, page 161]):

Alice and Bob initially hold registers $A_{\mathrm{in}} E_A$ and $B_{\mathrm{in}} E_B$, respectively. The input registers $A_{\mathrm{in}} B_{\mathrm{in}}$ are initialized to some state $\rho^{A_{\mathrm{in}} B_{\mathrm{in}}}$ whose purification is held in register $R$ with a third party, the referee. Alice and Bob's work registers $E_A$ and $E_B$ are initialized to a pure state $|\phi\rangle^{E_A E_B}$, which may be entangled across the partition $E_A E_B$. The local operations in the protocol are specified by two isometries $U$ and $V$. The isometry $U$ acts on registers $A_{\mathrm{in}} E_A$ and maps them to registers $A_{\mathrm{out}} A_1 M$. The isometry $V$ acts on registers $B_{\mathrm{in}} E_B M$ and maps them to registers $B_1 B_{\mathrm{out}}$. First, Alice applies $U$ to the registers $A_{\mathrm{in}}$ and $E_A$ and sends the register $M$ to Bob. Then, Bob applies $V$ to his initial registers $B_{\mathrm{in}} E_B$ and the message $M$. The output of the protocol is the state of Alice and Bob's registers $A_{\mathrm{out}} B_{\mathrm{out}}$.

The communication cost of this protocol is $\log |M|$ and the entanglement cost is the logarithm of the Schmidt rank of the state $|\phi\rangle$ across the partition $E_A E_B$. We say it is a protocol with shared entanglement if the Schmidt rank of $|\phi\rangle$ is more than 1, and say that it is without shared entanglement otherwise. Such protocols are also called entanglement-assisted and unassisted, respectively, in the literature.
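The two cost measures just defined are easy to compute for small examples. The sketch below assumes that the shared state is given by its coefficient matrix $c_{ab}$ in $|\phi\rangle = \sum_{a,b} c_{ab} |a\rangle|b\rangle$, whose matrix rank equals the Schmidt rank; the function names and the numerical tolerance are illustrative choices, not part of the text.

```python
import math


def matrix_rank(rows, tol=1e-9):
    """Rank of a complex matrix by Gaussian elimination with partial pivoting."""
    A = [[complex(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < tol:
            continue  # no usable pivot in this column
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, n_rows):
            factor = A[r][col] / A[rank][col]
            A[r] = [x - factor * y for x, y in zip(A[r], A[rank])]
        rank += 1
    return rank


def entanglement_cost(coefficients):
    """log2 of the Schmidt rank of the shared pure state across E_A : E_B."""
    return math.log2(matrix_rank(coefficients))


def communication_cost(message_dimension):
    """log2 of the dimension of the message register M, in qubits."""
    return math.log2(message_dimension)


s = 1.0 / math.sqrt(2.0)
bell_state = [[s, 0.0], [0.0, s]]       # (|00> + |11>) / sqrt(2)
product_state = [[0.5, 0.5], [0.5, 0.5]]  # |+>|+>, no entanglement

bell_cost = entanglement_cost(bell_state)
product_cost = entanglement_cost(product_state)
```

A protocol whose shared state is the product state has Schmidt rank 1 and is therefore unassisted in the sense above, while a single shared Bell pair contributes one qubit of entanglement cost.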