The Round Complexity of Local Operations and Classical Communication (LOCC) in Random-Party Entanglement Distillation

A powerful operational paradigm for distributed quantum information processing involves manipulating pre-shared entanglement by local operations and classical communication (LOCC). The LOCC round complexity of a given task describes how many rounds of classical communication are needed to complete the task. Despite some results separating one-round versus two-round protocols, very little is known about higher round complexities. In this paper, we revisit the task of one-shot random-party entanglement distillation as a way to highlight some interesting features of LOCC round complexity. We first show that for random-party distillation in three qubits, the number of communication rounds needed in an optimal protocol depends on the entanglement measure used; for the same fixed state some entanglement measures need only two rounds to maximize whereas others need an unbounded number of rounds. In doing so, we construct a family of LOCC instruments that require an unbounded number of rounds to implement. We then prove explicit tight lower bounds on the LOCC round number as a function of distillation success probability. Our calculations show that the original W-state random distillation protocol by Fortescue and Lo is essentially optimal in terms of round complexity.


Introduction
Quantum entanglement is an essential ingredient for realizing the full capabilities of distributed quantum information processing. To understand the resource character of entanglement, one typically considers the "distant-lab" paradigm in which spatially separated laboratories have access to different parts of a globally entangled state, but they can only process it using local operations and classical communication (LOCC) [40,25]. In this scenario, parties take turns performing a local quantum measurement, announcing the outcome, and choosing a new local measurement for the next round based on the global history of previous results. Every LOCC protocol then has a tree-like structure with each branch in the tree representing a different sequence of measurement outcomes. Unfortunately, for many tasks such as entanglement distillation [4,20] or state discrimination [39,3,28,5,10], precise bounds on LOCC capabilities are hard to prove. One reason for this difficulty is that a potentially unbounded number of classical communication exchanges is allowed in an LOCC protocol.
The largely unexplored topic of LOCC round complexity studies how many rounds of classical communication are needed to perform some distributed quantum information task. We will say that an LOCC protocol has r rounds if there is at least one branch having a sequence of local measurements alternating r times between different parties (see Section 5.1 for a slightly more general definition). Note that each local measurement might involve a combination of measurements all performed in the same laboratory. Most results on LOCC round complexity involve quantum state discrimination problems and focus just on separating one-round versus two-round success rates [38,34,10,11,17,43,50,49]. Much more challenging is proving that a given task requires more than two rounds of LOCC to complete, and only a few results of this form are known. Wakakuwa et al. have proven separation results between two and three rounds for the task of entanglement-assisted nonlocal gate implementation [45,46]. For higher rounds, Wang and Duan have constructed families of bipartite quantum states of increasing dimension that require more rounds to perfectly discriminate as the dimension grows [48]. A similar finding has been shown for the convertibility of certain quantum (resp. classical) mixed states [12]. On the other hand, some round compression results are known. For example, in the task of LOCC state conversion, any bipartite protocol can always be reduced to one-way LOCC provided the initial state is pure, regardless of the system's dimensions [32,35,44]. In another direction, Cohen has shown that certain LOCC tasks of high round complexity can be approximated arbitrarily well using one-way LOCC [15]. Complementing the work on round or "depth" compression, it has recently been shown that the "width" of an arbitrary LOCC state discrimination protocol can be compressed to a standard form [30]. One could in general study the interplay between depth and width complexities in different LOCC tasks.
The purpose of this work is to further advance the study of LOCC round complexity. To exemplify certain new facts about round complexity, we focus on the specific task of random-party entanglement distillation in three-qubit W-class states [22], which has already proven to be a fruitful problem for demonstrating interesting properties of LOCC [18,7,9,8]. A W-class state $|\Psi_W\rangle$ is one that can be obtained from the canonical W state $|W\rangle = \frac{1}{\sqrt{3}}(|100\rangle + |010\rangle + |001\rangle)$ by stochastic LOCC (SLOCC) [21]. A random-party EPR distillation protocol transforms a tripartite W-class state $|\Psi_W\rangle$ into a bipartite EPR state $|\Psi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$ with the target pair unspecified; any branch in the protocol is deemed a success branch provided $|\Psi^+\rangle$ is obtained between some pair of parties at the end of the branch. Fortescue and Lo devised a family of LOCC protocols with increasing round number (details given below) that complete this task with success probabilities approaching one. The limit of such protocols is then some map that distills $|W\rangle$ into random-party EPR pairs with probability one. However, it has been shown [9] that this map, or any map achieving unit success probability, is not implementable by LOCC. Consequently, LOCC constitutes a class of quantum operations that is not closed. This argument was later extended in Ref. [13] to show that the set of bipartite LOCC maps is likewise not topologically closed. Related topological properties have been reported in Refs. [6,15,16], and we also note that an experimental demonstration of random distillation has recently been performed using a three-photon W-state [31].
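Variations to the random-party EPR distillation task can also be considered. For example, one can demand that the final states obtained in each success branch be just some entangled bipartite state $|\varphi\rangle$, and not necessarily a maximally entangled one [23,7]. If E is some measure of bipartite entanglement, the goal then is to maximize the average final value of E across all success branches. More precisely, we define
$$E^{(\mathrm{rnd})}(\Psi_W) = \sup \sum_i p_i E(\varphi_i), \qquad (1)$$
where the supremum is taken over all finite-round LOCC transformations that convert $|\Psi_W\rangle$ into some bipartite state $|\varphi_i\rangle$ with probability $p_i$. A convenient entanglement measure for bipartite systems is the concurrence [47], which for a pure state $|\varphi\rangle^{XY}$ is defined as $C(\varphi) = \sqrt{2(1 - \mathrm{Tr}[\rho_X^2])}$, where $\rho_X = \mathrm{Tr}_Y |\varphi\rangle\langle\varphi|$. A second measure is $E_2(\varphi) = 2\lambda_{\min}(\rho_X)$, twice the smallest eigenvalue of $\rho_X$, which for two-qubit pure states is the optimal probability of converting $|\varphi\rangle$ into an EPR state by LOCC. The quantity $E_2^{(\mathrm{rnd})}(\Psi_W)$ is equal to the supremum probability of obtaining an EPR pair between any two parties starting from $|\Psi_W\rangle$, and so the optimization in Eq. (1) can be restricted to LOCC transformations with only EPR states being the target, i.e. the original task of random-party EPR distillation.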
A second variation to the random-party EPR distillation task is to fix one of the parties ($\star$) and only deem the transformation a success if party $\star$ is entangled in the end. We refer to this as $\star$-random-party distillation, as opposed to the total random-party distillation described above. For a fixed party $\star \in \{A, B, C\}$, the optimal average bipartite entanglement for party $\star$ is
$$E^{(\star\text{-rnd})}(\Psi_W) = \sup \sum_i p_i E(\varphi_i),$$
where the supremum is taken over all LOCC transformations in which each state $|\varphi_i\rangle$ is bipartite entangled between party $\star$ and some other party in $\{A, B, C\} \setminus \{\star\}$. Restricted random-party distillation has been partially studied for the concurrence and $E_2$ entanglement measures in Refs. [13] and [8], respectively.
In terms of round complexity, one of our main results is that $E_2^{(\star\text{-rnd})}(\Psi_W)$ is achievable in finite rounds of LOCC for any tripartite W-state, whereas $C^{(\star\text{-rnd})}(\Psi_W)$ is not. Hence, the number of rounds needed in an optimal entanglement distillation protocol depends on the type of entanglement measure considered. What makes this result particularly surprising is that the concurrence and $E_2$ measures are in one-to-one correspondence: for two-qubit pure states both are strictly increasing functions of the smallest Schmidt coefficient, so that $C(\varphi) = \sqrt{E_2(\varphi)\,(2 - E_2(\varphi))}$. Our second main result is establishing a tight lower bound on the number of LOCC rounds needed to achieve a random-party EPR distillation of $|W\rangle$ with probability $> 1 - \delta$, for any $\delta > 0$.
To our knowledge, this is the first time any type of trade-off has been obtained between round complexity and success probability of a given distributed quantum information processing task. This result also shows that the original Fortescue-Lo protocol is essentially optimal for random-party distillation of $|W\rangle$, and we prove optimality of more general types of transformations by enlarging the class of LOCC to encompass all operations that completely preserve positivity of the partial transpose (PPT) [41]. As a corollary of our work, we introduce a new family of quantum instruments that lie on the boundary of LOCC but not inside LOCC itself [13].

The Structure of W-class States
In this paper, we refer to the W-class as the collection of three-qubit states that can be obtained from the canonical W state $|W\rangle$ by SLOCC. Up to a local change in basis, every W-class state can be written as
$$|x\rangle = \sqrt{x_0}\,|000\rangle + \sqrt{x_A}\,|100\rangle + \sqrt{x_B}\,|010\rangle + \sqrt{x_C}\,|001\rangle,$$
with the $x_i$ being non-negative. Normalization requires that $x_0 = 1 - (x_A + x_B + x_C)$, and therefore the state $|x\rangle$ only depends on the three non-negative parameters $x = (x_A, x_B, x_C)$. If all three of these numbers are strictly positive, then the state is genuinely three-way entangled (i.e. it is not a product state with respect to some bi-partition of the three parties). Moreover, in this case it can be shown that the vector $x$ uniquely identifies the state [27]. More precisely, if we express $|x\rangle$ in some other basis as
$$|x\rangle = \sqrt{x'_0}\,|0'0'0'\rangle + \sqrt{x'_A}\,|1'0'0'\rangle + \sqrt{x'_B}\,|0'1'0'\rangle + \sqrt{x'_C}\,|0'0'1'\rangle,$$
then necessarily $x'_k = x_k$ for all components. Equivalently stated, the components $x_k$ are invariant under local unitaries (LU). This LU invariance, combined with the fact that the W-class is closed under LOCC, makes the W-class very attractive for studying properties of LOCC. Indeed, if Alice, Bob, and Charlie start out sharing a W-class state $|x\rangle$, then any multi-outcome LOCC protocol on $|x\rangle$ can be described compactly by a probabilistic transformation of vectors,
$$x \to \{p_i, y_i\},$$
in which the $|y_i\rangle$ are different W-class states obtained along different branches of the protocol.
Every bipartite entangled state belongs to the W-class, and in canonical form, such states have one and only one of the coordinates $\{x_A, x_B, x_C\}$ equal to zero. For example, $|\varphi\rangle = \sqrt{x_0}\,|000\rangle + \sqrt{x_A}\,|100\rangle + \sqrt{x_B}\,|010\rangle$ is a bipartite entangled state shared between Alice and Bob whenever $x_A, x_B > 0$. Its concurrence is readily computed to be (see Appendix A)
$$C(\varphi) = 2\sqrt{x_A x_B}.$$
From this we see that the concurrence is homogeneous under multiplication of the coordinates by a non-negative scalar. That is, $C(\alpha\varphi) = \alpha C(\varphi)$ for any $\alpha \ge 0$. We also observe that $|\varphi\rangle$ is maximally entangled iff $x_A = x_B = \frac{1}{2}$ and $x_0 = 0$.
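The concurrence formula for a bipartite W-class state can be checked numerically. The sketch below is illustrative only: it assumes the standard two-qubit identity that a pure state with amplitudes $(a_{00}, a_{01}, a_{10}, a_{11})$ has concurrence $2|a_{00}a_{11} - a_{01}a_{10}|$, and the function names are hypothetical helpers rather than anything defined in the paper.

```python
import math

def bipartite_amplitudes(x_a, x_b):
    """Alice-Bob amplitudes (|00>, |01>, |10>, |11>) of sqrt(x0)|00> + sqrt(xA)|10> + sqrt(xB)|01>."""
    x_0 = max(0.0, 1.0 - x_a - x_b)  # guard against tiny negative rounding errors
    return (math.sqrt(x_0), math.sqrt(x_b), math.sqrt(x_a), 0.0)

def concurrence_from_amplitudes(a00, a01, a10, a11):
    """Standard pure-state two-qubit concurrence, also valid for unnormalized states."""
    return 2.0 * abs(a00 * a11 - a01 * a10)

def concurrence_from_coordinates(x_a, x_b):
    """Closed form in W-class coordinates."""
    return 2.0 * math.sqrt(x_a * x_b)

# Compare the two expressions on a sample state (assumed example values).
direct = concurrence_from_amplitudes(*bipartite_amplitudes(0.3, 0.2))
closed_form = concurrence_from_coordinates(0.3, 0.2)
print(direct, closed_form)
```

Scaling both coordinates by $\alpha$ scales every amplitude by $\sqrt{\alpha}$, so the determinant-like expression scales by $\alpha$, which is the homogeneity property used throughout the paper.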
In an LOCC protocol, each party takes turns measuring their local system and announcing the result. Since each party holds a qubit system, every local measurement can be described by a set of 2 × 2 Kraus operators $\{M_i\}_i$. Without loss of generality, we can assume that each $M_i$ has been brought into upper triangular form by a local unitary. Indeed, if $U_i M_i$ is upper triangular, then we can always append $U_i^\dagger$ to the start of the next round of local measurement to recover the action of the original Kraus operator $M_i$. Hence we will write the Kraus operators of every local measurement as
$$M_i = \begin{pmatrix} \sqrt{a_i} & b_i \\ 0 & \sqrt{c_i} \end{pmatrix}.$$
When party k performs a local measurement $\{M_i\}_i$ on the W-state $|x\rangle$ and obtains outcome i, the post-measurement state has undergone the coordinate transformation
$$x_j \to y_{i,j} = \frac{a_i}{p_i}\, x_j \quad \text{for } j \neq k \text{ or } 0, \qquad (7)$$
$$x_k \to y_{i,k} = \frac{c_i}{p_i}\, x_k, \qquad x_0 \to y_{i,0} = \frac{1}{p_i}\left|\sqrt{a_i x_0} + b_i \sqrt{x_k}\right|^2,$$
where $p_i$ is the probability of outcome i. Observe the monotonicity conditions [27]
$$\sum_i p_i\, y_{i,k} \le x_k, \qquad \sum_i p_i\, y_{i,j} = x_j \quad \text{for } j \neq k \text{ or } 0.$$
In other words, the component of the measuring party is non-increasing on average, whereas the components of the non-measuring parties remain unchanged on average.
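The effect of a local measurement on the W-class coordinates can be sketched in a few lines. This is a minimal illustration, assuming Kraus operators of the upper-triangular form with diagonal entries $\sqrt{a_i}$, $\sqrt{c_i}$ and off-diagonal entry $b_i$; the function name `measure` and the sample numbers are hypothetical. Completeness of the measurement requires $\sum_i a_i = 1$, $\sum_i (|b_i|^2 + c_i) = 1$ and $\sum_i \sqrt{a_i}\, b_i = 0$, which the example values satisfy.

```python
import math

def measure(x, k, a, b, c):
    """Apply the Kraus operator [[sqrt(a), b], [0, sqrt(c)]] of party k to coordinates x.

    x maps '0', 'A', 'B', 'C' to non-negative numbers summing to one.
    Returns (p, y): the outcome probability and the normalized new coordinates.
    """
    # Non-measuring parties are scaled by a, the measuring party by c.
    unnormalized = {j: (c if j == k else a) * x[j] for j in "ABC"}
    # The |000> amplitude gets sqrt(a) from itself and b from party k's |1> term.
    unnormalized["0"] = abs(math.sqrt(a * x["0"]) + b * math.sqrt(x[k])) ** 2
    p = sum(unnormalized.values())
    return p, {j: v / p for j, v in unnormalized.items()}

x = {"0": 0.1, "A": 0.4, "B": 0.3, "C": 0.2}
# Two outcomes (a_i, b_i, c_i) for a measurement by Charlie (assumed example values).
outcomes = [(0.5, 0.3, 0.5), (0.5, -0.3, 0.32)]
results = [measure(x, "C", a, b, c) for a, b, c in outcomes]

total_probability = sum(p for p, _ in results)
average = {j: sum(p * y[j] for p, y in results) for j in "ABC"}
print(total_probability, average)
```

In this example the averaged components of Alice and Bob are unchanged, while Charlie's averaged component drops from 0.2 to 0.164, in line with the monotonicity conditions.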
An important property of any local measurement is that it can be decomposed into a sequence of weak measurements [36]. Hence, we can envision a given LOCC W-state conversion as a tree of smooth trajectories in the positive octant of $\mathbb{R}^3$ taking the initial state $x$ to its possible target states $\{y_i\}_i$. Consequently, if $x_k \to y_{i,k}$ is the component transformation for party k along one branch of some protocol, then there exists an equivalent sequence of protocols that passes through any desired coordinates between $x_k$ and $y_{i,k}$.
Finally, we introduce three functions of three-qubit W-class states that play an important role in this work. For coordinates $(x_A, x_B, x_C)$ of state $|x\rangle$, let $n_1, n_2, n_3 \in \{A, B, C\}$ be distinct party labels such that $x_{n_1} \ge x_{n_2} \ge x_{n_3}$, and let $n_i(x) := x_{n_i}$. Suppose also that $x_{n_2} > 0$ so that $|x\rangle$ is not a product state. Then define
With a slight abuse of notation, we introduce one additional function. Let $\star \in \{A, B, C\}$ be any fixed party, and take $n_1, n_2 \in \{A, B, C\} \setminus \{\star\}$ as distinct party labels such that $x_{n_1} \ge x_{n_2} > 0$. Then define
It has been shown that $\eta(x)$, $\kappa(x)$, and $\zeta_\star(x)$ are entanglement monotones when manipulating W-class states under LOCC [9,8,13]; i.e.
$$\eta(x) \ge \sum_i p_i\, \eta(y_i)$$
for any LOCC transformation $|x\rangle \to \{p_i, |y_i\rangle\}$, and likewise for $\kappa(x)$ and $\zeta_\star(x)$. Notice that since the W-class is closed under LOCC [21], $|x\rangle \to \{p_i, |y_i\rangle\}$ describes the most general type of pure-state transformation possible by LOCC, and hence we have monotonicity under arbitrary pure-state transformations. Below we show that the same is true for $\zeta(x)$, and we also reveal its operational meaning in terms of random-party concurrence distillation.

Total Random-Party Distillation
For the task of total random-party EPR distillation, the original Fortescue-Lo (F-L) protocol gives $E_2^{(\mathrm{rnd})}(W) = 1$ for the W state $|W\rangle$ [22]. The F-L protocol consists of each party performing the same binary-outcome measurement with Kraus operators
$$M_0 = \sqrt{1-\epsilon}\,|0\rangle\langle 0| + |1\rangle\langle 1|, \qquad M_1 = \sqrt{\epsilon}\,|0\rangle\langle 0|.$$
Consider one iteration of the protocol in which each party performs this measurement. If all three parties obtain outcome $M_0$, then the post-measurement state is $M_0 \otimes M_0 \otimes M_0 |W\rangle \propto |W\rangle$; the W-state is left unchanged, and they all repeat the same measurement in the next iteration of the protocol. On the other hand, if one party obtains outcome $M_1$ while the other two obtain $M_0$, then the latter are left in an EPR state. The final possibility is that at least two parties obtain outcome $M_1$, which breaks all the entanglement in the state. For any $\delta > 0$, the value $\epsilon$ can be chosen sufficiently small and the number of iterations can be chosen sufficiently large such that an EPR state is obtained with probability $> 1 - \delta$.
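The behavior of one F-L iteration can be reproduced by direct enumeration. The following is a sketch under stated assumptions: it takes the measurement to be $M_0 = \sqrt{1-\epsilon}\,|0\rangle\langle 0| + |1\rangle\langle 1|$ and $M_1 = \sqrt{\epsilon}\,|0\rangle\langle 0|$, counts only EPR pairs produced within the first n iterations (any W state left over after the last iteration is ignored), and uses hypothetical function names.

```python
from itertools import product

def fl_iteration(eps):
    """Outcome probabilities (stay in |W>, EPR pair, product state) of one F-L iteration on |W>."""
    amp = 3 ** -0.5
    w_state = {(1, 0, 0): amp, (0, 1, 0): amp, (0, 0, 1): amp}
    # Diagonal entries (on |0>, on |1>) of the assumed Kraus operators M0 and M1.
    kraus_diag = {0: ((1 - eps) ** 0.5, 1.0), 1: (eps ** 0.5, 0.0)}
    p_w = p_epr = p_fail = 0.0
    for outcome in product((0, 1), repeat=3):
        prob = 0.0
        for bits, amplitude in w_state.items():
            for party in range(3):
                amplitude *= kraus_diag[outcome[party]][bits[party]]
            prob += amplitude ** 2
        clicks = sum(outcome)  # number of parties that obtained M1
        if clicks == 0:
            p_w += prob        # state is proportional to |W> again
        elif clicks == 1:
            p_epr += prob      # the two parties with outcome M0 share an EPR pair
        else:
            p_fail += prob     # all entanglement is destroyed
    return p_w, p_epr, p_fail

def fl_success_probability(eps, n):
    """Probability of obtaining an EPR pair within n iterations."""
    p_w, p_epr, _ = fl_iteration(eps)
    return sum(p_w ** j * p_epr for j in range(n))

print(fl_iteration(0.1))
print(fl_success_probability(0.01, 2000))
```

Under these assumptions a single iteration returns $|W\rangle$ with probability $(1-\epsilon)^2$, yields an EPR pair with probability $2\epsilon(1-\epsilon)$, and fails with probability $\epsilon^2$, so the success probability approaches $2(1-\epsilon)/(2-\epsilon)$ as the number of iterations grows.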
In Section 5.1 we identify a precise trade-off between the value $\delta$ and the number of rounds needed to obtain a success probability $> 1 - \delta$. The F-L protocol was later generalized to other W-class states. It was shown that for any W-state $|x\rangle$ such that $x_0 = 0$ [9].
Our goal now is to establish a similar result for concurrence distillation.
Lemma 1. Let $\zeta(x)$ be the function introduced above. Then $C^{(\mathrm{rnd})}(x) = \zeta(x)$ for any W-class state $|x\rangle$. Furthermore, $\zeta(x)$ is an entanglement monotone that strictly decreases whenever party $n_2$ or $n_3$ performs a non-trivial measurement.
Remark. The equality in Eq. (15) holds for all W-class states, even those for which $x_0 \neq 0$. This is in contrast to Eq. (14), which holds only under the assumption $x_0 = 0$. The value E we are trying to optimize the total average bipartite concurrence for any two parties, without loss of generality we can relabel the parties and assume that As depicted in Fig. 1, the following three-step protocol achieves a total average bipartite concurrence of $C = \zeta(x) - \delta$ for arbitrary $\delta > 0$.
Step 1. Let Charlie perform a measurement described by the following Kraus operators, There will be two measurement outcomes. Using the transformation rule of Eq. (9), the post-measurement state of outcome 1 is the bipartite pure state ). Hence from Eq. (4), the concurrence of this state weighted by its probability is The post-measurement state of outcome 0 is still a tripartite W-class state, which, when weighted by the probability of outcome 0, has components
Step 2. Continuing with the post-measurement state of outcome 0, Bob and Charlie then perform a restricted Fortescue-Lo protocol, which involves just Bob and Charlie performing n iterations of the weak measurement specified by the Kraus operators in Eq. (13). Here, n is a parameter that we are free to choose, and for a given choice, we take $\epsilon > 0$ such that (1 For each iteration of Bob and Charlie measuring $\{M_0, M_1\}$, they halt if at least one of them obtains outcome 1. If they both obtain outcome 0, then they proceed with another iteration of measurement. This is done for n total iterations. One can verify that if the protocol has not halted by the jth iteration, then outcome 00 in the jth iteration yields the unnormalized state with coordinates ((1 On the other hand, if either outcome 01 or 10 is obtained, then the concurrence of the post-measurement state $|\varphi_{2,j}\rangle$, weighted by its probability, is Should the protocol not halt for all n iterations, we find that all three coefficients end up being equal, (1
Step 3. The original F-L protocol is performed for $n'$ iterations on $|W\rangle$ with all three parties using measurement $\{M_0, M_1\}$ and a freely chosen parameter $\epsilon'$. If a bipartite entangled state is obtained in the jth iteration of the F-L protocol, then the probability-weighted concurrence is As we let $n, n' \to \infty$ and $\epsilon' \to 0$, the expected concurrence is given by Notice that this protocol uses two unbounded-round subroutines: the restricted F-L protocol (step 2) and the total F-L protocol (step 3). In Section 5.2.2 we will explore finite-round approximations and how the expected concurrence changes as we trade round numbers between these two subroutines.

$\star$-Random-Party Distillation
The problem of $\star$-random-party EPR distillation in W-class states has been partially solved in Ref. [13].
Namely, it was shown that for any W-state $|x\rangle$ such that $x_0 = 0$. Similar to the scenario of total random-party distillation, the value of $E_2^{(\star\text{-rnd})}(x)$ is unknown when $x_0 \neq 0$. However, the RHS of Eq. (20) still serves as an upper bound due to the monotonicity of $x_\star$ and $\eta(x)$. Furthermore, the protocol achieving these distillation probabilities requires no more than three rounds of LOCC.
Our contribution here is solving the $\star$-random-party distillation problem for concurrence.

Lemma 2. For an arbitrary W-class state $|x\rangle$, $C^{(\star\text{-rnd})}(x) = \zeta_\star(x)$.
Proof. In Ref. [13] it was shown that $\zeta_\star(x)$ is an entanglement monotone that strictly decreases on average whenever party $\star$ or $n_1$ performs a nontrivial measurement. What remains to be demonstrated is that $\zeta_\star(x)$ is indeed an achievable concurrence distillation average for the pairs $(\star, n_1)$ and $(\star, n_2)$.
Without loss of generality we assume $x_A > 0$ since otherwise the lemma is trivially true. The distillation protocol is very similar to the one given in Lemma 1. Formally, we replace $x_\star \leftrightarrow x_A$, $x_{n_1} \leftrightarrow x_B$, and $x_{n_2} \leftrightarrow x_C$. However, while it was assumed that Step 1 of this protocol is then the same Step 1 as Lemma 1, with Charlie performing the measurement given in Eq. (16). The weighted concurrence for outcome 1 is In step 2, Bob and Charlie again perform the restricted F-L protocol; however, now there is no halting round. The weighted concurrence of a bipartite state $|\varphi_{2,j}\rangle$ obtained in round j is $C_2^j$, as given in Eq. (17). Note that Alice then never performs a measurement in this protocol. After replacing our system labels by $\{\star, n_1, n_2\}$, the total expected concurrence as the restricted F-L protocol in step 2 continues indefinitely is

Round Complexity in Random-Party Distillation
We now turn to the question of round complexity, which is the motivating topic of this work.

Lower Bounds in EPR Distillation
Suppose the parties are only afforded a finite number of LOCC rounds. What is the largest achievable probability of obtaining an EPR state starting from $|W\rangle$? Here we prove an upper bound on this probability as a function of round number, or equivalently a lower bound on the number of rounds needed to reach a given success probability.
Consider any finite-round random EPR distillation protocol P. We can assume without loss of generality that every branch in the protocol ends with either an EPR state $|\Psi^+\rangle$ or a product state; this is because every weakly entangled state $|\varphi\rangle$ can be transformed into $|\Psi^+\rangle$ with some probability without extending the number of rounds. Any branch that terminates with an EPR state will be called a success branch (many different success branches will overlap).
Our first step will be to argue that we can always transform the protocol P into one in which Alice and Bob just perform diagonal measurements. As described in the introduction, we can characterize each local measurement by upper-triangular Kraus operators, $M_i = \begin{pmatrix} \sqrt{a_i} & b_i \\ 0 & \sqrt{c_i} \end{pmatrix}$. Notice that the composition of two such matrices has the form
$$\begin{pmatrix} \sqrt{a} & b \\ 0 & \sqrt{c} \end{pmatrix} \begin{pmatrix} \sqrt{a'} & b' \\ 0 & \sqrt{c'} \end{pmatrix} = \begin{pmatrix} \sqrt{a a'} & \sqrt{a}\, b' + b \sqrt{c'} \\ 0 & \sqrt{c c'} \end{pmatrix}.$$
If we then consider the full set of Kraus operators constituting protocol P, each success branch $\lambda$ will be characterized by a product Kraus operator of the form where each matrix in the tensor product is obtained by concatenating all the local operators along branch $\lambda$.
With this proposition, let us consider an arbitrary distillation protocol P with all measurements in diagonal form. As described in Section 2, using weak measurement theory, we can depict the protocol as a tree that starts with the state $|W\rangle$ and evolves continuously along different branches. Crucially, the conversion of a general protocol to one in which the coordinates $(x_A, x_B, x_C)$ transform smoothly can be done without changing the number of LOCC rounds. This is because each local measurement is partitioned into a sequence of weak measurements with no communication needed between each measurement in the sequence.
To proceed with the analysis, we will need to introduce some concepts that characterize a given protocol P. We begin by defining an important class of states that appear in P.
Definition 1. A state $|x\rangle$ with ordered components $x_{n_1} \ge x_{n_2} \ge x_{n_3}$ obtained in P is called a block state if (i) $x_{n_1} = x_{n_2}$ and $x_{n_3} \neq 0$, (ii) a party whose component value is maximal is the next to perform a local measurement on $|x\rangle$, and (iii) an EPR state is obtained with some nonzero probability from $|x\rangle$.
Using the concept of a block state, we can analyze an arbitrary random-party distillation protocol in a systematic way. First, with continuous evolution of the states in the protocol, it follows that the party having the largest component value cannot change along a success branch without first passing through a block state. We can then partition the protocol into blocks determined by when there is a change in the party having the largest component value. More precisely, define a sub-block of P as any sub-tree in the overall protocol tree that begins with a block state $|x\rangle$ and terminates as soon as it reaches some later block state $|y\rangle$ (which may be equivalent to $|x\rangle$), an EPR state, or a product state (see Fig. 2). Each terminal block state in one sub-block is then the initial block state of a subsequent sub-block (unless the protocol halts), and the entire LOCC protocol can be divided into a disjoint union of sub-blocks. We say that an N-block protocol is one that traverses N different sub-blocks along the longest branch of the protocol. We further organize the protocol into N layers such that all sub-blocks belonging to layer k have an initial block state lying on a branch that has previously traveled through k−1 sub-blocks. Finally, we introduce the notion of a canonical sub-block.
Definition 2. A sub-block is called canonical if all its terminal block states are $|W\rangle$ (see Fig. 3 (a)). Hence a canonical sub-block always ends with $|W\rangle$, an EPR state, or a product state. A canonical sub-block is called W-canonical if it begins with $|W\rangle$ and returns to $|W\rangle$ along only one branch within the sub-block (see Fig. 3 (b)).
Note that all sub-blocks in the Nth layer are canonical since they obtain only EPR and product states.
Our first goal will be to reduce any N-block protocol P into another protocol P' whose sub-blocks are W-canonical. The following technical lemma and its corollary provide the main ingredients for this procedure.
Lemma 3. Suppose that an arbitrary state $|x\rangle = (x_{n_1}, x_{n_2}, x_{n_3})$ is transformed into either $|W\rangle$, an EPR pair, or a product state with respective probabilities $P_W$, $P_{EPR}$, and $P_F$. Further suppose that party $n_1(x)$ maintains the largest component value in each of the final states (if more than one party has maximal value, fix $n_1(x)$ to be one of them). Then Moreover, the upper bound on $P_{EPR}$ is achievable.
Proof. Since we are considering only protocols with diagonal local Kraus operators, the average component value for each party remains unchanged. Hence, $x_{n_2} + x_{n_3} = \frac{1}{2} P_{EPR} + \frac{2}{3} P_W$, which is Eq. (25). On the other hand, the $\eta$ monotone says that Combining this with Eq. (25) yields Eq. (24).
The upper bound on $P_{EPR}$ is achievable using an "equal or vanish" protocol [8]. This involves party $n_k(x)$, for k = 2, 3, performing a binary measurement with Kraus operators . On the other hand, if one and only one party obtains outcome 1, then an EPR pair is obtained. This occurs with total probability
Lemma 3 requires that party $n_1(x)$ have the largest component value in all terminal states of the sub-block. We next relax this condition and derive similar bounds on $P_{EPR}$ and $P_W$ when there is no party with maximal value in all the outcomes. To conduct this analysis, let $a_i$ and $c_i$ be the numbers characterizing the measurement on an initial block state $|x\rangle$. Let A (resp. C) be the set of indices such that $a_i > c_i$ (resp. $c_i > a_i$). Note that we can always assume that $a_i \neq c_i$ for every branch i. Indeed, suppose that $a_i = c_i$, which means the state is left unchanged along branch i. If all other branches lead to a higher expected EPR probability than the EPR probability along branch i, then one can just remove branch i and scale the probability of the other branches up to unity. On the other hand, if the subsequent sub-protocol along branch i has a higher expected EPR probability than that along the other branches, one can just perform this sub-protocol on the original state $|x\rangle$. Therefore, all branches with $a_i = c_i$ can be eliminated without decreasing the overall EPR probability or increasing the number of rounds. Since $|x\rangle$ is a block state, we have $x_{n_1} = x_{n_2} \ge x_{n_3}$. The post-measurement states of $|x\rangle$ will thus split into two categories: for outcomes $i \in A$ we will have $a_i x_{n_1} \ge a_i x_{n_3} > c_i x_{n_1}$, and for outcomes $i \in C$ we will have $c_i x_{n_1} > a_i x_{n_1} \ge a_i x_{n_3}$. In both these cases, the conditions of Lemma 3 apply since the party having maximal component cannot switch without passing through another block state, and by assumption, the only block state obtained within the given sub-block is $|W\rangle$ (for which all parties have maximal component value). We can then prove the following corollary.
Corollary 1. Suppose that $|x\rangle$ is the initial block state of a canonical sub-block and a local measurement is performed with Kraus operator parameters $a_i$ and $c_i$. Let $a = \sum_{i \in A} a_i$, $c = \sum_{i \in A} c_i$, and $\tilde{a} = a - c$. Then (26) Furthermore, the upper bound of $P_{EPR}$ can be saturated using binary-outcome measurements that obtain $|W\rangle$ along just one branch.
Proof. Group outcomes i into sets A and C as described above. Applying Lemma 3 to each of the post-measurement states yields where the second inequality follows from the Cauchy-Schwarz inequality: and the third comes from the fact that c 2 . The equality of $P_W$ again follows from Lemma 3 as The upper bound on $P_{EPR}$ can be achieved by having party $n_1(x)$ perform a binary-outcome measurement with Kraus operators For outcome 2, an EPR pair can be subsequently obtained between parties $n_2(x)$ and $n_3(x)$ with probability $2\tilde{a} x_{n_3}$. For outcome 1, the "equal or vanish" protocol is subsequently performed, as described in Lemma 3. Since $x_{n_1} = x_{n_2}$, the total EPR probability is then From the "equal or vanish" protocol we also obtain a W-state with probability
We next convert an arbitrary canonical sub-block into a W-canonical sub-block.
Lemma 4. Any canonical sub-block of P can be replaced by a W-canonical sub-block without decreasing the total EPR probability or increasing the total number of sub-blocks in the protocol.
Proof. Let S be any canonical sub-block starting with block state $|x\rangle$. Let $a_i$ and $c_i$ be the Kraus operator parameters of the initial measurement as before. We transition S into a W-canonical sub-block in two steps. First, we limit the total number of branches within the sub-block that obtain $|W\rangle$ to at most one. In general, there may be multiple branches within the sub-block that obtain state $|W\rangle$. For each of these, the protocol P specifies a subsequent distillation sub-protocol to be performed on $|W\rangle$. Let $Q_{\max}$ denote the maximum probability of obtaining EPR pairs from $|W\rangle$ among all these sub-protocols. Therefore, if $P_S$ is the total probability of obtaining EPR pairs when starting from the block state $|x\rangle$, we can use Corollary 1 to obtain the bound Since the coefficient of $P_{EPR}$ is non-negative on the RHS, we can further bound $P_S$ using Eq. (26). This bound can be saturated by performing the optimal binary-outcome measurement scheme of Corollary 1 on the initial block state $|x\rangle$. For the single $|W\rangle$ outcome, the original sub-protocol that attains EPR probability $Q_{\max}$ is then performed. Hence, if $P'_S$ denotes the total EPR probability starting from $|x\rangle$ in our modified protocol P', then One can verify that Hence, we have transformed an arbitrary canonical sub-block S into a W-canonical sub-block without decreasing the total EPR probability or increasing the number of sub-blocks in the protocol.
Corollary 2. Any N-block LOCC protocol P performing random-party EPR distillation can be converted into an N-block protocol consisting only of W-canonical sub-blocks and obtaining EPR pairs with a probability at least as large as that of P.
Proof. All sub-blocks at layer N are canonical, and we apply Lemma 4 to convert them to W-canonical sub-blocks. We then move backward to layer N−1. All of these sub-blocks must be canonical since their terminal block states have been converted to $|W\rangle$ in the previous step. We can thus transform all these canonical sub-blocks to W-canonical sub-blocks. Proceeding iteratively layer-by-layer back to the original state $|W\rangle$, we obtain the stated structure of Corollary 2.
We now possess the necessary components to prove a general lower bound.
Theorem 1. Let $P_{tot}$ be the total probability of obtaining EPR pairs from $|W\rangle$ in some LOCC random-party distillation protocol P. Then Moreover, this bound is tight.
Proof. Suppose that P is a finite-round protocol; otherwise Eq. (34) holds trivially. In light of Corollary 2, we can assume without loss of generality that P consists only of W-canonical sub-blocks. For a non-negative number $a_k$, the probability of distilling EPR pairs in the 2 is the probability of obtaining $|W\rangle$ and therefore carrying the protocol on to sub-block k+1. Consequently, the total EPR probability for an N-block protocol is where $a_N = 1$. To optimize, we compute partial derivatives, which ultimately leads to a system of equations: Substituting back into $P_{tot}$, we find that for any N-block LOCC distillation of $|W\rangle$, This upper bound can be achieved by using measurements with the optimal values of $a_k$. Finally, in terms of LOCC round number, we have that each sub-block contains at least three rounds unless it is the final sub-block, which only requires one round since it involves fixed-party distillation. Therefore, we obtain Eq. (34) with a corresponding plot in Fig. 5.
Remark. Throughout this section we have been modeling an LOCC protocol as a discrete sequence of time steps in which only a single party measures at each step. In this model, the round number of the protocol is the largest number of times the measuring party switches in some branch. By only allowing one party to measure at a time, the protocol is easier to analyze. However, a more general model allows all parties to measure at each time step. This has been referred to as broadcast LOCC (BLOCC) in Ref. [24], and this model has also been considered in Ref. [42]. In the BLOCC setting, a protocol's round number is then just the largest number of broadcasts made along some branch. In Section 5.2.2, we adopt the BLOCC model to count the number of iterations in the F-L protocol for convenience.
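The two counting conventions in this remark can be made concrete with a small sketch. It is only illustrative and rests on assumptions not in the text: a protocol is represented as a list of branches, each branch being the sequence of measuring parties (one-at-a-time model) or the sequence of broadcast steps (BLOCC model), and the function names are hypothetical.

```python
def one_at_a_time_rounds(branches):
    """Largest number of times the measuring party switches along any branch."""
    def switches(branch):
        return sum(1 for prev, cur in zip(branch, branch[1:]) if prev != cur)
    return max(switches(branch) for branch in branches)

def broadcast_rounds(branches):
    """Largest number of broadcast steps along any branch (one entry per time step)."""
    return max(len(branch) for branch in branches)

# One-at-a-time model: each entry names the single party measuring at that step.
sequential_protocol = [
    ["A", "A", "B", "C"],       # 2 switches
    ["A", "B"],                 # 1 switch
    ["A", "B", "A", "B", "C"],  # 4 switches
]
# BLOCC model: each entry is the set of parties measuring in that broadcast step.
broadcast_protocol = [
    [{"A", "B", "C"}, {"A", "B", "C"}, {"A", "B", "C"}],
    [{"A", "B", "C"}],
]
print(one_at_a_time_rounds(sequential_protocol), broadcast_rounds(broadcast_protocol))
```

With this representation, one F-L iteration in which all three parties measure simultaneously counts as a single broadcast step.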
Notice that the BLOCC round complexity of a task can be reduced by no more than a factor of $N$, compared to the "one at a time" round complexity. In terms of random-party EPR distillation of $|W\rangle$, the lower bound of Theorem 1 becomes

$$\text{Number of classical broadcasts in } \mathcal{P} \;\geq\; \frac{1}{3(1 - P_{\text{tot}})}.$$

This is also tight since each W-canonical sub-block requires only one round of broadcast communication to implement.
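As a hedged consistency check, the two round counts can be related by inverting the block bound. The sketch below assumes that an $N$-block protocol satisfies $P_{\text{tot}} \leq 1 - 1/(3N)$, which is our reading of the optimization in the proof of Theorem 1 rather than a formula quoted from it.

```latex
\begin{align*}
P_{\text{tot}} \le 1 - \frac{1}{3N}
  \;&\Longrightarrow\; N \ge \frac{1}{3(1 - P_{\text{tot}})}, \\
\text{one-party-at-a-time rounds}
  \;&\ge\; 3(N-1) + 1 = 3N - 2 \;\ge\; \frac{1}{1 - P_{\text{tot}}} - 2, \\
\text{broadcast rounds}
  \;&\ge\; N \;\ge\; \frac{1}{3(1 - P_{\text{tot}})}.
\end{align*}
```

For instance, under this assumption a target of $P_{\text{tot}} = 11/12$ needs at least $N = 4$ sub-blocks, hence at least 10 one-party-at-a-time rounds or 4 broadcasts.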
Remark. The lower bound of Theorem 1 matches in form the original F-L distillation probability [22] as well as the slightly optimized distillation probability by Li et al. [31].

Infinite Rounds is Optimal
We now turn to the task of random-party concurrence distillation. We first establish that every finite-round protocol is sub-optimal.
Theorem 2. Suppose $\mathcal{P}$ is an $N$-round total random-party concurrence distillation protocol on a tripartite W-state $|x\rangle$. Then there exists a protocol $\mathcal{P}'$ that has more than $N$ rounds and distills a larger average concurrence than $\mathcal{P}$. The same statement holds for -random-party concurrence distillation.
Proof. Consider the last instance in the protocol for which some local measurement is performed on a tripartite entangled W-state. Call this state $|x'\rangle$. Since this is the final time such a state appears in the protocol, the local measurement must be a hard measurement that completely decouples the measuring party from the other two. Moreover, the protocol halts after this measurement since any further bipartite processing can never increase the concurrence. The greatest average post-measurement concurrence obtained from a hard measurement on $|x'\rangle$ is called the concurrence of assistance (CoA) [29], and if $(x'_A, x'_B, x'_C)$ are the coordinates of state $|x'\rangle$, then

$$C_a(x') = 2\sqrt{x'_i\, x'_j},$$

where $i, j \in \{A, B, C\}$ are the two non-measuring parties [29]. It is obvious that both ζ(x′) and ζ (x′) are strictly larger than this value. Hence if, instead of performing the hard measurement and terminating the protocol, the parties were to continue with the protocols of Lemmas 1 and 2, respectively, they would obtain a greater average concurrence.
This result shows that infinite-round LOCC (i.e., protocols with unbounded round number) is needed to optimally distill concurrence from W-class states.
Remark. As noted above Lemma 2, for the task of -random-party EPR distillation, an optimal protocol can be found that consumes only three rounds of LOCC. Thus, we see that finite rounds of LOCC suffice to optimize the entanglement measure E , whereas infinite rounds of LOCC are needed to optimize the measure C ( −rnd) , even though the tasks share the same goal of breaking tripartite entanglement into bipartite form.
Remark. The proof of Theorem 2 uses the fact that both ζ and ζ are defined on the entire class of W-states, including those states $(x_A, x_B, x_C)$ for which $x_0 \neq 0$. In contrast, the proof will not directly work for $E_2^{(\text{rnd})}$ since the equality $E_2^{(\text{rnd})}(x) = \kappa(x)$ only holds for states with $x_0 = 0$, and a general LOCC protocol might attain states for which the latter condition does not hold. However, by using Proposition 1 and its generalization to arbitrary W-class states, one can assume without loss of generality that only diagonal Kraus operators are used in the protocol. Hence if the initial state $x$ has $x_0 = 0$, then $x_0$ will remain zero throughout. Consequently, we can repeat the same argument as in Theorem 2 and likewise conclude that $E_2^{(\text{rnd})}(x)$ is not achievable in finite-round LOCC whenever $x_0 = 0$.

Lower Bounds for F-L Protocols
We now consider the problem of finite-round concurrence distillation. Specifically, we restrict the protocols to those in Sections 3 and 4, but allow the parameters $\epsilon$ in the F-L measurements to vary depending on the round number. We find the optimal choices of these $\epsilon$'s given the total number of BLOCC rounds.
-Random-Party Concurrence Distillation. We first find the optimal protocol for the -random-party concurrence distillation task. We consider just the type of protocol specified in Section 4.
The first step is always a hard measurement on party $n_2$ if $x_{n_2} \neq x_{n_1}$, but can be omitted if they are equal. We thus exclude this step in our round counts for either case, assuming an additional round if $x_{n_2} \neq x_{n_1}$. The rest of the protocol consists of Fortescue-Lo measurements but with the $\epsilon_k$ now able to vary for each round $k = 1, \dots, N-1$. In the last round, we perform a hard measurement on party $n_2$ to retrieve the concurrence in the bipartite state between party and $n_1$. Then the concurrence distilled for round $k$ is Our goal is to maximize the total concurrence $\sum_{k=1}^{N} C_k$ by varying $\epsilon_k$, $k = 1, \dots, N$. By requiring the partial derivatives to be zero, the $\epsilon_k$'s are calculated to be where Note that each $\epsilon_k$ only depends on the $\epsilon_j$'s with $j > k$. So these parameters can be calculated one by one in descending order, starting from $\epsilon_{N-1} = 1/3$.
For a $|W\rangle$ state, the relation between the number of rounds and the optimal average concurrence is plotted in Fig. 6. Note that as the number of rounds increases, the concurrence distilled approaches the asymptotic bound ζ .
Total Random-Party Concurrence Distillation. We next find the optimal finite-round parameters for the total random-party concurrence distillation protocol described in Section 3. Notice that in this case there are two separate steps (Steps 2 and 3) that require infinitesimal measurements to achieve the asymptotic bound. Therefore, for a fixed finite number of rounds $N$, we distribute it so that we spend $N_2$ rounds in Step 2 and $N_3$ rounds in Step 3, with $N = N_2 + N_3$.
In Step 2, the expected concurrence distilled for round $k$ is where $k = 1, \dots, N_2$. As in the asymptotic protocol, we require that Step 2,k over the $\epsilon_k$'s subject to the constraint above. Using the Lagrange multiplier method, the maximizing condition is Solving this we get Therefore, every $\epsilon_k$ can be expressed in terms of $\lambda$, and the exact number can be calculated by plugging into the constraint that
In Step 3, the expected concurrence distilled for round $k$ is with a hard measurement on party C in the end. (With an abuse of notation we use $\epsilon_k$'s for the parameters in both Step 2 and Step 3, but they are different variables.) With a fixed $N_3$, we wish to maximize $C_{\text{Step 3},N_3} := \sum_{k=1}^{N_3} C_{\text{Step 3},k}$ over the $\epsilon_k$'s. By requiring the partial derivatives with respect to the $\epsilon_k$'s to be zero, we get where Again, the $\epsilon_k$'s can be calculated from the last one to the first, and the total concurrence distilled can be calculated by plugging the values into (45).
The final goal is to determine the best way to distribute the total number of rounds N into N 2 and N 3 , i.e.

A Family of Maps on LOCC
The restricted Fortescue-Lo protocol described in Section 3 served as an example of a map on the boundary of LOCC but not in LOCC [14], thus proving that LOCC is not closed. Now we show that the operations in Step 2 of the above protocol indicate that there exists a family of maps $\{J_\gamma : 0 < \gamma < 1\}$ on the LOCC boundary but not in LOCC, i.e., $\overline{\text{LOCC}} \ni J_\gamma \notin \text{LOCC}$ for $0 < \gamma < 1$.
We define $J_\gamma$ as follows. First fix a $0 < \gamma < 1$. Let Bob and Charlie both perform the measurement

$$M_0 = \sqrt{1-\epsilon}\,|0\rangle\langle 0| + |1\rangle\langle 1|, \qquad M_1 = \sqrt{\epsilon}\,|0\rangle\langle 0|.$$

If the joint outcome is one of 01, 10, or 11, they stop. If the joint outcome is 00, then they repeat the same measurement. After a maximum number of $n$ iterations they stop. The $\epsilon$ and $n$ are deliberately chosen so that $(1-\epsilon)^n = \gamma$. We coarse-grain the instrument to obtain $J_{\gamma,n} = (E^{\gamma}_{n,00}, E^{\gamma}_{n,01}, E^{\gamma}_{n,10}, E^{\gamma}_{n,11})$, where $E^{\gamma}_{n,ij}$ includes all the cases in which Bob and Charlie stop upon obtaining the outcome $ij$. Thus the four maps are respectively generated by the following sets of Kraus operators: The map $J_\gamma$ is obtained by taking $\lim_{n\to\infty} J_{\gamma,n}$. The Choi matrices $\{\Omega^{\gamma}_{ij}\}$ of $J_\gamma$ can be obtained from a similar calculation as in [14]. We have Thus the corresponding instruments are It is evident that this map is on the boundary of LOCC. We argue that this map indeed does not belong to LOCC. For any $0 < \gamma < 1$, we consider the W-class state represented by $(x, \gamma x, \gamma x)$. The discussion in Step 2 of the previous section showed that performing the map $J_\gamma$ will optimally distill expected concurrence, thus not decreasing the value of $\zeta(x)$.
The same can never be achieved by any LOCC protocol, because we have shown that any nontrivial measurement performed by Alice or Bob will strictly decrease the expected distillable concurrence.
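The relation $(1-\epsilon)^n = \gamma$ can be illustrated numerically. The following is a minimal sketch, assuming the standard Fortescue-Lo weak measurement $M_0 = \sqrt{1-\epsilon}\,|0\rangle\langle 0| + |1\rangle\langle 1|$ for the "continue" outcome and a state with $x_0 = 0$; the helper names are hypothetical. It tracks the coordinates $(x_A, x_B, x_C)$ along the branch in which Bob and Charlie both obtain outcome 0 in every iteration.

```python
def epsilon_for(gamma, n):
    """Measurement strength eps such that (1 - eps)**n == gamma."""
    return 1.0 - gamma ** (1.0 / n)


def iterate_00_branch(x, eps, n):
    """Follow n consecutive joint outcomes 00 by Bob and Charlie.

    x is (x_A, x_B, x_C) with x_0 = 0 (an assumption of this sketch).
    Each M_0 damps the measuring party's |0> amplitude by sqrt(1 - eps),
    so x_A picks up (1 - eps)^2 and x_B, x_C pick up (1 - eps) per iteration.
    Returns the normalized coordinates and the probability of the branch.
    """
    xa, xb, xc = x
    branch_prob = 1.0
    for _ in range(n):
        ua, ub, uc = xa * (1 - eps) ** 2, xb * (1 - eps), xc * (1 - eps)
        p = ua + ub + uc
        branch_prob *= p
        xa, xb, xc = ua / p, ub / p, uc / p
    return (xa, xb, xc), branch_prob
```

Starting from coordinates proportional to $(x, \gamma x, \gamma x)$, the all-00 branch ends at the symmetric point $(1/3, 1/3, 1/3)$ for any choice of $n$, while `epsilon_for(gamma, n)` shrinks toward zero as $n$ grows, which is the infinitesimal-measurement limit taken in the definition of $J_\gamma$.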

EPR distillation with PPT and SEP Instruments
So far we have discussed converting $|x\rangle$ to an EPR state between two of the three parties using LOCC. A natural question is how this bound relates to the same probability when one only restricts the quantum instrument to being positive partial transpose (PPT) with respect to all parties. If we consider only the cases where two of the three parties end up with an EPR state or declare the outcome a failure (i.e., there is no outcome of distilling a W state), then we can coarse-grain the instrument to a single round with four branches: one for each pair of parties and one for failure. We can then express the failure probability of this instrument on an arbitrary tripartite state $\rho_{ABC}$ as a semidefinite program (SDP), Eq. (56). Recalling that for a linear map $\mathcal{N}_{A\to B}$ and linear operator $X_A$, $\mathcal{N}(X) := \mathrm{Tr}_A[J_{\mathcal{N}}(X^T \otimes I_B)]$, where $J_{\mathcal{N}}$ is the Choi operator, it is clear that $J_0$ represents the failure outcome, so the objective function minimizes the failure probability. The next constraint guarantees that coarse-graining the instrument results in a quantum channel, as required. The following three constraints guarantee that each success outcome is an instrument outcome that only maps the input state to an EPR state shared between two of the parties. The next line is the PPT constraint on the elements of the instrument, and the final line is simply the positivity constraint on the instrument elements. We have therefore constructed an SDP for the relevant scenario.
We consider the set of s-states, which are a specific case of $|x\rangle$ states. Implementing the SDP (56) on the s-states numerically using CVX [19] with the solver MOSEK [2] along with QETLAB [26] in MATLAB [33], we find a gap between the PPT and LOCC values of $P_{\text{EPR}}$ (see Fig. 9), where the LOCC curve is given by $2(1-s) - (1-s)^2/(4s)$ [9]. Fig. 9 shows a gap between the LOCC and PPT success probabilities, and it also shows that the optimal strategy for $s \in [1/2, 1]$ establishes EPR pairs only between Alice and the other two parties. A natural further question is whether a separation between LOCC and separable operations (SEP) also exists in this regime. In [9], it was shown that SEP operations can achieve a success probability of one for $s \in [1/3, 1/2]$, which agrees with the PPT curve in Fig. 9 in this regime. The remaining question is whether a separation between LOCC and SEP exists for $s \in (1/2, 1]$, as well as the possibility of a gap between SEP and PPT. We now show that the above PPT curve is the same as the SEP curve. Note that Fig. 9, up to numerical error, shows that $P_{\text{EPR}}(s) = 2(1-s)$ for PPT operations when $s \in [1/2, 1]$. This value is achievable with SEP: Alice can apply a local instrument that maps $|\psi_s\rangle$ to $|\psi_{1/2}\rangle$ with probability $2(1-s)$, after which $|\psi_{1/2}\rangle$ can be distilled to an EPR pair with success probability one by SEP operations [9]. Thus, the PPT curve in Fig. 9 is the same as the SEP curve, and there exists a gap between random-party EPR distillation from $|\psi_s\rangle$ using SEP and LOCC operations for all $s \in (1/3, 1)$.
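The size of the LOCC versus SEP/PPT gap can be tabulated directly from the two closed-form curves quoted above. The sketch below is illustrative only: the function names are hypothetical, the curves are evaluated on $s \in [1/3, 1]$, and the form $|\psi_s\rangle = \sqrt{s}\,|100\rangle + \sqrt{(1-s)/2}\,(|010\rangle + |001\rangle)$ is an assumption about the s-state parametrization (it is the form for which Alice's filter $M_0 = |0\rangle\langle 0| + \sqrt{(1-s)/s}\,|1\rangle\langle 1|$ succeeds with probability $2(1-s)$ and outputs $|\psi_{1/2}\rangle$).

```python
import math


def p_locc(s):
    """LOCC success probability quoted in the text for s-states."""
    return 2 * (1 - s) - (1 - s) ** 2 / (4 * s)


def p_sep(s):
    """SEP (and PPT) value: 1 on [1/3, 1/2] and 2(1 - s) on [1/2, 1]."""
    return min(1.0, 2 * (1 - s))


def alice_filter(s):
    """Apply M0 = |0><0| + sqrt((1-s)/s)|1><1| on Alice's qubit.

    Uses the ASSUMED amplitudes of |psi_s> on |100>, |010>, |001>.
    Returns the outcome probability and the normalized post-filter amplitudes.
    """
    amplitudes = {
        "100": math.sqrt(s),
        "010": math.sqrt((1 - s) / 2),
        "001": math.sqrt((1 - s) / 2),
    }
    k = math.sqrt((1 - s) / s)
    # Only basis states with Alice's qubit (first bit) equal to 1 are damped.
    filtered = {b: (k * v if b[0] == "1" else v) for b, v in amplitudes.items()}
    prob = sum(v * v for v in filtered.values())
    post = {b: v / math.sqrt(prob) for b, v in filtered.items()}
    return prob, post
```

Under these assumptions the gap `p_sep(s) - p_locc(s)` equals $(1-s)^2/(4s)$ on $[1/2, 1]$, vanishes at $s = 1/3$ and at $s = 1$, and is strictly positive in between.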

Conclusion
In this paper, we have studied how more rounds of classical communication can increase the expected bipartite entanglement yield in random-party distillation protocols. Interestingly, the optimal number of rounds needed is highly sensitive to the type of entanglement measure considered. For certain distillation tasks, an unbounded number of rounds is needed if the bipartite concurrence measure is optimized, whereas a finite number is needed if the $E_2$ measure is optimized. Tight lower bounds were derived on the number of LOCC rounds required to attain a certain entanglement value.
Our work focused on W-class states since they possess a form highly amenable to analysis in the LOCC setting. In the future, it would be interesting to continue the study of round complexity for other types of multipartite entangled states. We conjecture that high round complexity is a general feature of random-party entanglement distillation. Intuitively, sequences of alternating weak local measurements enable a more "gentle" and less lossy decoupling of certain parties than performing a hard decoupling measurement in just one iteration of local measurement. This leads to a greater entanglement yield when using more rounds of LOCC. We leave it as future work to further test and develop this intuition.

A Concurrence of Bipartite W-class States
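In all concurrence distillation protocols that we discuss, the bipartite outcome of any LOCC branch must be a bipartite W-class state of the form

$$\sqrt{x_0}\,|00\rangle + \sqrt{x_1}\,|10\rangle + \sqrt{x_2}\,|01\rangle.$$

This is due to the fact that the W-class is closed under LOCC. The concurrence of such a state is given by $2|\sigma_1 \sigma_2|$, where $\sigma_1$ and $\sigma_2$ are the Schmidt coefficients of this state. The Schmidt coefficients are the singular values of the matrix

$$M = \begin{pmatrix} \sqrt{x_0} & \sqrt{x_2} \\ \sqrt{x_1} & 0 \end{pmatrix},$$

which are the square roots of the eigenvalues of the matrix

$$M M^\dagger = \begin{pmatrix} x_0 + x_2 & \sqrt{x_0 x_1} \\ \sqrt{x_0 x_1} & x_1 \end{pmatrix}.$$

The eigenvalues are the roots of the polynomial equation $\lambda^2 - (x_0 + x_1 + x_2)\lambda + x_1 x_2 = 0$. From Vieta's formulas, we have

$$\sigma_1^2 \sigma_2^2 = x_1 x_2,$$

so the concurrence is $2\sqrt{x_1 x_2}$.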
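The Schmidt-coefficient computation in this appendix can be checked numerically. The following is a minimal sketch, assuming the bipartite W-class state has real non-negative amplitudes $\sqrt{x_0}, \sqrt{x_1}, \sqrt{x_2}$ on $|00\rangle, |10\rangle, |01\rangle$; the function name is illustrative. It builds the $2 \times 2$ Gram matrix of the coefficient matrix, solves its characteristic quadratic, and returns twice the product of the singular values.

```python
import math


def concurrence_bipartite_w(x0, x1, x2):
    """Concurrence 2*sigma_1*sigma_2 of sqrt(x0)|00> + sqrt(x1)|10> + sqrt(x2)|01>."""
    # Coefficient matrix M = [[sqrt(x0), sqrt(x2)], [sqrt(x1), 0]].
    # Entries of the symmetric matrix M M^T (real amplitudes assumed):
    g00 = x0 + x2
    g01 = math.sqrt(x0 * x1)
    g11 = x1
    trace = g00 + g11
    det = g00 * g11 - g01 * g01
    # Eigenvalues of M M^T from the characteristic quadratic.
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    lam_plus = (trace + disc) / 2
    lam_minus = (trace - disc) / 2
    # Singular values are the square roots of the eigenvalues.
    return 2 * math.sqrt(max(lam_plus * lam_minus, 0.0))
```

For normalized inputs the result agrees with $2\sqrt{x_1 x_2}$; for example, $x = (0, 1/2, 1/2)$ gives concurrence 1, as expected for an EPR state.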

B Monotonicity of ζ
Lemma 2 (Monotonicity part). The function ζ (x) is non-increasing on average under LOCC. Furthermore, it is strictly decreasing on average when either $n_1(x)$ or $n_2(x)$ performs a non-trivial measurement.
Proof. The proof is analogous to the one given in [14], which proved that ζ is an LOCC monotone. We can decompose any local measurement into a sequence of binary measurements [1, 36]. Thus each measurement can be specified by two Kraus operators $\{M_\lambda : \lambda = 1, 2\}$,

$$M_\lambda = U_\lambda \begin{pmatrix} \sqrt{a_\lambda} & b_\lambda \\ 0 & \sqrt{c_\lambda} \end{pmatrix},$$

where $a_\lambda, c_\lambda \geq 0$, $b_\lambda \in \mathbb{C}$ and $U_\lambda$ is unitary (which does not affect the representation $x$). By the completeness relation $\sum_\lambda M_\lambda^\dagger M_\lambda = I$, we have $a_1 + a_2 = 1$ and $c_1 + c_2 \leq 1$, where equality holds if and only if $b_1 = b_2 = 0$.
To see how the state is changed by such a measurement, consider the case in which the first party, Alice, measures, and let $p_\lambda$ be the normalization constant, which also indicates the probability of obtaining outcome $\lambda$.
The post-measurement states can be represented by $x_\lambda = \left(\frac{c_\lambda}{p_\lambda} x_A, \frac{a_\lambda}{p_\lambda} x_B, \frac{a_\lambda}{p_\lambda} x_C\right)$ if $c_\lambda x_A \geq a_\lambda x_B$; otherwise the ordering should be changed such that $x_\lambda = \left(\frac{a_\lambda}{p_\lambda} x_B, \frac{c_\lambda}{p_\lambda} x_A, \frac{a_\lambda}{p_\lambda} x_C\right)$. More generally, when party $K \in \{A, B, C\}$ measures and outcome $\lambda$ occurs, the components of the representing vector $x_\lambda$ are

$$\left\{ \frac{c_\lambda}{p_\lambda} x_K,\ \frac{a_\lambda}{p_\lambda} x_I,\ \frac{a_\lambda}{p_\lambda} x_J \right\} \qquad (61)$$

after sorting in decreasing order, where $I, J \in \{A, B, C\} \setminus \{K\}$. We will show that ∆ζ ≤ 0 for all six possible cases.
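The coordinate update rule above can be sketched in a few lines. This is a minimal illustration, assuming a diagonal Kraus operator ($b_\lambda = 0$) and a state with $x_0 = 0$, so that the outcome probability is simply the sum of the rescaled coordinates; the function names are hypothetical. Exact fractions are used so the example values can be compared exactly.

```python
from fractions import Fraction


def apply_diagonal_outcome(x, party, a, c):
    """Update W-class coordinates when `party` obtains a diagonal outcome.

    x maps party labels to coordinates (x_0 = 0 is assumed). The measuring
    party's coordinate is scaled by c, the other two by a. Returns the
    outcome probability p and the normalized post-measurement coordinates.
    """
    unnormalized = {k: (c if k == party else a) * v for k, v in x.items()}
    p = sum(unnormalized.values())
    if p == 0:
        return p, None
    post = {k: v / p for k, v in unnormalized.items()}
    return p, post


def sorted_coordinates(post):
    """Representing vector: the coordinates sorted in decreasing order."""
    return sorted(post.values(), reverse=True)
```

As a usage example, let Alice measure $|W\rangle$ with $(a_1, c_1) = (3/5, 2/5)$ and $(a_2, c_2) = (2/5, 3/5)$. Outcome 1 occurs with probability 8/15 and leaves the sorted coordinates $(3/8, 3/8, 1/4)$, and the two outcome probabilities sum to one because $a_1 + a_2 = c_1 + c_2 = 1$.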

Case (i) Alice measures when x
Since any measurement can be decomposed as a sequence of weak measurements (i.e., ones where $a_1 \approx c_1$ and $a_2 \approx c_2$) [36, 37], without loss of generality we can assume the measurement is sufficiently weak so that $c_\lambda x_A \geq a_\lambda x_B$ is guaranteed. Thus the post-measurement states can be represented by $p_\lambda x_\lambda = (c_\lambda x_A, a_\lambda x_B, a_\lambda x_C)$, and ∆ζ is calculated to be ∆ζ = 2 A necessary condition for ∆ζ to be maximized is that it is maximized over allowed values of $c_2$ when $a_1$, $a_2$ and $c_1$ are fixed. Taking the partial derivative, we have where the inequalities follow from the weakness constraint. Thus, for fixed $a_1$, $a_2$ and $c_1$, the value of ∆ζ is maximized when $c_2 = 1 - c_1$. Substituting this into (62) and evaluating around the point $(a_1, c_1) = (1/2, 1/2)$, we expand to the lowest order in $(a_1 - 1/2)$ and $(c_1 - 1/2)$ and obtain (64) Let $\gamma = x_B / x_A$, then $\gamma \leq 1$, and
WLOG, we assume that the measurement is weak enough so that $a_\lambda x_A \geq c_\lambda x_B \geq a_\lambda x_C$. Then the post-measurement states can be represented by $p_\lambda x_\lambda = (a_\lambda x_A, c_\lambda x_B, a_\lambda x_C)$, and ∆ζ is calculated to be ∆ζ = ∆ζ + 1 3
WLOG, we assume that the measurement is weak enough so that $a_\lambda x_B \geq c_\lambda x_C$. Then the post-measurement states can be represented by $p_\lambda x_\lambda = (a_\lambda x_A, a_\lambda x_B, c_\lambda x_C)$, and ∆ζ is calculated to be ∆ζ = ∆ζ + 1 3 Again, ∆ζ ≤ 0 holds in this case. But since Charlie corresponds to the party $n_2$ when calculating ∆ζ , a non-trivial measurement performed by Charlie does not necessarily decrease ζ [14], and equality holds whenever $c_1 + c_2 - 1 = 0$.
Case (iv) Bob measures when $x_A > x_B = x_C$.
WLOG, we assume that $a_1 \leq c_1$ (thus $a_2 \geq c_2$). We also assume that the measurement is weak enough so that $a_1 x_A \geq c_1 x_B$. Then the post-measurement states can be represented by and ∆ζ is calculated to be ∆ζ = ∆ζ + 1 3 Again, ∆ζ ≤ 0 holds in this case, with equality obtained only by the trivial measurement where $a_1 = c_1$ and $a_2 = c_2$.
Case (v) Alice measures when
WLOG, we assume that $a_1 \leq c_1$ (thus $a_2 \geq c_2$). We also assume that the measurement is weak enough so that $c_2 x_A \geq a_2 x_B$. Then the post-measurement states can be represented by and ∆ζ is calculated to be ∆ζ = ∆ζ + 1 3 Again, ∆ζ ≤ 0 holds in this case, with equality obtained only by the trivial measurement where $a_1 = c_1$ and $a_2 = c_2$.

Case (vi) Alice measures when x
WLOG, we assume that $a_1 \leq c_1$ (thus $a_2 \geq c_2$). Then the post-measurement states can be represented by $p_1 x_1 = (c_1 x, a_1 x, a_1 x)$, (75) and ∆ζ is calculated to be ∆ζ = ∆ζ + 1 3 Again, ∆ζ ≤ 0 holds in this case, with equality obtained only by the trivial measurement where $a_1 = c_1$ and $a_2 = c_2$.
for general W-class states remains an open problem.
Proof. The monotonicity of ζ(x), and the fact that it strictly decreases under a measurement by party $n_2$ or $n_3$, is proven in Appendix B. Here we prove the achievability. Consider an arbitrary W-class state |x =

Figure 1: A depiction of the protocol described in Lemma 1.

operators for some local measurement, then it is replaced by the Kraus operators M i = Hence we have established the following simplification.
Proposition 1. The optimal success probability for random-party EPR distillation of $|W\rangle$ in $N$ rounds can always be obtained by an $N$-round protocol in which all parties perform local measurements with diagonal Kraus operators.

Figure 2: Any random-distillation protocol $\mathcal{P}$ can be partitioned into disjoint sub-blocks (enclosed by dashed lines). Each sub-block starts with a block state and terminates with either another block state, an EPR state, or a product state. EPR and product states are indicated by diamond end points. Block states are obtained along any edge intersecting a dashed line. A terminal block state of one sub-block is the initial block state of a subsequent sub-block unless the protocol halts.

Figure 3: (a) A canonical sub-block is one in which every terminating block state is $|W\rangle$. (b) A W-canonical sub-block is a canonical sub-block whose initial block state is $|W\rangle$ and that has at most one terminating block state.

Figure 4: Lemma 4 describes the reduction of a canonical sub-block $S$ to a W-canonical sub-block in two steps. The first is to reduce the number of terminal $|W\rangle$ states to just one. The second replaces the initial block state $|x\rangle$ with $|W\rangle$ by performing a prior measurement on $|x\rangle$ in the preceding sub-block.

Figure 5: Minimum number of LOCC rounds to obtain EPR pairs from $|W\rangle$ with probability $P_{\text{tot}}$. This lower bound is tight.

Figure 6: Minimum number of BLOCC rounds to obtain average bipartite concurrence $C$ from $|W\rangle$. Comparison is made between choosing a uniform $\epsilon$ and optimizing $\epsilon_k$ for each round.

Figure 7: Maximum average bipartite concurrence that can be distilled using $N_2$ BLOCC rounds in Step 2 and $N_3$ in Step 3. Each red bar represents the best value along the diagonal line where $N_2 + N_3$ is constant. Notice that for different states there are different ways to optimally distribute the number of rounds between the two steps. For states with a large ratio between $x_A$ and $x_C$, more rounds are needed in Step 2 to gradually lower the ratio.

Figure 9: This depicts the gap between random-party EPR distillation using a PPT (and SEP) instrument and an LOCC instrument. It also depicts the asymmetry in PPT distillation of EPR states shared between Alice and Bob or Charlie versus Charlie (resp. Bob) with Bob (resp. Charlie) or Alice.
[32] Thus, $C^{(\text{rnd})}(\Psi_W)$ is the largest average pure-state concurrence obtainable from $|\Psi_W\rangle$. We will refer to the task of optimizing the expected bipartite concurrence as random-party concurrence distillation. Another entanglement measure is given by $E_2(\phi) = 2\lambda_{\min}[\mathrm{Tr}_X |\phi\rangle\langle\phi|]$, which is twice the smaller eigenvalue of the reduced density matrix $\mathrm{Tr}_X |\phi\rangle\langle\phi|$. Operationally, it is known that $E_2(\phi)$ corresponds to the supremum probability of transforming the state $|\phi\rangle$ to an EPR state by LOCC [32]. From this it immediately follows that E
performed by Alice, Bob, and Charlie, respectively. Since the only maximally entangled two-qubit state of the form $|x\rangle$ is $|\Psi^+\rangle$, we must have $T_\lambda |W\rangle \propto |\Psi^+\rangle|0\rangle$, with $|\Psi^+\rangle$ being held by some pair of parties. The branch probability is then given by $|\langle\Psi^+|\langle 0|T_\lambda|W\rangle|^2$, which is independent of the off-diagonal terms $b_{k,\lambda}$ in $T_\lambda$. Furthermore, by repeatedly applying the composition rule of Eq. (23), the diagonal terms in $T_\lambda$ only depend on the diagonal terms of the individual local measurements in each round. Hence, we would obtain the same success branch probabilities $|\langle\Psi^+|\langle 0|T_\lambda|W\rangle|^2$ if Alice and Bob only performed the diagonal parts of their local Kraus operators. In general, the diagonal parts of the Kraus operators will not form a complete measurement themselves, but they can easily be completed without affecting the overall success of the protocol. Specifically, if the M i =
they both obtain outcome 1 the resulting state is $|W\rangle$, and this occurs with probability P W = 3

Next, we modify the initial block state $|x\rangle$ to be $|W\rangle$ (if $|x\rangle \neq |W\rangle$). This is accomplished by stochastically transforming $|x\rangle$ into $|W\rangle$ or $|\Psi^+\rangle$ in the sub-block prior to $S$ by having party $n_3(x)$ measure with Kraus operators $M_1 = \mathrm{Diag}[\sqrt{x_{n_3}/x_{n_1}}, 1]$ and $M_2 = \mathrm{Diag}[\sqrt{1 - x_{n_3}/x_{n_1}}, 0]$. The probability of obtaining $|W\rangle$ (outcome 1) is $3x_{n_3}$, while the probability of $|\Psi^+\rangle$ (outcome 2) is $2(x_{n_1} - x_{n_3})$. Now $|W\rangle$ is the initial block state of $S$. The optimal transformation of Corollary 1 is performed on $|W\rangle$ with the measurement parameter $\tilde{a}$ being the same as in P . This yields EPR states within sub-block $S$ with probability $P_{\text{EPR}} = \frac{2}{3}(1 - \tilde{a}) + \frac{2}{3} - \frac{4}{3}(1 - \tilde{a})^2 = \frac{2}{3}(3 - 2\tilde{a})\tilde{a}$, while the W-state is obtained with probability $P_W = (1 - \tilde{a})^2$. For the $|W\rangle$ outcome, the $Q_{\max}$ sub-protocol is again performed in subsequent sub-blocks. Letting $P_S$ denote the total EPR probability starting from $|x\rangle$ in this new protocol P , we have
9, up to numerical error, shows that for PPT operations $P_{\text{EPR}}(s) = 2(1 - s)$ for $s \in [1/2, 1]$. Consider $|\psi_s\rangle$ where $s \in [1/2, 1]$. Consider Alice applying a quantum instrument to her local portion where the first outcome uses a Kraus operator $M_0 = |0\rangle\langle 0| + \sqrt{(1 - s)/s}\,|1\rangle\langle 1|$. One can verify that the state conditioned on that outcome is $|\psi_{1/2}\rangle$. Moreover, this calculation shows the probability of this outcome is $2(1 - s)$. By [9], we know $|\psi_{1/2}\rangle$ can be distilled to an EPR pair with success probability one with SEP operations. Therefore, for |ψ s