Better local hidden variable models for two-qubit Werner states and an upper bound on the Grothendieck constant $K_G(3)$

We consider the problem of reproducing the correlations obtained by arbitrary local projective measurements on the two-qubit Werner state $\rho = v\,|\psi_-\rangle\langle\psi_-| + (1-v)\,\frac{\mathbb{1}}{4}$ via a local hidden variable (LHV) model, where $|\psi_-\rangle$ denotes the singlet state. We show analytically that these correlations are local for $v = 999\times 689\times 10^{-6}\cos^4(\pi/50) \simeq 0.6829$. In turn, as this problem is closely related to a purely mathematical one formulated by Grothendieck, our result implies a new bound on the Grothendieck constant $K_G(3) \leq 1/v \simeq 1.4644$. We also present a LHV model for reproducing the statistics of arbitrary POVMs on the Werner state for $v \simeq 0.4553$. The techniques we develop can be adapted to construct LHV models for other entangled states, as well as to bound other Grothendieck constants.

A quantum Bell experiment consists of two (or more) distant observers performing local measurements on a shared entangled quantum state. Remarkably, the predictions of quantum theory are incompatible with a natural definition of locality formulated by Bell [1]. Specifically, the statistics of certain quantum Bell experiments are found to be nonlocal (in the sense of Bell), as witnessed by the violation of Bell inequalities. This phenomenon, referred to as quantum nonlocality, represents a fundamental aspect of quantum theory as well as a central resource for quantum information processing [2].
While the use of an entangled state is necessary for observing quantum nonlocal correlations, it is interesting to ask if the converse link also holds. That is, can any entangled state lead to a Bell inequality violation, when performing a set of (judiciously chosen, and possibly infinitely many) local measurements? For pure entangled states, the answer turns out to be positive [3]. For mixed entangled states, the situation is more complex, as first discovered by Werner [4], who presented a class of entangled quantum states (now referred to as Werner states) which admit a local hidden variable (LHV) model for any possible local projective measurements. Therefore such states, while being entangled (hence nonclassical at the level of the Hilbert space), can never lead to nonlocal correlations. Notably, while Werner's original model focused on projective measurements, Barrett [5] presented a LHV model considering the most general non-sequential measurements, i.e. POVMs. These early results triggered much interest, and subsequent works presented various classes of entangled states admitting LHV models [6][7][8][9][10][11][12][13], including results for the multipartite case [14,15]; see [16] for a recent review. More sophisticated Bell scenarios have also been explored [17], but will not be discussed here.
In parallel, Acín, Gisin and Toner [19], based on previous work by Tsirelson [18], established a direct connection between these questions and a purely mathematical problem discussed by Grothendieck. In particular, the problem of determining the range of visibilities v ≤ v_c (v_c denoting the critical visibility) for which the two-qubit Werner state admits a LHV model for arbitrary projective measurements is directly related to the Grothendieck constant of order 3, K_G(3). Specifically, one has that v_c = 1/K_G(3). While the exact values of the Grothendieck constants are generally unknown, existing (upper) bounds can be used to derive lower bounds on v_c. Notably, a result of Krivine [22] for bounding K_G(3) implies that v_c ≥ 0.6595. Note also that upper bounds on v_c can be obtained by demonstrating explicit Bell inequality violations. The best results so far are v_c ≤ 0.7054 [23,24] and, very recently, v_c ≤ 0.7012 [25].
In this work, we present better LHV models for two-qubit Werner states. We first consider the case of projective measurements and prove analytically the bounds

v_c ≥ 999 × 689 × 10^{-6} cos^4(π/50) ≃ 0.6829,   K_G(3) ≤ 1/0.6829 ≃ 1.4644.   (2)

This result is derived by combining two recently introduced numerical methods. The first is an algorithmic method for constructing LHV models [26,27], which is applicable to arbitrary entangled states. The second is a numerical optimization algorithm for estimating the distance between a point and a convex set in R^d [25]. From the output of these numerical methods, we construct the LHV model analytically.
Our second result is a better LHV model for the two-qubit Werner state considering arbitrary POVMs. The model works for v ≤ 341/750 ≃ 0.4547. The proof relies on the fact that the statistics of arbitrary POVMs featuring a certain level of noise can be reproduced exactly via projective measurements only. Applying this observation to the LHV model we construct for projective measurements, the result follows.
More generally, we believe that the methods presented here open promising new possibilities for the construction of LHV models for quantum states, as well as for other convex set membership problems, such as separability of quantum states. We conclude by discussing several possible directions for future research.

CONCEPTS AND NOTATIONS
The statistics of local measurements performed on a two-qubit Werner state (1),

ρ(v) = v |ψ_-⟩⟨ψ_-| + (1 − v) 𝟙/4,   (1)

are given by

p(ab|xy) = Tr[ρ(v) A_{a|x} ⊗ B_{b|y}].   (3)

Here A_{a|x} and B_{b|y} are the operators representing the local measurements of Alice and Bob. They satisfy positivity and normalization, i.e. A_{a|x} ≥ 0 and Σ_a A_{a|x} = 𝟙, and similarly for B_{b|y}. These represent general POVMs, which we will consider in the second part of the paper. In the first part, however, we focus on the case of local projective measurements, i.e. adding the constraints a, b ∈ {−1, +1} and A_{a|x} A_{a'|x} = δ_{aa'} A_{a|x} for all x, a, a', and similarly for B_{b|y}.

Our goal is to determine the range of visibilities v for which the Werner state (1) admits a LHV model. That is, the measurement statistics (3) can be decomposed as

p(ab|xy) = ∫ dλ q(λ) p_A(a|xλ) p_B(b|yλ),   (4)

where λ represents the local variable, distributed according to the density q(λ), and the distributions p_A(a|xλ) and p_B(b|yλ) are Alice and Bob's local response functions. If such a decomposition can be found for all local projective measurements, the state ρ(v) is said to be local for projective measurements. Moreover, if this decomposition can be extended to all local POVMs, ρ(v) is termed local for POVMs.

As mentioned above, the case of projective measurements has a strong connection to the Grothendieck constant, a mathematical constant arising in the context of Banach space theory. Local dichotomic projective qubit measurements are conveniently described via observables of the form

A_x = x̂·σ,   B_y = ŷ·σ,   (5)

where the measurement directions are given by unit vectors x̂ and ŷ on the Bloch sphere, i.e. x̂, ŷ ∈ R^3. The measurement statistics of the Werner state (1) are then simply characterized by the expectation values

⟨A_x B_y⟩ = Tr[ρ(v) A_x ⊗ B_y] = −v x̂·ŷ,   ⟨A_x⟩ = ⟨B_y⟩ = 0.   (6)

The problem is now to find the largest visibility, v_c, such that the above statistics admit a LHV model. For any visibility v > v_c, Bell inequality violation is then possible, even though the Bell test may require an infinite number of local measurements.
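As an illustrative aside (not part of the original argument), the basic relations above, namely the form of ρ(v) and the correlations ⟨A_x B_y⟩ = −v x̂·ŷ with vanishing marginals, can be verified numerically; a minimal numpy sketch:

```python
import numpy as np

# Pauli matrices and the singlet state |psi-> = (|01> - |10>)/sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = [X, Y, Z]
psi_m = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def werner(v):
    """Two-qubit Werner state rho(v) = v |psi-><psi-| + (1-v) I/4."""
    return v * np.outer(psi_m, psi_m.conj()) + (1 - v) * np.eye(4) / 4

def correlator(v, x_hat, y_hat):
    """E(x,y) = Tr[rho(v) (x.sigma) o (y.sigma)]; should equal -v x.y."""
    A = sum(c * P for c, P in zip(x_hat, PAULI))
    B = sum(c * P for c, P in zip(y_hat, PAULI))
    return np.trace(werner(v) @ np.kron(A, B)).real

rng = np.random.default_rng(0)
for _ in range(5):
    x_hat, y_hat = rng.normal(size=3), rng.normal(size=3)
    x_hat /= np.linalg.norm(x_hat)
    y_hat /= np.linalg.norm(y_hat)
    assert abs(correlator(0.7, x_hat, y_hat) + 0.7 * x_hat @ y_hat) < 1e-12
```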
Interestingly, this problem can be directly related to another purely abstract problem, discussed by Grothendieck. Consider any m × m matrix M such that

|Σ_{i,j} M_{ij} α_i β_j| ≤ 1   (7)

for any real numbers α_i, β_j ∈ [−1, +1]. Then, K_G(n) is the smallest number such that

|Σ_{i,j} M_{ij} α̂_i·β̂_j| ≤ K_G(n)   (8)

for any unit vectors α̂_i, β̂_j ∈ R^n and for any such matrix M. This defines a set of numbers K_G(n), called the Grothendieck constants of order n. The Grothendieck constant is then defined as K_G = lim_{n→∞} K_G(n). While the exact values of these constants are not known in general (except for n = 2, where K_G(2) = √2), lower and upper bounds have been proven, see e.g. [20,21]. Of particular relevance to the present work is the constant of order 3, which relates to v_c. Indeed, one has that

v_c = 1/K_G(3),   (9)

as shown in Ref. [19], see Theorem 1 therein. This connection follows from early work by Tsirelson [18], who connected Grothendieck's problem to Bell inequalities. Basically, the matrix M is associated to a Bell inequality, for which the local bound is 1, see equation (7). The largest possible violation of this Bell inequality requires, in general, a maximally entangled state of dimension d × d, where d = 2^{⌊n/2⌋} [18]. It then follows that K_G(3) is the largest possible Bell violation for a maximally entangled two-qubit state, from which (9) follows. We refer the reader to Ref. [19] for more details.
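The two sides of Grothendieck's inequality can be made concrete on a small example (illustrative only, not from the paper). The sketch below uses the CHSH-type matrix M = ½[[1,1],[1,−1]]: the best ±1 assignment yields 1, while a hand-picked configuration of unit vectors in R² reaches √2, recovering the well-known lower bound K_G(2) ≥ √2:

```python
import itertools
import numpy as np

# CHSH-type matrix, normalized so that the best +-1 ("scalar") value is 1.
M = 0.5 * np.array([[1.0, 1.0], [1.0, -1.0]])

# Scalar side of (7): the maximum over alpha_i, beta_j in [-1,1] is attained
# at the extreme points +-1, so brute force over all sign assignments.
scalar_max = max(
    sum(M[i, j] * a[i] * b[j] for i in range(2) for j in range(2))
    for a in itertools.product([-1, 1], repeat=2)
    for b in itertools.product([-1, 1], repeat=2)
)

# Vector side of (8): unit vectors in R^2 (a known optimal configuration).
alphas = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
betas = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)]
vector_val = sum(M[i, j] * alphas[i] @ betas[j]
                 for i in range(2) for j in range(2))

assert abs(scalar_max - 1.0) < 1e-12
assert abs(vector_val - np.sqrt(2)) < 1e-12   # hence K_G(2) >= sqrt(2)
```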

LHV MODEL FOR PROJECTIVE MEASUREMENTS
Our main result is the construction of a LHV model for projective measurements on Werner states ρ(v) for a visibility v = 0.682. This implies the novel bounds on the critical visibility v c and hence also on the Grothendieck constant K G (3) stated in equation (2).
We make use of a recently developed method for constructing LHV models for entangled quantum states [26,27]. The method is algorithmic, in the sense that it is applicable to any entangled state in principle, and can be efficiently implemented on a standard computer (at least for low dimensions). For the case of interest to us, i.e. local projective qubit measurements on a Werner state, the method can be intuitively explained as follows.
Consider a finite set of qubit projective measurements, represented by a finite set of Bloch vectors û_i, with i = 1, ..., m. These vectors form a polyhedron P contained in the Bloch sphere. Typically, the vectors will be chosen rather uniformly over the sphere, such that the radius of the largest sphere inscribed in P (and centered at the origin) is close to 1. We refer to this radius as the 'shrinking factor' η of the polyhedron P.
Next we consider the measurement statistics obtained by performing the above set of local measurements (for both Alice and Bob) on the Werner state. Specifically, we get

p(ab|xy) = Tr[ρ(v) A_{a|x} ⊗ B_{b|y}],   with   A_{a|x} = (𝟙 + a û_x·σ)/2,   B_{b|y} = (𝟙 + b û_y·σ)/2.   (10)

Here σ denotes the vector of Pauli matrices. Since we consider a finite set of m measurement settings (for both Alice and Bob), one can find the maximal visibility v* for which the above measurement statistics admit a LHV model. In practice, this can be done efficiently (at least for m ≤ 10) using linear programming, see e.g. [2].
As the Werner state ρ(v*) is local for the set of measurements given by the Bloch vectors û_i, it follows that ρ(v*) is also local for any noisy measurement of the form

A^η_{a|x̂} = (𝟙 + a η x̂·σ)/2,   (11)

and similarly for Bob. Note that the above measurements form a continuous set, corresponding to a 'shrunk' Bloch sphere given by the vectors η x̂, where η is the shrinking factor of the polyhedron P.
As these shrunk vectors lie on the sphere inscribed in P, they can be expressed as convex combinations of the finite set of vectors û_i. One can then show that the distribution p(ab|x̂ŷ) = Tr[ρ(v*) A^η_{a|x̂} ⊗ B^η_{b|ŷ}] is local, for any possible measurement directions x̂ and ŷ. This follows from the linearity of the trace, and from the fact that the noisy measurement operators A^η_{a|x̂}, respectively B^η_{b|ŷ}, can be expressed as convex combinations of the noiseless operators A_{a|x}, respectively B_{b|y}; see [26,27] for more details.
Finally, note that the statistics of the noisy measurements (11) on ρ(v*) are in fact equivalent to the statistics of noiseless measurements on a slightly more noisy Werner state ρ(η² v*). Indeed, it is straightforward to verify that

Tr[ρ(v*) A^η_{a|x̂} ⊗ B^η_{b|ŷ}] = Tr[ρ(η² v*) A_{a|x̂} ⊗ B_{b|ŷ}],

which again holds for any possible measurement directions x̂ and ŷ. We thus conclude that the Werner state ρ(η² v*) admits a LHV model for all local projective measurements.
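This equivalence between shrunk measurements and a noisier state is elementary to check numerically. An illustrative numpy sketch (the values v* = 0.689 and η = cos²(π/50) mirror the construction below, but the identity holds for any v* and η):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
psi_m = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def werner(v):
    """rho(v) = v |psi-><psi-| + (1-v) I/4."""
    return v * np.outer(psi_m, psi_m.conj()) + (1 - v) * np.eye(4) / 4

def proj(a, n_hat, eta=1.0):
    """Qubit effect (I + a*eta*n.sigma)/2; eta < 1 models the shrunk Bloch vector."""
    n_sigma = n_hat[0] * X + n_hat[1] * Y + n_hat[2] * Z
    return (np.eye(2) + a * eta * n_sigma) / 2

v_star, eta = 0.689, np.cos(np.pi / 50) ** 2
rng = np.random.default_rng(1)
for _ in range(5):
    x_hat, y_hat = rng.normal(size=3), rng.normal(size=3)
    x_hat /= np.linalg.norm(x_hat)
    y_hat /= np.linalg.norm(y_hat)
    for a in (-1, 1):
        for b in (-1, 1):
            noisy = np.trace(werner(v_star) @ np.kron(
                proj(a, x_hat, eta), proj(b, y_hat, eta))).real
            clean = np.trace(werner(eta ** 2 * v_star) @ np.kron(
                proj(a, x_hat), proj(b, y_hat))).real
            assert abs(noisy - clean) < 1e-12   # noisy measurements = noisier state
```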
As mentioned above, this method can be implemented in practice using standard tools when considering sets of relatively few measurements (m ≤ 10). In Ref. [26], this was used to construct a LHV model for the Werner state for visibilities up to v = 0.54. While this construction improves on Werner's original model, which attained v = 1/2, it does not reach the best known value of v = 0.6595, obtained in Ref. [19] based on the connection to the Grothendieck constant. However, a remarkable feature of the above algorithm is that it converges to v_c as m → ∞ [26]. Therefore, by running the method for finite sets featuring a large (but nevertheless finite) number of vectors, one can expect to approach the optimal visibility v_c. In particular, one may expect to overcome the best known value of v = 0.6595 in case the latter is suboptimal. This is precisely what we implemented, using sets containing up to m = 625 vectors, which allows us to obtain the new bounds stated in equation (2).
It is, however, non-trivial to run the algorithm for sets containing so many vectors. Let us discuss why. Since the local marginals vanish (see Eq. (6)), we can restrict ourselves to the set of joint correlation terms {⟨a_x b_y⟩}_{x,y}. Therefore, in the case of m binary measurements per party, the local polytope with completely random marginals, which we call the correlation polytope and denote by L, lives in dimension m². Each vertex of L corresponds to a local deterministic strategy λ = (a_1, ..., a_m, b_1, ..., b_m), where each a_x and b_y may take the values ±1, which amounts to 2^{2m} distinct strategies. A given strategy λ translates to a vertex D_λ, which is an m × m matrix with entries D_λ(x, y) = a_x b_y. Hence, the polytope L features altogether 2^{2m} vertices.
Our goal is now to decompose a given quantum point q, whose entries are q(x, y) = ⟨a_x b_y⟩ = −v û_x·û_y, as a convex combination of deterministic vertices: q = Σ_λ w_λ D_λ, with weights w_λ ≥ 0 summing to one. This proves that q is local. In principle, this problem can be solved via linear programming. However, for m = 625 settings, even inputting all deterministic strategies into our linear programming solver is completely out of reach.
In order to circumvent this problem, we resort to a modified version of Gilbert's algorithm [28], a popular collision detection method used, for instance, in the video game industry. This algorithm can provide a point q_ε such that ‖q − q_ε‖ ≤ ε without a full vertex characterization of the local polytope. The reader is referred to Ref. [25] for more details about this algorithm and its extension, including convergence properties and further applications in quantum information. The algorithm is iterative and proceeds as follows:

1. Set i = 0 and pick an arbitrary point q_i inside the polytope L.
2. Given the current point q_i and the target point q, run an oracle which maximizes the overlap (q − q_i)·l over all l ∈ L. Denote the local point returned by the oracle by l_i.
3. Find the convex combination q_{i+1} of q_i and l_i that minimizes the distance ‖q − q_{i+1}‖.

4. Let i = i + 1 and go to Step 2, until the distance ‖q − q_i‖ ≤ ε is reached. Return q_ε ≡ q_i.
Note that the distance ‖q − q_i‖ is a decreasing function of i; moreover, when q happens to lie inside L, the algorithm is guaranteed to stop after O(1/ε²) steps [25]. Since maximizing the overlap (q − q_i)·l over all local points l is an NP-hard problem, in Step 2 we must make use of a heuristic method, described in Appendix A.
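For intuition, the full pipeline, target point q(x, y) = −v û_x·û_y, oracle, and Gilbert iteration, can be sketched at toy scale. The illustrative Python code below uses m = 4 tetrahedral settings and an exact oracle (brute force over the a_x, with the optimal b_y in closed form), rather than the heuristic of Appendix A; the visibility v = 1/2 is known to be local by Werner's model, so the distance to L should shrink:

```python
import itertools
import numpy as np

def gilbert_distance(q, oracle, n_iter=2000):
    """Gilbert's algorithm: iteratively approach the point of L closest to q.

    Returns the trajectory of distances ||q - q_i|| (Frobenius norm)."""
    q_i = np.zeros_like(q)                  # the all-zero point lies in L
    dists = []
    for _ in range(n_iter):
        l_i = oracle(q - q_i)               # vertex maximizing (q - q_i).l
        d = l_i - q_i
        t = np.clip(np.sum((q - q_i) * d) / np.sum(d * d), 0.0, 1.0)
        q_i = q_i + t * d                   # best convex mix of q_i and l_i
        dists.append(np.linalg.norm(q - q_i))
    return dists

m = 4
# Tetrahedral Bloch vectors as measurement settings for both parties.
u = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
v = 0.5                                     # local by Werner's original model
q = -v * (u @ u.T)                          # target point q(x,y) = -v u_x . u_y

def exact_oracle(g):
    """Exact oracle for small m: enumerate a in {-1,1}^m; optimal b is closed-form."""
    best, best_val = None, -np.inf
    for a in itertools.product([-1.0, 1.0], repeat=m):
        a = np.asarray(a)
        b = np.sign(g.T @ a)
        b[b == 0] = 1.0                     # b_y = sign(sum_x g(x,y) a_x)
        val = np.abs(g.T @ a).sum()
        if val > best_val:
            best, best_val = np.outer(a, b), val
    return best

dists = gilbert_distance(q, exact_oracle)
assert all(d2 <= d1 + 1e-9 for d1, d2 in zip(dists, dists[1:]))  # monotone
assert dists[-1] < 0.3                      # converging toward the polytope
```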
Analytic lower bound for v_c. We now discuss explicitly the procedure we implemented in order to obtain the new lower bound of 682/1000 on v_c = 1/K_G(3). It is important to note that, while our procedure is based on implementing the above methods on a computer (hence giving a numerical result), the final bound is proven analytically. This is done as follows.
The finite set of measurement settings we use is based on a family of polyhedra parameterized by an integer n, which results in m = n² (m = n² − n + 1) vertices for n odd (even). The shrinking factor of this polyhedron is given by η = cos²(π/2n). Both the construction of the polyhedra in terms of unit vectors û(i_1, i_2) and the proof of the value of the shrinking factor can be found in Appendix B. We could implement the calculation up to n = 25 (i.e. m = n² = 625 settings), for which we find the lower bound (2).
We set the initial visibility of the Werner state to v 0 = 689/1000. Combined with the above m = 625 measurement settings (for both Alice and Bob), we obtain the target quantum point q.
After 23 × 10⁶ iterations of the algorithm, which were completed on a standard desktop PC within a week, we obtain numerically a point q_ε such that ‖q − q_ε‖ ≤ 9.8484 × 10⁻⁶. We then truncate q_ε to k = 16 digits, which results in the rational point q_r. Note that q_r is local by construction; see Appendix C. We now have that

q = q_r + q_junk,   (12)

where q_junk takes care of the (small) difference between the analytical points q and q_r. Let us now slightly shrink the point q towards the centre of the local correlation polytope (the origin) by rescaling q with a factor ν close to (but strictly smaller than) 1:

ν q = ν q_r + ν q_junk = ν q_r + (1 − ν) x.   (13)

Clearly, we see that ν q is provenly local if the point x = ν q_junk/(1 − ν) is local as well. By rearranging (13), we have that

x = ν (q − q_r)/(1 − ν),   (14)

where each entry of x has an analytical form. Note that all the components of x are expected to be small (in norm), since the points q and q_r are very close (note that ν will be chosen such that the factor ν/(1 − ν) is not too large). In fact, it can be proven that the point x is local, using the following result:

Lemma 1. If Σ_{x,y} |z(x, y)| ≤ 1, the correlation point z is local.

The proof, as well as more details on this analysis, can be found in Appendix C.
Next, setting ν = 999/1000, we verify that the condition

Σ_{x,y} |x(x, y)| ≤ 1   (15)

holds, implying that the point ν q is local via the above lemma.
To summarize, we obtain the following lower bound on the critical visibility:

v_c ≥ η² ν v_0 > 0.682,   (16)

where η = cos²(π/50) is the shrinking factor of the polyhedron, ν = 999/1000, and v_0 = 689/1000. We provide a Mathematica file which gives the points q and q_r and checks the validity of condition (15), as well as a file containing all the data for the proof. This material is available online [29].

LHV MODEL FOR POVMS
We also provide a better LHV model for Werner states considering arbitrary local measurements, i.e. moving from projective measurements to general POVMs. Specifically, we give a model for v = 341/750 ≃ 0.4547. This improves on a previous model of Barrett [5], which reached v = 5/12 ≃ 0.4167.
The construction of the model is based on the following argument. Essentially, any qubit POVM mixed with a sufficient amount of white noise, i.e. with visibility µ below a certain threshold, can be expressed as a convex combination involving only projective qubit measurements. Therefore, if the statistics of a Werner state ρ(v) admit a LHV model for projective measurements, then so do the statistics of such noisy POVMs. Again, this follows from the linearity of the trace. In turn, this implies that the statistics of arbitrary (noiseless) POVMs on the more noisy Werner state ρ(µv) are local.
More formally, we can make the following statement.

Lemma 2. Any noisy qubit POVM M(µ) whose elements {M_i(µ)}_{i=1,...,4} arise from a POVM with elements proportional to rank-1 projectors can, for µ = 2/3 − ε, be written as a convex sum of projective measurements, where ε may be arbitrarily close to zero.
The proof is deferred to Appendix D. The above value µ = 2/3, together with our lower bound v = 682/1000 on v_c, implies a LHV model for POVMs on Werner states up to visibility µv = (2/3)(682/1000) = 341/750. We also refer to an independent related work [30], in particular for an alternative proof of Lemma 2.
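One elementary identity behind such arguments is that, for the Werner state, white noise on one party's POVM elements is absorbed directly into the state visibility, ρ(v) with v → µv (this uses only that the Werner state's marginals are maximally mixed). An illustrative numerical check, with a tetrahedral POVM for Alice and an arbitrary projective measurement for Bob:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
psi_m = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def werner(v):
    return v * np.outer(psi_m, psi_m.conj()) + (1 - v) * np.eye(4) / 4

def bloch(n):
    return n[0] * X + n[1] * Y + n[2] * Z

# Tetrahedral rank-1 POVM for Alice: M_i = (I + m_i.sigma)/4.
ms = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
M = [(np.eye(2) + bloch(m)) / 4 for m in ms]

def noisy(E, mu):
    """Noisy effect: mu*E + (1-mu) Tr[E] I/2."""
    return mu * E + (1 - mu) * np.trace(E) * np.eye(2) / 2

mu, v = 2 / 3, 0.682
rng = np.random.default_rng(2)
n = rng.normal(size=3)
n /= np.linalg.norm(n)
N = [(np.eye(2) + b * bloch(n)) / 2 for b in (+1, -1)]  # Bob: projective

for Mi in M:
    for Nj in N:
        lhs = np.trace(werner(mu * v) @ np.kron(Mi, Nj))
        rhs = np.trace(werner(v) @ np.kron(noisy(Mi, mu), Nj))
        assert abs(lhs - rhs) < 1e-12   # one-sided noise = noisier state
```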

DISCUSSION
We have presented better LHV models for Werner states, as well as a new upper bound on the Grothendieck constant of order 3. The methods we develop provide analytical bounds, which will converge to the exact value of v c (and K G (3)) using increased computational power.
Clearly, these methods can be applied to construct LHV models for other classes of entangled states, in particular in higher Hilbert space dimensions. It would also be interesting to adapt the present technique to construct local hidden state models, a specific class of LHV models relevant in the context of quantum steering [7]. Finally, these methods could also be used to obtain bounds on other Grothendieck constants. While this looks computationally challenging at first sight, taking advantage of symmetry arguments could lead to progress.
Note added. In a related work, the authors of Ref. [30] also presented a better LHV model for Werner states for POVMs, achieving a visibility similar to ours.

APPENDIX A. DESCRIPTION OF THE HEURISTIC ORACLE
The oracle in Step 2 of the main text maximizes the overlap

S = (q − q_i)·l   (17)

over all l ∈ L, and is a heuristic one. It is first noted that it is enough to maximize over the vertices D_λ of the set L, since L is a polytope. The objective S in Eq. (17), for a given strategy λ = (a_1, a_2, ..., a_m, b_1, b_2, ..., b_m) and corresponding vertex D_λ, reads

S_λ = Σ_{x,y} (q(x, y) − q_i(x, y)) a_x b_y.   (18)

The number of vertices (and distinct strategies λ) is 2^{2m}, hence evaluating S_λ for every λ and picking the largest value is clearly not tractable in our range of m > 600. Therefore, instead of a brute-force computation, we have to resort to a heuristic method. Note that a heuristic still suffices, since the role of l_i is merely to provide a direction along which q_i can move towards a better point. The iterative algorithm is as follows.
1. Choose random assignments {a_x = ±1}_x for the deterministic strategy.

2. Fixing {a_x}_x, maximize S as a function of the b_y. This amounts to setting b_y = +1 if Σ_x (q(x, y) − q_i(x, y)) a_x > 0, and b_y = −1 otherwise, for all y = 1, ..., m.

3. Fixing {b_y}_y, maximize S as a function of the a_x. This amounts to setting a_x = +1 if Σ_y (q(x, y) − q_i(x, y)) b_y > 0, and a_x = −1 otherwise, for all x = 1, ..., m.

4. Go back to Step 2 until convergence of S is reached.
Note that in each iteration step the value of the objective S is guaranteed not to decrease. However, the algorithm may easily get stuck in a non-optimal S. To make the iterative procedure more reliable, we run it several times (say, 100 times) starting from different random seeds, and pick the solution λ, with corresponding vertex D_λ, attaining the largest value of S.
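The alternating maximization can be sketched compactly. The illustrative code below (with hypothetical parameter choices, e.g. 20 restarts rather than 100) also checks on a small instance that the heuristic never exceeds the exact brute-force optimum:

```python
import itertools
import numpy as np

def heuristic_oracle(g, n_restarts=20, rng=None):
    """Alternating maximization of S = sum_xy g(x,y) a_x b_y over a, b in {-1,+1}^m."""
    rng = rng or np.random.default_rng(0)
    m = g.shape[0]
    best_val, best_vertex = -np.inf, None
    for _ in range(n_restarts):
        a = rng.choice([-1.0, 1.0], size=m)
        for _ in range(100):
            b = np.sign(g.T @ a)
            b[b == 0] = 1.0                        # optimal b given a
            a_new = np.sign(g @ b)
            a_new[a_new == 0] = 1.0                # optimal a given b
            if np.array_equal(a_new, a):
                break                              # converged to a fixed point
            a = a_new
        val = a @ g @ b
        if val > best_val:
            best_val, best_vertex = val, np.outer(a, b)
    return best_val, best_vertex

# Sanity check against brute force on a small random instance (m = 6).
rng = np.random.default_rng(3)
g = rng.normal(size=(6, 6))
brute_val = max(np.abs(g.T @ np.asarray(a)).sum()
                for a in itertools.product([-1.0, 1.0], repeat=6))
heur_val, vertex = heuristic_oracle(g)
assert heur_val <= brute_val + 1e-9     # a heuristic never exceeds the optimum
assert set(np.unique(vertex)) <= {-1.0, 1.0}   # returns a valid vertex
```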

APPENDIX B. A FAMILY OF POLYHEDRA
The polyhedra are parameterized by an integer n. For n odd, the vertices û(i_1, i_2) are given by

û(i_1, i_2) = (cos(i_1 π/n) cos(i_2 π/n), sin(i_1 π/n) cos(i_2 π/n), sin(i_2 π/n)),   (19)

where i_1, i_2 ∈ {0, 1, ..., n − 1}, plus their antipodal points. This amounts to m = n² measurement settings. For n even, the number of vertices up to inversion (i.e. the number of settings) is m = n² − n + 1, due to the redundancy of some of the vertices.
It is easy to derive an analytical expression for the shrinking factor associated with this polyhedron, which is given by η = cos²(π/2n). Indeed, let us fix i_1, in which case the vertices ±û(i_1, i_2), i_2 = 0, ..., n − 1, define a regular 2n-gon in a two-dimensional plane, with a planar shrinking factor of η_2d = cos(π/2n). Any point û on the unit sphere can be written as a convex combination of points in two neighbouring planes, defined by some i_1 ∈ {0, 1, ..., n − 1} and i_1 + 1 (mod n). In the worst case, the point û lies just midway between these two planes, in which case we get the lower bound η = η_2d cos(π/2n) = cos²(π/2n) on the three-dimensional shrinking factor.
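Both ingredients, unit-norm vertices and the planar 2n-gon factor cos(π/2n), are easy to check numerically; an illustrative sketch for odd n:

```python
import numpy as np

def polyhedron(n):
    """Vertices u(i1, i2) of the family, for odd n (m = n^2 settings)."""
    verts = []
    for i1 in range(n):
        for i2 in range(n):
            c1, s1 = np.cos(i1 * np.pi / n), np.sin(i1 * np.pi / n)
            c2, s2 = np.cos(i2 * np.pi / n), np.sin(i2 * np.pi / n)
            verts.append([c1 * c2, s1 * c2, s2])
    return np.array(verts)

n = 5
V = polyhedron(n)
assert V.shape == (n * n, 3)                         # m = n^2 settings
assert np.allclose(np.linalg.norm(V, axis=1), 1.0)   # unit Bloch vectors

# Planar shrinking factor of a regular 2n-gon: the midpoint of the chord
# between two neighbouring vertices lies at distance cos(pi/2n) from the centre.
theta = np.pi / n                                    # angle between neighbours
mid = 0.5 * (np.array([1.0, 0.0]) + np.array([np.cos(theta), np.sin(theta)]))
assert abs(np.linalg.norm(mid) - np.cos(np.pi / (2 * n))) < 1e-12
```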

APPENDIX C. GOING FROM NUMERICAL TO EXACT PRECISION
The algorithm described in the main text does not provide us precisely the point q, but only a point q_ε which is very close to it, say

‖q − q_ε‖ ≤ ε.   (20)

Note that q_ε = Σ_λ w_λ D_λ, where the (positive) weights w_λ coming from the algorithm are given in double-precision format, whereas the D_λ are deterministic strategies with ±1 entries. In order to provide an analytical proof, we first transform the positive floating-point numbers w_λ into positive rationals w_λ^r. To this end, we use the truncation

w_λ^r = ⌊10^k w_λ⌋/10^k,   (21)

where k denotes the number of digits kept behind the decimal point. Then we renormalize and obtain the rational point

q_r = (Σ_λ w_λ^r D_λ)/(Σ_λ w_λ^r),   (22)

which is local by construction. Then we can write

q = q_r + q_junk,   (23)

where q_junk takes care of the (small) difference between the analytical points q and q_r. Let us now slightly shrink the point q towards the centre of the local polytope by rescaling q with a factor ν smaller than 1:

ν q = ν q_r + ν q_junk = ν q_r + (1 − ν) x.   (24)

From the right-hand side expression, it is clear that ν q is provenly local if the point x = ν q_junk/(1 − ν) is local as well. By rearranging (24), we have

x = ν (q − q_r)/(1 − ν),   (25)

where each entry of x has an analytical form. Moreover, due to equation (20), the entries of x are typically small for small ε, provided ν is not extremely close to 1. This suggests an easy test to decide whether x is local. Namely:

Lemma 1. If Σ_{i,j} |z_{ij}| ≤ 1, the correlation point z is local.

Proof. The proof is based on an explicit decomposition in terms of the local points ±E_{i,j} and E_0, where all entries of the point E_{i,j} are zero except entry (i, j), which takes the value +1, and E_0 stands for the point with all entries zero. Writing

z = Σ_{i,j} |z_{ij}| sign(z_{ij}) E_{i,j} + (1 − Σ_{i,j} |z_{ij}|) E_0,

it follows from the positivity of the weights in this decomposition that z admits a LHV model if Σ_{i,j} |z_{ij}| ≤ 1, as claimed. Note also that Σ_{i,j} |z_{ij}| ≤ 1 entails that all entries are bounded by ±1, hence such a z is a valid correlation point by definition.
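The exact-arithmetic pipeline (truncation, renormalization, and the Σ|x_{ij}| ≤ 1 test) can be illustrated on a toy m = 2 instance with hypothetical floating-point weights; Python's Fraction type plays the role of the rational arithmetic used in the actual proof:

```python
from fractions import Fraction

def truncate(w, k=16):
    """Keep k digits behind the decimal point, truncating toward zero."""
    scale = 10 ** k
    return Fraction(int(w * scale), scale)

# Toy correlation polytope for m = 2: vertices D(x,y) = a_x * b_y.
D1 = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(1)]]    # a = b = (+1, +1)
D2 = [[Fraction(1), Fraction(-1)], [Fraction(-1), Fraction(1)]]  # a = b = (+1, -1)

# Floating-point weights, as a numerical solver might return them.
w = [0.30000000000000071, 0.69999999999999929]
wr = [truncate(wi) for wi in w]
norm = sum(wr)
q_r = [[sum(wri * D[i][j] for wri, D in zip(wr, (D1, D2))) / norm
        for j in range(2)] for i in range(2)]        # local by construction

# Analytic target point q = (3/10) D1 + (7/10) D2.
q = [[Fraction(1), Fraction(-2, 5)], [Fraction(-2, 5), Fraction(1)]]

nu = Fraction(999, 1000)
x = [[nu * (q[i][j] - q_r[i][j]) / (1 - nu) for j in range(2)] for i in range(2)]
total = sum(abs(x[i][j]) for i in range(2) for j in range(2))
assert total <= 1    # Lemma 1 applies: nu*q is provably local
```

All quantities after truncation are exact rationals, so the final comparison is a rigorous (machine-checkable) inequality rather than a floating-point estimate.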

APPENDIX D. DECOMPOSING NOISY QUBIT POVMS IN TERMS OF PROJECTORS
An extremal qubit POVM M can be characterized as follows: it has no more than four elements {M_i}, i = 1, 2, 3, 4, such that each element M_i is proportional to a rank-1 projector [31]. Let us define the vector of Pauli matrices σ = (σ_x, σ_y, σ_z) and write the POVM elements in the form

M_i = a_i 𝟙 + ⃗a_i·σ,   (26)

where a_i = |⃗a_i|, Σ_i a_i = 1, and Σ_i ⃗a_i = 0. Similarly, we define the elements of a noisy POVM as

M_i(µ) = a_i 𝟙 + µ ⃗a_i·σ,   (27)

where 0 ≤ µ ≤ 1. Note that µ = 1 corresponds to the noiseless case. Also, an arbitrary rank-1 projective qubit measurement P can be described by the two elements

P_± = (𝟙 ± b̂·σ)/2,   (28)

where b̂ is a unit vector.
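As a concrete instance of these definitions (illustrative, not from the paper), the tetrahedral rank-1 POVM with a_i = 1/4 can be constructed and its noisy version checked for normalization and positivity:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_dot(a):
    return a[0] * X + a[1] * Y + a[2] * Z

# Tetrahedral extremal POVM: M_i = a_i*I + vec(a_i).sigma with a_i = 1/4.
dirs = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
a = [0.25] * 4
vecs = [ai * d for ai, d in zip(a, dirs)]      # |vec(a_i)| = a_i, i.e. rank 1

assert np.allclose(sum(vecs), 0)               # sum_i vec(a_i) = 0
assert abs(sum(a) - 1) < 1e-12                 # sum_i a_i = 1

def noisy_element(ai, vi, mu):
    """M_i(mu) = a_i*I + mu*vec(a_i).sigma."""
    return ai * np.eye(2) + mu * pauli_dot(vi)

mu = 2 / 3
Ms = [noisy_element(ai, vi, mu) for ai, vi in zip(a, vecs)]
assert np.allclose(sum(Ms), np.eye(2))         # normalization preserved
for Mi in Ms:
    assert min(np.linalg.eigvalsh(Mi)) > -1e-12    # positivity preserved
```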
With these definitions, we can restate Lemma 2 of the main text: any noisy qubit POVM M(µ) with elements {M_i(µ)}_{i=1,...,4} of the above form can, for µ = 2/3 − ε, be written as a convex sum of projective measurements P^(k), where ε may be arbitrarily close to zero and k may run up to infinity.
In order to prove it, we start with the following lemma.
Lemma 3. Any noisy qubit POVM M(µ) of the above form, with µ = 2/3 − ε, can be decomposed as

M(µ) = p P + (1 − p) M'(µ),   (30)

where P is a projective measurement, M'(µ) is another noisy POVM of the same form, and p is strictly larger than zero if ε > 0.

Note that the above lemma already provides us with a constructive method to prove Lemma 2: we start with M(µ) and use Lemma 3 to decompose it as

M(µ) = p_1 P^(1) + (1 − p_1) M^(1)(µ),   (32)

with p_1 > 0. Then we run the protocol again, starting with M^(1)(µ), to get the decomposition

M^(1)(µ) = p_2 P^(2) + (1 − p_2) M^(2)(µ).   (33)

Running this iterative procedure up to n times, we get a decomposition of M(µ) in terms of n projective measurements P^(k), k = 1, 2, ..., n, and a POVM M^(n)(µ) carrying an overall weight of Π_{k=1}^n (1 − p_k). If we demand each p_k to be maximal at each step k, then, as n goes to infinity, this weight must go to zero. For, if it did not, there would be a subsequence of (M^(k)(µ))_k converging to some POVM M̃(µ) with the property that it cannot be decomposed further, a contradiction. Therefore one arrives at a decomposition of M(µ) in terms of projective measurements only.
We are now left with a proof of Lemma 3. To this end, we state another lemma.
Lemma 4. Given four nonzero vectors ⃗a_i, i = 1, 2, 3, 4, in three-dimensional Euclidean space which sum up to zero, the relation

⃗a_i·⃗a_j ≤ −a_i a_j/3   (34)

holds true for at least one pair (say, i and j), where we define a_i = |⃗a_i|. In other words, one can always pick two vectors for which the angle θ_ij between them satisfies θ_ij ≥ arccos(−1/3). Note the special case of the regular tetrahedron, for which each angle θ_ij between the vectors pointing to the vertices equals arccos(−1/3).
Proof. The proof is by contradiction. Suppose that the lemma is false, so that ⃗a_i·⃗a_j > −a_i a_j/3 for every pair; in particular,

⃗a_1·⃗a_4 > −a_1 a_4/3,   ⃗a_2·⃗a_4 > −a_2 a_4/3,   ⃗a_3·⃗a_4 > −a_3 a_4/3.
Summing up the above three inequalities, the left-hand side equals ⃗a_4·(⃗a_1 + ⃗a_2 + ⃗a_3) = −a_4² upon plugging in ⃗a_4 = −(⃗a_1 + ⃗a_2 + ⃗a_3), so that a_4² < (a_1 + a_2 + a_3) a_4/3, i.e. a_1 + a_2 + a_3 > 3a_4. Choosing, without loss of generality, the ordering a_4 ≥ a_3 ≥ a_2 ≥ a_1 (the labelling of the largest vector as ⃗a_4 is arbitrary), the above relation is clearly not true.
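Lemma 4 is easy to probe numerically (a sanity check, of course, not a proof): random zero-sum families of four vectors always contain a pair at angle at least arccos(−1/3), with equality attained by the regular tetrahedron:

```python
import numpy as np

def min_normalized_overlap(vectors):
    """Smallest a_i.a_j/(|a_i||a_j|) over all pairs of the family."""
    best = np.inf
    k = len(vectors)
    for i in range(k):
        for j in range(i + 1, k):
            ni, nj = np.linalg.norm(vectors[i]), np.linalg.norm(vectors[j])
            best = min(best, vectors[i] @ vectors[j] / (ni * nj))
    return best

rng = np.random.default_rng(4)
for _ in range(200):
    a = rng.normal(size=(3, 3))                           # three random vectors
    four = np.vstack([a, -a.sum(axis=0, keepdims=True)])  # fourth closes the sum
    if min(np.linalg.norm(v) for v in four) < 1e-6:
        continue                                          # skip degenerate draws
    assert min_normalized_overlap(four) <= -1 / 3 + 1e-9  # Lemma 4

# Equality case: the regular tetrahedron, all pairwise overlaps equal -1/3.
tet = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
assert abs(min_normalized_overlap(tet) + 1 / 3) < 1e-12
```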
With the above tools, we are ready to prove Lemma 3, which in turn proves Lemma 2. To this end we write out equation (30) for each POVM element, assuming without loss of generality that the pair (i, j) satisfying equation (34) is (i, j) = (1, 2), and that the projective measurement P is taken along b̂ = e_x, with its two outcomes assigned to the first two POVM outcomes. Then we have

a_1 = p/2 + (1 − p) c_1,   a_2 = p/2 + (1 − p) c_2   (36)

for the scalar terms, and

µ ⃗a_1 = p e_x/2 + (1 − p) µ ⃗c_1,   µ ⃗a_2 = −p e_x/2 + (1 − p) µ ⃗c_2   (37)

for the vectors, where the other POVM M'(µ) in (30) is defined by the elements M'_i(µ) = c_i 𝟙 + µ ⃗c_i·σ. Let us denote the angle between ⃗a_1 and ⃗a_2 by θ_12, which we write in terms of two positive angles θ_1, θ_2, yet to be determined, as θ_12 = θ_1 + θ_2. After an appropriate rotation of the coordinate system, we can use the parametrization

⃗a_1 = a_1 (sin θ_1, 0, cos θ_1),   ⃗a_2 = a_2 (−sin θ_2, 0, cos θ_2),   (38)

and we can also assume without loss of generality that a_1 ≥ a_2.
Using the first equation of (37), we separate the e_x and e_z terms, which results in the two equations

µ a_1 sin θ_1 = p/2 + (1 − p) µ c_{1x},   µ a_1 cos θ_1 = (1 − p) µ c_{1z},   (39)

where we define ⃗c_i = c_{ix} e_x + c_{iz} e_z for i = 1, 2. Combining (39) with the first equation of (36) and the requirement that M'_1(µ) remain of the rank-1 form, i.e. c_1 = |⃗c_1|, p can be expressed as

p = (4 a_1 µ/(1 − µ²)) (sin θ_1 − µ).   (40)

Similarly, from the second equations of (36) and (37), we arrive at

p = (4 a_2 µ/(1 − µ²)) (sin θ_2 − µ).   (41)
Given θ_12 and a_1 ≥ a_2 > 0 (which define M(µ)), our goal is to find a p strictly greater than zero. Note that, due to Lemma 4, we can also assume that θ_12 ≥ arccos(−1/3). Note also that a_1 = 0 entails a_2 = 0, in which case M(µ) is already a projective measurement and the proof can be finished. On the other hand, if a_2 = 0 we go back to the case of a three-outcome POVM, which has to be treated similarly to the general four-outcome situation and will be discussed later.
Let us split the general case a 1 ≥ a 2 into a 1 = a 2 and a 1 > a 2 .
Altogether, we obtain the result that for a_1 ≥ a_2 > 0, p is strictly larger than zero whenever µ = 2/3 − ε, where ε can be arbitrarily small.
Let us now come back to the situation where a_2 = 0. In that case, we get the very same equations for p as in Eqs. (40) and (41), with the only exception that instead of Lemma 4 we have:

Lemma 5. Given three nonzero vectors ⃗a_i, i = 1, 2, 3, in three-dimensional Euclidean space which sum up to zero, the relation

⃗a_i·⃗a_j/(a_i a_j) ≤ −1/2   (43)

holds true for at least one pair (say, i and j), where we define a_i = |⃗a_i|.
The proof is analogous to that of Lemma 4. The new condition (43), however, implies the even larger bound µ_crit = √3/2. Hence the case a_2 = 0 is solved as well. As a side result, we also obtain a statement for three-outcome POVMs:

Lemma 6. Any noisy three-outcome qubit POVM M(µ) with elements {M_i(µ)}, i = 1, 2, 3, of the above form can, for µ = √3/2 − ε, be written as a convex sum of projective measurements P^(k), where ε may be arbitrarily close to zero and k may run up to infinity.