A refinement of Reznick’s Positivstellensatz with applications to quantum information theory

In his solution of Hilbert’s 17th problem Artin showed that any positive definite polynomial in several variables can be written as the quotient of two sums of squares. Later Reznick showed that the denominator in Artin’s result can always be chosen as an N-th power of the squared norm of the variables and gave explicit bounds on N. By using concepts from quantum information theory (such as partial traces, optimal cloning maps, and an identity due to Chiribella) we give simpler proofs and minor improvements of both real and complex versions of this result. Moreover, we discuss constructions of Hilbert identities using Gaussian integrals and we review an elementary method to construct complex spherical designs. Finally, we apply our results to give improved bounds for exponential quantum de Finetti theorems in the real and in the complex setting.

1 Introduction
In the same way that the Nullstellensatz is fundamental for complex algebraic geometry, so-called Positivstellensätze are important in real algebraic geometry [1,2]. A Positivstellensatz [3,4] states that a polynomial in d real variables which is non-negative on some subset of R^d is related in some prescribed way to sums of squares (SOS), which are special polynomials guaranteed by definition to be non-negative. Most such results consider polynomials which are non-negative on semialgebraic sets (sets where a finite number of polynomials are non-negative), and others need a (strict) positivity guarantee (e.g. Schmüdgen's [5] and Putinar's [6] Positivstellensätze). In this work, we shall focus on results close to Artin's solution to Hilbert's 17th problem [7]: For any strictly positive homogeneous polynomial p in d real variables, there exist two sum-of-squares polynomials h, q such that hp = q.
In his seminal work [8], Reznick showed that h can be taken of the form h(x) = ∥x∥^{2N}, giving also bounds on N in terms of the number of variables, the degree, and a certain measure of positivity of p. We re-prove results of this type, both in the real [8] and in the complex [9] cases, using techniques from quantum information theory.
The tools from quantum information theory we employ are related mainly to the entanglement theory of symmetric, multi-partite quantum states. A great introduction to the main ideas and techniques we deploy is Harrow's preprint [10]. We also develop the parallel theory in the real case, which is less known than the complex variable case. Our main technical insight is an explicit inversion of a well-known identity due to Chiribella [11] relating three sequences of quantum maps: the measure-and-prepare maps, the partial traces, and the approximate cloning maps.
The main contribution of this work is to make precise the deep connection between Reznick-type Positivstellensätze and quantum information theory by recasting the classical proofs of the former in the linear algebraic language of the latter. As a byproduct of our careful analysis of this correspondence, we slightly improve the bounds on the exponents needed in the Positivstellensätze and in exponential de Finetti theorems, following [10].
When finishing our article we learned of the recent work by Fang and Fawzi [12] improving the convergence rates of sums-of-squares hierarchies by polynomial techniques related to Reznick's ideas (see also [13,14] for other papers analyzing the speed of convergence of SDP hierarchies for polynomial optimization). While our work is also based on these ideas, our focus is quite different. Instead of estimating when a polynomial is a sum-of-squares we are interested in the particular form of the decomposition that is central in Reznick's work. However, it would be interesting to see if the techniques of Fang and Fawzi could also lead to new results in this direction. We shall keep this question for future study.
Our paper is organized as follows. In Section 2 we introduce the correspondence between bi-Hermitian homogeneous multi-variable polynomials and Hermitian operators acting on the symmetric subspace of a tensor power, emphasizing the direct correspondence between analytical and algebraic operations. Sections 3 and 4 contain the proofs of the complex, resp. real Positivstellensätze. In Section 5 we discuss exponential de Finetti theorems. The Appendices contain results on Hilbert identities and complex spherical designs used in the proofs.

Preliminaries
In this section we set the stage for the proof of our main result, the complex Positivstellensatz in Theorem 3.1. We do so by discussing the folklore connection between bi-Hermitian forms and Hermitian operators acting on the symmetric subspace of a tensor power of a finite dimensional complex Hilbert space. We then relate various linear algebraic operations on these operators to natural maps on the corresponding polynomials. We also discuss the only purely analytical tool used in this paper to establish both the complex and the real Positivstellensätze, the Bernstein inequality in Lemmas 2.5 and 2.6.
We shall denote by ∨^n C^d the symmetric subspace of the tensor product (C^d)^{⊗n}, and by B(∨^n C^d) and H(∨^n C^d) the spaces of (bounded) linear operators and Hermitian operators, respectively, from ∨^n C^d to itself. The space ∨^n C^d is spanned by the family {x^{⊗n} : x ∈ C^d}, see [15,Section I.5] or [10,Theorem 3]. Importantly, we denote by d[n] the dimension of the symmetric subspace: d[n] := dim ∨^n C^d = \binom{n+d−1}{n}. We shall use S^{d−1} to denote the complex unit sphere of C^d, and S_n to denote the permutation group on n elements. We shall also use the falling factorial notation (x)_p := x(x−1)⋯(x−p+1) for real x and integer p ≥ 1. We use the bra-ket notation from quantum mechanics, denoting e.g. by |x⟩⟨y| the rank-one linear operator xy^*.
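The dimension d[n] can be checked by a direct count. The sketch below assumes the standard closed form d[n] = \binom{n+d−1}{n} and compares it with the number of multisets of n indices drawn from {0, ..., d−1}, which label a basis of the symmetric subspace. The function names are illustrative only and do not come from the paper.

```python
from itertools import combinations_with_replacement
from math import comb


def sym_dim(d: int, n: int) -> int:
    """Closed form (assumed standard) for the dimension of the symmetric subspace."""
    return comb(n + d - 1, n)


def sym_dim_by_counting(d: int, n: int) -> int:
    """Count multisets of size n over d basis indices; each labels one basis vector."""
    return sum(1 for _ in combinations_with_replacement(range(d), n))


if __name__ == "__main__":
    for d in range(1, 5):
        for n in range(0, 5):
            print(d, n, sym_dim(d, n), sym_dim_by_counting(d, n))
```

For example, two qubits (d = 2, n = 2) give a three-dimensional symmetric (triplet) subspace.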

Polynomials and operators acting on the symmetric subspace
For any Hermitian operator W = W * ∈ H(∨ k 1 C d 1 ⊗ . . . ⊗ ∨ k l C d l ) we consider the corresponding bi-Hermitian form in the complex variables x 1 ∈ C d 1 , . . . , x l ∈ C d l , The terminology "bi-Hermitian" used above refers to the fact that the form p W , taking as input l vectors, is of homogeneous degree k i in x i and x̄ i ; moreover, it determines the operator W uniquely. Therefore, we shall often switch between the "operator picture" involving W and the equivalent "polynomial picture" involving p W . We introduce the following notation for the extremal values of p W on the unit sphere of each of the l variable sets: An important special case we shall consider is the case l = 1, in which we often write and p W is called the Q function [16]. In this case, W is called block-positive iff m(W ) ≥ 0, that is, if the corresponding polynomial has non-negative range. As a more general case we consider l = 2 and k 2 = 1, i.e. where the variables x 2 do not appear in the special tensor-product structure in p W . In this case, we furthermore denote x 2 ≡ y, and d 2 ≡ D (see Section 3); of course, this reduces to the previous case upon setting D = 1, y = 1.
For n ≥ k we denote by tr n→k : B ∨ n C d → B ∨ k C d the partial trace erasing n − k systems. In the polynomial picture the partial trace reduces to a differential operator given in terms of the Laplacian where we formally treat x i and x̄ i for i ∈ {1, . . . , d} as independent variables.
Proof. Recall that the set {|v⟩⟨v| ⊗k : |v⟩ ∈ C d } spans H ∨ k C d (see [10,Eq. 11b]). Therefore, it will be sufficient to show the lemma for the corresponding set of bi-Hermitian forms {p |v⟩⟨v| ⊗k (x) = |⟨x|v⟩| 2k }. Note that on one hand On the other hand Direct comparison of the two expressions shows that Finally, by iterating the previous formula the statement of the lemma follows.

Spherical designs
In order to have discrete versions of our main result, the complex Positivstellensatz from Theorem 3.1, we need the following relaxation of the uniform measure on the complex unit sphere. The real case has a long history in mathematics and computer science [17], while the complex case has received a lot of interest due to applications in quantum information theory [18].
sym is the orthogonal projection on the symmetric subspace ∨ n C d ⊆ (C d ) ⊗n .
Integrating a polynomial of degree at most n in φ ∈ C^d and degree at most n in φ̄ with respect to a spherical n-design therefore yields the same result as integrating with respect to the Haar measure (the unique unitarily invariant probability measure on S^{d−1}), which is a spherical n-design for any n ∈ N. But whereas the Haar measure is non-atomic, there exist, for any n < ∞, discrete spherical n-designs supported on a finite number of points, so that integrals turn into finite (weighted) sums; in Appendix B we show how to construct a complex spherical n-design supported on (n + 1)^{2d} points.
Figure 1: Graphical representation of the correspondence between self-adjoint operators W acting on the symmetric subspace, and polynomials. From left to right, we have depicted the diagrams for p_W(z), ∥z∥^{2k} p_W(z), and ((n + k)_k)^{−2} ∆^k p_W(z) respectively, where ∆ is the complex Laplacian (1). This emphasizes in particular that multiplying with the norm and taking the iterated complex Laplacian are, up to constants, dual operations.

The measure-and-prepare map
The term measure-and-prepare map comes from quantum information theory, where linear maps of a similar form are known as quantum-classical channels, see [19,Sec. 4.6.6]. Physically, they can be seen as processes where the input is measured in some (possibly over-complete) basis, and then a specific output is prepared (cf. (2)).

Definition 2.3.
For n, k ∈ N, the measure-and-prepare map MP n→k : for any X ∈ B(∨ n C d ). Here d[n + k] denotes the dimension of the symmetric subspace ∨ n+k C d and dφ denotes the Haar measure on the unit sphere in C d (or any spherical (n + k)-design, see Definition 2.2).
Note that the measure-and-prepare map is completely positive, but in general it is neither trace-preserving nor unital. To make it trace-preserving one has to multiply with the scalar d[n]/d[n + k]. For any n, k ∈ N the measure-and-prepare map has the adjoint MP * n→k = MP k→n with respect to the Hilbert-Schmidt inner product. The adjoint of the partial trace with respect to the Hilbert-Schmidt inner product on B(∨ n C d ) is given by On the level of polynomials we have The adjoint of the partial trace map is equal, up to a factor, to the so-called "cloning channel" which is the best quantum-channel approximation to a quantum cloner, mapping k copies of a quantum state to n (approximate) copies, see [20]. The measure-and-prepare map satisfies the following remarkable identity involving partial traces and their adjoints, due to Chiribella [11,Eq. (6)] (see also [10,Theorem 7]): Theorem 2.4 (Chiribella identity [11]). For any n, k ∈ N we have Note that c(n, k, s) = c(k, n, s) and ∑_{s=0}^{min(n,k)} c(n, k, s) = 1.
For the sake of completeness we give the proof of the Chiribella identity presented in [10, Theorem 7].
Proof. For any a, b ∈ C d we have (the integral is, as usual, on the unit sphere of C d , and dφ is a spherical (n + k)-design): Above, we have used Lemma A.6 for the second equality and the definition of the projector onto the symmetric subspace as a sum of tensor-permutation matrices for the third equality. To see the fourth equality, note that, among the permutations σ ∈ S_{n+k}, precisely n! k! \binom{k}{s} \binom{n}{s} = (n + k)! c(n, k, s) of them yield ⟨b^{⊗k} ⊗ a^{⊗n}|P_σ|b^{⊗k} ⊗ a^{⊗n}⟩ = ∥a∥^{2(n−s)} ∥b∥^{2(k−s)} |⟨a|b⟩|^{2s}. The theorem then follows from the fact that the set {x^{⊗n} : x ∈ C^d} spans ∨^n C^d. Finally, the normalization condition ∑_{s=0}^{min(n,k)} c(n, k, s) = 1 is the well-known Vandermonde identity [21,Eq. (5.22)] given by ∑_{s=0}^{min(n,k)} \binom{n}{s} \binom{k}{s} = \binom{n+k}{k}.
Let us point out that Theorem 2.4 will play a central role in our approach to proving real and complex Positivstellensätze; the corresponding step in the original proofs from [8] and, respectively, [9], is played by Hobson's identity [22].
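The combinatorial step of the proof can be checked by brute force for small n and k. The sketch below assumes c(n, k, s) = \binom{n}{s}\binom{k}{s}/\binom{n+k}{k}, as read off from the counting statement in the proof, and places the k copies of b in the first k tensor slots. It counts permutations by the number s of b-slots that are sent to a-slots, which is the exponent of |⟨a|b⟩|² in the corresponding matrix element. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def c(n: int, k: int, s: int) -> Fraction:
    """Coefficient c(n, k, s), assumed to be C(n,s) C(k,s) / C(n+k,k)."""
    return Fraction(comb(n, s) * comb(k, s), comb(n + k, k))


def count_by_overlap(n: int, k: int) -> list:
    """counts[s] = number of permutations sending exactly s b-slots to a-slots."""
    counts = [0] * (min(n, k) + 1)
    for sigma in permutations(range(n + k)):
        # slots 0..k-1 carry b, slots k..n+k-1 carry a
        s = sum(1 for i in range(k) if sigma[i] >= k)
        counts[s] += 1
    return counts


if __name__ == "__main__":
    n, k = 3, 2
    counts = count_by_overlap(n, k)
    for s, cnt in enumerate(counts):
        print(s, cnt, factorial(n) * factorial(k) * comb(k, s) * comb(n, s))
    # Vandermonde: the coefficients sum to one
    print(sum(c(n, k, s) for s in range(min(n, k) + 1)))
```

The enumeration runs over (n + k)! permutations, so it is only meant for very small parameters.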

Bernstein inequalities
The last ingredient we need is a Bernstein-type inequality, relating the supremum of the Laplacian of some homogeneous polynomial to the supremum of the polynomial itself. Let us first recall the result in the real case (and, for convenience of our notation, only for polynomials of even degree 2k). Lemma 2.5 (Bernstein-type inequality, real case; [8]). For any W ∈ H(∨ k R d ) we have in the real variables x ∈ R d , and ∆ R denotes the Laplacian with respect to these d real variables.
We shall need later the following complex version of this result.
Proof. The proof is based on a reduction to the real case, and the fact that the "complex Laplacian" can be expressed in terms of a real Laplacian (depending on real and imaginary parts), as follows. Consider a polynomial q = q(z,z) = s,t≥0 q st z szt . Its (complex) Laplacian reads Writing now z = a + ib, with a, b ∈ R, and taking the "real Laplacian" of q with respect to a, b, we have Taking the partial derivatives in q(a + ib, a − ib), we obtain a relation which extends trivially to several complex variables. Going back to our complex polynomial p, we see it as a homogeneous polynomial of degree 2k in 2d real variables. Applying Lemma 2.5, we obtain It would be interesting to obtain a tighter Bernstein inequality in the complex case, without using the real Bernstein inequality.
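The reduction to the real case rests on the relation between the two Laplacians. A minimal sketch follows, assuming the complex Laplacian of a single variable is normalized as ∂²/∂z∂z̄, so that the real Laplacian in (a, b) equals four times the complex one; with a different normalization the constant changes accordingly. The code checks this exactly on monomials z^s z̄^t by expanding them in a and b. All names are illustrative.

```python
from math import comb

I_POWERS = [1, 1j, -1, -1j]  # exact powers of the imaginary unit


def expand(s: int, t: int) -> dict:
    """Coefficients of (a + ib)^s (a - ib)^t as {(deg_a, deg_b): coeff}."""
    poly = {}
    for u in range(s + 1):
        for v in range(t + 1):
            coeff = comb(s, u) * comb(t, v) * I_POWERS[u % 4] * I_POWERS[(-v) % 4]
            key = (s + t - u - v, u + v)
            poly[key] = poly.get(key, 0) + coeff
    return {key: cf for key, cf in poly.items() if cf != 0}


def real_laplacian(poly: dict) -> dict:
    """Apply d^2/da^2 + d^2/db^2 to a polynomial in (a, b)."""
    out = {}
    for (i, j), cf in poly.items():
        if i >= 2:
            out[(i - 2, j)] = out.get((i - 2, j), 0) + i * (i - 1) * cf
        if j >= 2:
            out[(i, j - 2)] = out.get((i, j - 2), 0) + j * (j - 1) * cf
    return {key: cf for key, cf in out.items() if cf != 0}


def complex_laplacian(s: int, t: int) -> dict:
    """d^2/(dz dzbar) of z^s zbar^t, i.e. s*t*z^(s-1) zbar^(t-1), expanded in (a, b)."""
    if s == 0 or t == 0:
        return {}
    return {key: s * t * cf for key, cf in expand(s - 1, t - 1).items()}


def relation_holds(s: int, t: int) -> bool:
    """Check real Laplacian = 4 * complex Laplacian on the monomial z^s zbar^t."""
    lhs = real_laplacian(expand(s, t))
    rhs = {key: 4 * cf for key, cf in complex_laplacian(s, t).items()}
    return lhs == rhs


if __name__ == "__main__":
    print(all(relation_holds(s, t) for s in range(6) for t in range(6)))
```

Since both Laplacians are linear, agreement on monomials extends to arbitrary polynomials q(z, z̄), and the several-variable case follows coordinate by coordinate.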

A Positivstellensatz for complex Hermitian bi-homogeneous polynomials
The following theorem is the main contribution of our paper.
Then, for any positive integer n ≥ k such that we have, for all x ∈ C d and y ∈ C D , where pW (φ, y) is a bi-homogeneous Hermitian form of bi-degree k in φ,φ and bi-degree 1 in y,ȳ, satisfying pW (φ, y) ≥ 0 for all φ ∈ C d and y ∈ C D , and explicitly computable in terms of W . Here, the measure dφ can be any (n + k)-design (see Definition 2.2) showing that ∥x∥ 2(n−k) p W (x, y) is a sum of squares. In the case k = 1, the bound (7) can be improved to Note that our main theorem covers a more general case than the one in [8]: the polynomials we consider have a set of extra D variables, in the spirit of Quillen's result from [23]; we refer the reader to the Introduction for historical considerations.
Before we prove Theorem 3.1, let us introduce one of the main technical ingredients we shall use. On B(∨ k C d ), we define the linear map By the Chiribella identity (see Theorem 2.4), this map is closely related to the measure-and-prepare map MP n→k introduced in (2). In fact we have, for n ≥ k, One of our main technical observations is that the map Φ (n) k→k has a particularly nice, explicit, compositional inverse: Then, we have on Proof. Since the map tr * k→n is injective (as, for n ≥ k, its adjoint tr n→k is surjective) and the map Φ (n) k→k is self-adjoint, the claim is equivalent to showing or, by taking adjoints, to the following equality of linear maps on which we are focusing next. We use the same idea as in the proof of Theorem 2.4: the equality above holds if and only if, when applying the maps to the element |α⟩⟨α| ⊗n and taking the scalar product with |β⟩⟨β| ⊗k , we obtain identical results, for all unit norm α, β ∈ C d . Letting x = |⟨α, β⟩| 2 ∈ [0, 1], we obtain, for the left hand side (using ∥α∥ = ∥β∥ = 1) ⟨β ⊗k | tr n→k (|α⟩⟨α| ⊗n )|β ⊗k ⟩ = ⟨β ⊗k |α ⊗k ⟩⟨α ⊗k |β ⊗k ⟩ = x k .
For the right hand side of (12), denoting we obtain (see the proof of Theorem 2.4 for the combinatorics in the penultimate line): We now compute the inner sum for each s = 0, . . . , k separately. Simple algebraic manipulations and the substitution t This shows that, for each α, β, both sides of (12) evaluate to the same quantity, namely x k , finishing the proof.
We have now all the ingredients to give the proof of our main result.
Proof of Theorem 3.1. Using Lemma 3.2 and the adjoint of (10) we have the following equality where the last equality holds since dφ is a (n + k)-design. Applying tr * k→n ⊗ id D to W and going to the polynomial picture gives thus the following equality (see also Eq. (4)): where we have set (note the explicit dependence ofW , and hence of pW , on the input data . To conclude, we need to determine when pW is a positive polynomial. To this end, we insert the expansion of Ψ (n) k→k from Lemma 3.2. This leads to where we used Lemma 2.1 in the last step; note that the (complex) Laplacian acts only on the first set of variables (corresponding to φ). Note also that for t = k, the corresponding summand contains p W , and the coefficient q(n, k, k) = (n + k)!(n − k)!(n!) −2 is positive. Using Lemma 2.6 to upper bound the absolute values of the remaining summands leads to (for ∥φ∥ = ∥y∥ = 1): which, after computing q(n, 1, 1) = (n + 1)/n and q(n, 1, 0) = −(n + 1)/(n(n + d)), yields (9).
For general k ≥ 1, we bound the negative term in the previous expression from above by the truncation of the Taylor expansion of a certain exponential function (we borrow the idea from the proof of [8,Theorem 3.11]). To do this, using the formula for q(n, k, t) from Lemma 3.2, an elementary computation for 0 ≤ t ≤ k − 2 gives first the following t-independent upper bound: where the inequality arises by setting t = k − 1 in the previous expression, which is an increasing function in t ∈ [0, ∞); note that the choice t = k − 1 is sub-optimal, leading to slightly worse but nicer final results, see Remark 3.3. Applying this estimate repeatedly in the previous sum and changing summation order leads to Inserting this estimate in (15) shows that pW (φ, y) ≥ 0 whenever n satisfies Using the fact that the function r → (e r − 1)/r is increasing, we find that it is sufficient that n satisfies, for some Γ > 0 to be determined later, in order for pW to be non-negative. Re-arranging terms, we obtain the following two sufficient conditions: We now choose Γ such that the two inequalities are identical, and the bound (7) follows. That the bound (9) is better in the case k = 1, can be easily seen from the inequality ln(1 + m/M ) ≤ m/M .
(instead of Eq. (7)) is sufficient for the conclusions of Theorem 3.1. In other words, Eq. (17) is a better bound (a weaker requirement on n) than (7). That (17) suffices, follows as in the proof of Theorem 3.1 when using d(k − 1)(2k − 3)/(n + d + k − 2) as the upper bound in (16), which is obtained by setting t = k − 2 in the preceding expression, which is the best bound one can obtain using the monotonicity in t. The proof proceeds by replacing r := d(k − 1)(2k − 3)/(n + d + k − 2), which still has to satisfy r ≤ Γ. Setting Γ = ln 1 + (k−1)(2k−3) M (W ) , one sees that the bound for n from (17) suffices. Remark 3.4 (Less stringent bounds on n). Our bounds (17) and (9) on n are better (less stringent) than the ones from [9, Theorem 1], even by roughly a factor of 2 for the case k = 1 in Eq. (9). This is because we used a better Bernstein-type inequality in the complex case (our Lemma 2.6) than these authors.
Even less stringent bounds on n may be found numerically, for given m(W ), M (W ), d and k, by searching for the smallest n ≥ k such that the expression in square brackets in (15) becomes nonnegative; Eqs. (9) and (17) guarantee that the search terminates. Note that everything in the proof after Eq. (15) was devoted merely to deriving the simple analytical expressions given in (9) and (17) as sufficient bounds on n.
One can even obtain somewhat better lower bounds on pW (φ, y) than the one given in (15) using an idea of [24], and these can again be used in analytical [24,Section 3.2] or numerical ways to obtain sufficient bounds on n for a positive representation. We compare all these analytical and numerical bounds for a real version of this result in Example 4.5.
Remark 3.5. Theorem 3.1 guarantees pW (φ, y) to be nonnegative for any φ, y provided n is sufficiently large. As p W is of homogeneous degree 1 in y and ȳ, we can write φ |y⟩| 2 for any φ, y. Inserting this into (8) and using the construction of discrete, finitely supported complex spherical designs (Appendix B), this shows constructively the existence of a sum-of-squares decomposition of the following special form (cf. (8)): where the sum over the unit vectors φ of the finite spherical design contains at most (n + k + 1)^{2d} terms and we have absorbed the weights into the normalization of the vectors w. The obtained sum-of-squares decomposition is thus very special, as each term is a 2n-th power of the absolute value of a linear form in x multiplied with the absolute square of a linear form in y.

The case of real polynomials
We turn now to the case of real polynomials, the classical setting of the study of positive polynomials and the corresponding Positivstellensätze [23,5,6,8]. It turns out that the proof strategy we developed for the complex case can be adapted to the real situation, yielding a small improvement of Reznick's Positivstellensatz [8,Theorem 3.12]. We shall outline the main steps below, but we now point out two important facts. The first one is that, in the real case, the objects and maps we need to set up do not have a direct interpretation in the language of quantum information (which is a theory built on the field of complex numbers). The second point is that our derivation in the real case follows closely the proof strategy from [8], once one moves from the language of polynomials to that of linear algebra; we shall emphasize which are the similarities and the (small) differences as we move on.
Let us first explain in detail the relation between the symmetric subspace and homogeneous polynomials in the real case. We denote by H n (R d ) the vector space of real homogeneous polynomials in d real variables, of degree n. To any symmetric vector v ∈ ∨ n R d we associate a homogeneous polynomial p v ∈ H n (R d ) of degree n given by Note that the correspondence between H n (R d ) and ∨ n R d introduced here is one-to-one, and we shall often switch between the "polynomial" and the "linear algebra" viewpoints. As an important example, consider the case of the square of the norm, which corresponds to the (un-normalized) maximally entangled state Ω d : Since we are interested in positive polynomials, we shall focus in what follows on the case of polynomials of even degree. By the correspondence above, a linear map F : ∨ 2n R d → ∨ 2k R d corresponds to a linear mapF : H 2n (R d ) → H 2k (R d ) and in the following we shall abuse notation by writing F (v) and F (p) for p = p v interchangeably. On the space ∨ 2n R d we can define a "partial trace" operation tr n→k : . When applied to polynomials, it turns out that the map tr n→k is, up to a constant, an iterated Laplacian: Theorem 3]. Therefore, the corresponding set of polynomials {p v ⊗2n (x) = ⟨x|v⟩ 2n } v∈R d spans H 2n (R d ) and it will be sufficient to show the lemma for this spanning set. Note that on one hand On the other hand tr n→(n−1) (v ⊗2n ) = ∥v∥ 2 v ⊗(2n−2) .
Finally, by iterating the previous formula the statement of the lemma follows.
Using the Hilbert space structure of ∨ 2n R d we can introduce the dual map of tr n→k , tr * k→n : ∨ 2k R d → ∨ 2n R d . It is easy to express this map in the "polynomial picture". For this consider v ∈ ∨ 2k R d and compute In other words, the map tr * (which is related to the cloning operation from quantum information theory in the complex setting) corresponds to multiplying a polynomial with an even power of the euclidean norm of the variable vector. We present the correspondence between symmetric tensors and homogeneous real polynomials, as well as the dual operations of multiplying with the norm and taking the Laplacian in Figure 2.
Let us now introduce the measure and prepare map in the real case: The choice of the normalization function is motivated by the following Chiribella-like identity; the proof is similar to the complex case, see Appendix A.1 for the corresponding Hilbert identity.  .
As in the complex case, we define the map This map is related to the measure-and-prepare map by the relation This is the decomposition [8, Theorem 3.7] in Reznick's work. It is based on Hobson's identity [22], which can be seen as the real, polynomial analogue of the Chiribella identity (cf. Theorem 2.4). Its compositional inverse is given by In Reznick's derivation, this corresponds to [8,Theorem 3.9]. We claim that our linear algebraic language is more elegant, but ultimately the two formulations are equivalent. The proof of Theorem 3.1 can be easily adapted to the real case, as follows, yielding a new variant of Reznick's real Positivstellensatz [8,Theorem 3.12]. To estimate the extreme values of derivatives of polynomials we shall need the (real) Bernstein inequality, see Lemma 2.5; we would like to stress again that this inequality is the only analytical tool used in the proof, the rest being basic linear algebra.

Theorem 4.2 (Positivstellensatz, real case). Consider
Then for any n ≥ k such that we have ∥x∥^{2(n−k)} p_v(x, y) = ∫ dφ p_ṽ(φ, y) ⟨φ|x⟩^{2n} (20) with p_ṽ(φ, y) ≥ 0 for all φ ∈ R^d and y ∈ R^D. Therefore, ∥x∥^{2(n−k)} p_v(x, y) is a sum of squares. In the case k = 1, the bound (19) can be improved to Proof. The proof idea is identical to that of Theorem 3.1; however, some details and coefficient values are different. The first difference appears in equation (14), where the polynomial associated to the partial trace of v is related to the Laplacian of v by the following formula, proven in Lemma 4.1 In the equation above, p v is a homogeneous polynomial of total degree (2k, 1) in the real variables x 1 , . . . , x d and y 1 , . . . , y D , and ∆ R is the "real" Laplacian, acting on the first set of variables x i . Using the usual Bernstein inequality (Lemma 2.5), we get . Plugging this into the expression for pṽ, we obtain the following relation (this corresponds to (15) in the complex case) (22) Exactly as in the complex case we bound the quotients of consecutive q R 's by where the last inequality comes from setting t = k − 1 in the preceding expression (using monotonicity). Again, we employ an exponential function approximation to derive the final estimate. Here, our proof strategy diverges qualitatively from the one in [8,Theorem 3.12], yielding the small improvement discussed in Remark 4.4. We leave the computational details to the reader.
We prefer however the bound in the statement, which has a more compact form.
Remark 4.4. The result above is to be compared with Reznick's Positivstellensatz [8,Theorem 3.12], which, in our notation, gives the bound Although the two bounds have the same leading orders in d and k, our lower bound is smaller, since the constant in front of the leading term dk^2 is smaller: in the regime where m(v)/M (v) ≪ 1, the ratio between the two lower bounds is ln 2 < 1.
We would now like to discuss the bounds above in a concrete situation in order to get an idea about the optimality of the lower bounds. Example 4.5. Consider the celebrated Motzkin polynomial to which we add a positive multiple of the norm: Obviously, we have m(p ε ) = ε; using Lagrange multipliers, one can easily find M (p ε ) = ε + 4/27. In Figure 3, we compare three lower bounds on n. In the left panel, we plot the bounds from equations (19) and (24), observing that the latter performs better (here, d = k = 3). In the right panel, we compare the bound from (24) with the one obtained by working out the value of n directly from equation (22). Note that this latter bound is necessarily better, since it shortcuts the second part of the proof of Theorem 4.2. In Figure 4, we compare again the bound obtained by working out the value of n directly from equation (22) with the one obtained by asking that the coefficients of the polynomial p n,ε (x, y, z) := (x^2 + y^2 + z^2)^{n−3} p ε (x, y, z) (27) be non-negative. Indeed, if n, ε are such that (20) holds, then the [2p, 2q, 2r] coefficient of p n,ε reads (here, p + q + r = n) Hence, for each fixed n, we can find numerically the smallest constant ε n > 0 such that, for all ε ≥ ε n , all coefficients of p n,ε are non-negative. Note that all monomials of p n,ε are of the form c 2p,2q,2r x^{2p} y^{2q} z^{2r} for some non-negative p, q, r with p + q + r = n, see (27).
Figure 4: … (22). The lower bounds (red horizontal bars, one for each value of n) come from requiring that all the coefficients of the polynomial p n,ε are non-negative.

Application to exponential de Finetti theorems
We show in this section how the so-called exponential de Finetti theorem [25,26] follows from the analysis of the inversion of the Chiribella identity from Lemma 3.2. A similar derivation can be found in [10,Theorem 8]; our result improves on this by having explicit constants in front of the maps, and thus achieving better error terms. The main idea here is that we want to approximate marginals of symmetric states not by states which are exactly tensor powers of pure states (as in the usual de Finetti theorem), but with states from the larger set Such states are called (k, r, d)-almost product states, and they form a larger set than the class of product states. Considering such a larger set of target states allows for faster convergence in de Finetti-type results: one can go from linear to exponential convergence speed using this relaxation. De Finetti-type theorems have found many applications in (quantum) information theory, mainly to reduce the analysis of protocols where symmetry plays an important role to that of the much simpler i.i.d. protocols [25]. The main technical insight here is that almost product states obviously lie in the ranges of the maps tr * k−s→k • MP n→k−s , for all 0 ≤ s ≤ r. This leads to the idea that one has to truncate the sum expression of the partial trace operator not only to the first term (which is the case for the Positivstellensatz in Theorem 3.1), but to the r-th term.
Then, for any 0 ≤ r ≤ k, we have the following estimate in diamond norm where the error is bounded by and whereq with q(n, k, t) as in Lemma 3.2. In particular, the k-body marginal of a n-symmetric state is ε r away (in 1-norm) from a linear combination of projections on (k, r, d)-almost product states.
Proof. The second claim follows from the first one and the fact that the range of the quantum channels Clone k−s→k • MP n→k−s is precisely the set of density matrices supported on the span of W r , for all 0 ≤ s ≤ r. Indeed, this is a simple consequence of the form of the channels Clone k−s→k and MP n→k−s . Regarding the main inequality, the starting point is the inverse of the Chiribella identity (12) proven in Lemma 3.2: The inequality in the statement is obtained by bounding the diamond norm of the tail of the sum above using the triangle inequality and the fact that both Clone k−s→k and MP n→k−s are quantum channels: The claimed bound on ε r follows from the geometric sum formula and the bound on the coefficients, which we show next. Compute first, using (11), where we have used the fact that both functions are decreasing on [0, r]. The s = 0 term is bounded as follows: while the general term satisfies The total error is bounded by (we use δ < 1/3) concluding the proof.
Remark 5.2. Using the same type of exponential sum estimates as in the proof of Theorem 3.1, one obtains the following bound on the error ε r , unconditionally on the values of the parameters d and 1 ≤ k ≤ n: where Γ is the upper incomplete Gamma function We leave the details of the calculation to the reader.

A.1 The real case
Proof. We find that in the computational basis where we used Wick's (or Isserlis') formula [29] in the last step and we write (r, s) ∈ π for the transpositions τ i = (r, s) such that π = τ 1 • · · · • τ n . By our assumptions on theφ the above expression simplifies to |π⟩.
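Wick's (Isserlis') formula used in this proof can be illustrated with a short enumeration. The sketch below assumes that the entries of the Gaussian vector are independent real Gaussians with mean 0 and variance 1, so that the expectation of a product of entries equals the number of perfect pairings of the factors in which every pair carries the same index. The function names are illustrative and not taken from the paper.

```python
def pairings(items):
    """Yield all perfect pairings of a list (none if its length is odd)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_moment(indices):
    """E[prod_k g_{indices[k]}] for i.i.d. standard real Gaussians g_0, g_1, ..."""
    positions = list(range(len(indices)))
    return sum(
        1
        for pairing in pairings(positions)
        if all(indices[r] == indices[s] for r, s in pairing)
    )


def odd_double_factorial(n):
    """(2n - 1)!! = 1 * 3 * 5 * ... * (2n - 1)."""
    result = 1
    for j in range(1, 2 * n, 2):
        result *= j
    return result


if __name__ == "__main__":
    print(wick_moment([0, 0, 0, 0]))      # fourth moment of one Gaussian
    print(wick_moment([0, 0, 1, 1]))      # product of two independent second moments
    print(wick_moment([0] * 6), odd_double_factorial(3))
```

In particular, the 2n-th moment of a single standard Gaussian is the number of perfect pairings of 2n points, namely (2n − 1)!!.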
In the following let S d−1 R ⊂ R d denote the real unit sphere. Lemma A.3 (Real Spherical Hilbert identity - Linear algebra form). We have We shall prove the lemma in its polynomial form. Therefore, note that {|x⟩ ⊗n : x ∈ R d } spans ∨ n R d , and that the following lemma is equivalent to the previous one. To evaluate the expectation value in the last expression we shall use polar coordinates. For this we decompose |φ⟩ = r|v⟩ with radial part r = ⟨φ|φ⟩^{1/2} and spherical part v : Ω → S d−1 distributed uniformly on the unit sphere. Then it is easy to see that we get Since r is given as the length of the Gaussian vector |φ⟩ we find that Finally this gives

A.2 The complex case
Analogously to the real case, we can derive Hilbert identities from Gaussian integrals. This is well-known within the quantum information theory community (see [10]). For any permutation σ ∈ S n we denote by P σ ∈ U((C d ) ⊗n ) the unitary operator defined by P σ |i 1 i 2 . . . i n ⟩ = |i σ(1) i σ(2) . . . i σ(n) ⟩ on the computational basis states. The projector onto the symmetric subspace ∨ n C d ⊂ (C d ) ⊗n is given by Proof. We find that in the computational basis Denoting φ k = Z (0) k + iZ (1) k and expanding the previous expectation value we find where we used Wick's formula [29] in the second equality and we write (r, s) ∈ π for the transpositions τ i = (r, s) such that π = τ 1 • · · · • τ n . In the last step we used that real and imaginary parts of theφ i are independent (Gaussian) random variables of mean 0.
In the following let S d−1 ⊆ C d denote the complex unit sphere.
Lemma A.6 (Complex Spherical Hilbert identity - Linear algebra form). We have Again we shall prove the lemma in its polynomial form. Therefore, note that {|x⟩ ⊗n : x ∈ C d } spans ∨ n C d , and that the following lemma is equivalent to the previous one. Proof. Using Lemma A.5 we find that To evaluate the expectation value in the last expression we use again polar coordinates. For this we decompose |φ⟩ = r|v⟩ with radial part r = ⟨φ|φ⟩^{1/2} and spherical part v : Ω → S d−1 distributed uniformly on the (complex) unit sphere. Then it is easy to see that we get Since r is given as the length of the Gaussian vector |φ⟩ we find that Here the additional factor 1/2^n compared to the real case comes from the normalization of the Gaussian random variables appearing in the real and imaginary parts of the entries of |φ⟩. Finally this gives

B Simple complex spherical designs
In this appendix we present a simple method to construct complex spherical designs that is inspired by and closely related to a method by Hausdorff [30] for the real case. We do not claim that this method is new. In fact, similar methods can be found in the literature [31,18], but we have not found a truly elementary account in the complex case. Let us start with the definition of a complex spherical design (note that in the literature, the objects introduced below are sometimes called weighted complex spherical designs).
For our construction we need the well-known family of orthogonal Laguerre polynomials defined as , for any m ∈ N. The following theorem summarizes some well-known properties of these polynomials and we refer to [32] for more details.  for any d-tuple of complex numbers y 1 , . . . , y d ∈ C. Here γ i (k) denotes the kth entry of the vector |γ i ⟩ ∈ C d from Definition B.1 in the computational basis. Note that this identity is the complex analogue of a Hilbert identity. To find coefficients γ i (j) ∈ C and weights p i ∈ R + satisfying this identity we need the following lemma.
where we used the elementary integral where we used that Note that deg(Q l ) = 2m − 2 and that Q l (x) ≥ 0 for any x ∈ R. Using (31) it follows that As (L ′ m (β l )) 2 > 0 (since L m has no degenerate zeros) it follows that w l ≥ 0 for any l ∈ {1, . . . , m}.
Using the previous lemma we can explicitly construct a complex spherical design: Proof. Consider the weights w_j ≥ 0 and complex numbers α_j ∈ C for any j ∈ {1, . . . , (m + 1)^2} constructed in Lemma B.3 such that ∑_{j=1}^{m^2} w_j α_j^k ᾱ_j^l = k! if k = l, and 0 otherwise.
The previous computation verifies (29) for the vectors and weights constructed previously. Therefore, (28) holds.
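The moment condition used in this construction can be illustrated numerically in the smallest non-trivial case. The sketch below is an illustration under stated assumptions rather than the construction of Lemma B.3 itself: it takes the two zeros β = 2 ∓ √2 of the Laguerre polynomial L_2(x) = 1 − 2x + x²/2 together with their Gauss-Laguerre quadrature weights, places points of modulus √β at equally spaced phases, and checks that the weighted sums of α^k ᾱ^l equal k! for k = l and vanish otherwise, for all 0 ≤ k, l ≤ 3. The choice of four phases and all function names are assumptions made for this example.

```python
import cmath
import math


def laguerre_points(num_phases: int = 4):
    """Weighted points from the zeros of L_2 combined with equally spaced phases."""
    root2 = math.sqrt(2.0)
    radii_sq = [2.0 - root2, 2.0 + root2]                  # zeros of L_2
    gauss_w = [(2.0 + root2) / 4.0, (2.0 - root2) / 4.0]   # Gauss-Laguerre weights
    points = []
    for beta, w in zip(radii_sq, gauss_w):
        for j in range(num_phases):
            alpha = math.sqrt(beta) * cmath.exp(2j * math.pi * j / num_phases)
            points.append((alpha, w / num_phases))
    return points


def moment(points, k: int, l: int) -> complex:
    """Weighted sum of alpha^k * conj(alpha)^l over the point set."""
    return sum(w * alpha ** k * alpha.conjugate() ** l for alpha, w in points)


def max_moment_error(points, max_deg: int = 3) -> float:
    """Largest deviation from k! * delta_{kl} over 0 <= k, l <= max_deg."""
    worst = 0.0
    for k in range(max_deg + 1):
        for l in range(max_deg + 1):
            target = math.factorial(k) if k == l else 0
            worst = max(worst, abs(moment(points, k, l) - target))
    return worst


if __name__ == "__main__":
    pts = laguerre_points()
    print(len(pts), max_moment_error(pts))
```

The diagonal moments are exact because a two-node Gauss-Laguerre rule integrates x^k e^{−x} exactly for k ≤ 3, and the off-diagonal moments vanish because the phase sums cancel whenever 0 < |k − l| < 4.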