Continuous majorization in quantum phase space

We explore the role of majorization theory in quantum phase space. To this end, we restrict ourselves to quantum states with positive Wigner functions and show that the continuous version of majorization theory provides an elegant and very natural approach to exploring the information-theoretic properties of Wigner functions in phase space. After identifying all Gaussian pure states as equivalent in the precise sense of continuous majorization, which can be understood in light of Hudson's theorem, we conjecture a fundamental majorization relation: any positive Wigner function is majorized by the Wigner function of a Gaussian pure state (in particular, the bosonic vacuum state or ground state of the harmonic oscillator). As a consequence, any Schur-concave function of the Wigner function is lower bounded by the value it takes for the vacuum state. This implies in turn that the Wigner entropy is lower bounded by its value for the vacuum state, while the converse is notably not true. Our main result is then to prove this fundamental majorization relation for a relevant subset of Wigner-positive quantum states which are mixtures of the three lowest eigenstates of the harmonic oscillator. Beyond that, the conjecture is also supported by numerical evidence. We conclude by discussing some implications of this conjecture in the context of entropic uncertainty relations in phase space.


Introduction
Majorization is an elegant and powerful algebraic theory that provides a means for comparing probability distributions in terms of disorder or randomness [1,2]. It has been extensively employed during the last century in various fields such as mathematics [3], economics [4], or information theory [5], where it can be used to derive inequalities for a variety of information-theoretic quantities such as entropies; see, e.g., [6,7] for recent works. Although its deep connections with unitary matrices had long been understood [8], it is only more recently that majorization relations have been found to arise in quantum physics [9]. There, majorization finds application in the study of entanglement transformations [9,10], in the discrimination of (distillable) entangled states [11,12], in the derivation of quantum uncertainty relations [13-15], or via so-called thermo-majorization in the context of quantum thermodynamics [16], to cite a few. In recent years, majorization theory has also proven especially useful in the framework of bosonic quantum systems, which are our interest here, serving as an instrumental tool for the investigation of entropic inequalities that are paramount in the computation of the optimal communication rates of quantum communication systems [17-21].
While majorization theory can be applied to both discrete and continuous probability distributions, the overwhelming majority of its applications in the existing literature concerns the former. In particular, while discrete majorization (the branch of the theory of majorization dealing with discrete probability distributions) has been the subject of numerous works in quantum information theory, notably including those dealing with bosonic quantum systems, continuous majorization (the branch concerned with continuous probability densities) has never been applied in this context. Nevertheless, the phase-space formulation of quantum mechanics, which leads to a characterization of quantum states in terms of continuous distributions [22], hints at the great potential of continuous majorization in such a framework. This is especially highlighted by the strong connection between majorization and entropies (both classical and quantum) coupled with the great interest directed towards continuous entropic uncertainty relations in recent years [23,24].
In the present paper, we argue and demonstrate that the theory of continuous majorization is highly relevant in the context of quantum physics. It is ideally suited to comparing quantum phase-space distributions in terms of intrinsic disorder. It then also provides a natural means to address entropic properties in phase space, hence suggesting a fresh perspective on entropic uncertainty relations.
Consider a quantum system characterized by the usual pair of canonically-conjugate continuous variables $x$ and $p$. A pure state of the system is described by the complex wave functions $\psi(x)$ or $\phi(p)$, which are related by a Fourier transform. The operators $\hat{x}$ and $\hat{p}$ associated with the observables $x$ and $p$ obey the canonical commutation relation $[\hat{x},\hat{p}] = i\hbar$. The observables $x$ and $p$ may designate the position and momentum variables, or for instance two canonically-conjugate quadratures of the electromagnetic field (we will adopt this quantum optics language in the rest of this paper, although our results and conclusions hold true in general for any canonical pair). The phase-space representation of a pure state is embodied by its Wigner function, which is a two-dimensional continuous function defined as [22]
$$W(x,p) = \frac{1}{\pi\hbar} \int e^{2ipy/\hbar}\, \psi^*(x+y)\, \psi(x-y)\, dy. \tag{1}$$
The Wigner function can in many ways be thought of as a joint probability distribution of the variables $x$ and $p$. For instance, the probability densities of $x$ and $p$, respectively $\rho_x(x) = |\psi(x)|^2$ and $\rho_p(p) = |\phi(p)|^2$, can be retrieved from the marginal distributions of the Wigner function as $\rho_x(x) = \int W(x,p)\, dp$ and $\rho_p(p) = \int W(x,p)\, dx$. However, the non-commutativity of the operators $\hat{x}$ and $\hat{p}$ has deep implications regarding the existence of such a joint distribution, as quantum mechanics forbids simultaneously fixing the values of two non-commuting variables. In the phase-space description of quantum states, this translates into the fact that Wigner functions may take negative values in general, making them so-called quasi-probability distributions.
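As a numerical illustration of Eq. (1) and of the marginal property, the following minimal sketch evaluates the Wigner function of a wave function by direct quadrature. It assumes $\hbar = 1$ and a plain midpoint rule; the function names (`wigner`, `marginal_x`, `psi_vacuum`, `psi_one_photon`) are illustrative choices and do not come from the paper.

```python
import cmath
import math


def psi_vacuum(x):
    """Harmonic-oscillator ground state (hbar = 1), a Gaussian pure state."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)


def psi_one_photon(x):
    """First excited state (hbar = 1), orthogonal to the ground state."""
    return math.sqrt(2) * x * psi_vacuum(x)


def wigner(psi, x, p, y_max=8.0, steps=2000):
    """Midpoint-rule estimate of Eq. (1) with hbar = 1:
    W(x, p) = (1/pi) * integral of exp(2ipy) conj(psi(x+y)) psi(x-y) dy."""
    dy = 2 * y_max / steps
    total = 0j
    for k in range(steps):
        y = -y_max + (k + 0.5) * dy
        total += cmath.exp(2j * p * y) * complex(psi(x + y)).conjugate() * psi(x - y)
    return (total * dy / math.pi).real


def marginal_x(psi, x, p_max=6.0, steps=240):
    """Integrate W(x, p) over p; this should reproduce |psi(x)|^2."""
    dp = 2 * p_max / steps
    return sum(wigner(psi, x, -p_max + (k + 0.5) * dp, steps=400) * dp
               for k in range(steps))
```

Evaluating `wigner(psi_one_photon, 0.0, 0.0)` gives a negative number, which illustrates on a simple example why Wigner functions are only quasi-probability distributions, whereas `wigner(psi_vacuum, x, p)` is positive for every sampled point.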
The necessary existence of negative Wigner functions in quantum mechanics can also be justified with a simple argument relying on the overlap formula, which reads [22]
$$|\langle \psi_1 | \psi_2 \rangle|^2 = 2\pi\hbar \iint W_1(x,p)\, W_2(x,p)\, dx\, dp, \tag{2}$$
where $W_1$ and $W_2$ are the Wigner functions associated respectively with $\psi_1$ and $\psi_2$. Having in mind that there exist pure states with Wigner functions positive everywhere, namely Gaussian states (i.e., states whose Wigner function is a Gaussian distribution [25]), it follows that any pure state that is orthogonal to a Gaussian pure state must have a Wigner function that takes negative values. In what follows, we will qualify a state with a positive Wigner function as Wigner-positive and refer to the corresponding property as Wigner-positivity [26]. Note that Wigner-positivity is a particular case of η-positivity for η = 0 [27,28]. We will restrict our analysis in this paper to Wigner-positive states and explore whether majorization theory can be applied to their Wigner functions. In fact, the set of pure Wigner-positive states is well understood: Hudson's theorem [29,30] establishes that a pure state is Wigner-positive if and only if it is a Gaussian pure state. Furthermore, all Gaussian pure states happen to be related to each other through Gaussian unitaries in state space, which are associated with symplectic transformations in phase space [25]. For a single mode, these affine transformations are simply combinations of displacements, rotations and squeezings of the Wigner function. Crucially for what follows, these symplectic transformations preserve areas in phase space. The concept of area in phase space can be related to the notion of level-function [2], which happens to be at the core of the theory of continuous majorization. For a given (positive) Wigner function $W(x,p)$, the level-function associates to a value $t$ the area of the region in phase space where $W(x,p)$ is greater than $t$.
In what follows, any two (positive) Wigner functions that have the same level-function will be called level-equivalent. Area-preserving transformations are transformations that keep the level-function unchanged, hence the Wigner functions of all Gaussian pure states are level-equivalent. This leads to the following property, which can be viewed as a corollary of Hudson's theorem: Any pure Wigner-positive state has a Wigner function that is level-equivalent to that of a Gaussian pure state.
Note that not every function that is level-equivalent to the Wigner function of a Gaussian pure state describes a physical Wigner function (that is, one corresponding to a positive semidefinite density operator); but if such a function is physical, then we know that it corresponds to a Wigner-positive pure state according to Hudson's theorem.
This raises the question of whether the above approach building on level-functions can be generalized to mixed quantum states. A mixed state $\hat{\rho} = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|$ is a statistical mixture of pure states and its Wigner function $W = \sum_i p_i W_i$ is the corresponding mixture of their Wigner functions. It is straightforward to see that there exist non-Gaussian mixed states characterized by positive Wigner functions, a simple way to construct such a state being to form a non-Gaussian mixture of Gaussian pure states. A natural question is then whether one can characterize the full set of Wigner-positive mixed states. Such a problem happens to be difficult and remains only very partially solved today [28,31-33], in the sense that no satisfying extension of Hudson's theorem to mixed states has been stated in the literature.
In the present work, we apply the theory of majorization to positive Wigner functions in order to tackle the aforementioned problem. While we do not provide a complete description of the full set of Wigner-positive states, we offer a way to compare such states in terms of continuous majorization. Continuous majorization generalizes the concept of level-equivalent functions by allowing one to compare functions that are not level-equivalent. Intuitively, for two functions $f$ and $g$, the relation "$f$ majorizes $g$" (in the continuous sense) means that $f$ is narrower (i.e., more ordered) than $g$, while two functions that are level-equivalent majorize each other. This leads us to conjecture the following generalized statement: Any mixed Wigner-positive state has a Wigner function majorized by that of a Gaussian pure state.
While the above statement may seem natural as one would expect a mixed state to be more disordered than a pure state, it is important to stress that the notion of disorder in the statement refers to distributions in phase space. It cannot be related in any simple way to disorder in state space, which concerns density operators. As a consequence, proving this conjecture is far from straightforward.
The objective of this paper is to develop the theory of continuous majorization in the framework of quantum physics and then prove the above statement for some carefully chosen positive Wigner functions. In Section 2, we begin by introducing the theory of continuous majorization, as this is not a standard topic for physicists. This then provides us with the proper mathematical tools to apply continuous majorization to quantum phase space in Section 3, before giving a proof of our conjecture restricted to a subset of Wigner-positive states. Finally, we discuss our results in connection with the notion of Wigner entropy [26] and apply them in the context of entropic uncertainty relations in Section 4.

Continuous majorization
We are now going to lay out the basics of majorization theory in the context of continuous probability distributions, giving rigorous definitions for the concepts mentioned in the introduction. We consider n-dimensional continuous non-negative and integrable distributions on $\mathbb{R}^n$ or $\mathbb{R}_+$, depending on the situation. The distribution $f : A \to \mathbb{R}_+$, where $A$ can be $\mathbb{R}^n$ (for any positive integer $n$) or $\mathbb{R}_+$, is a genuine probability distribution if
$$\int_A f(\mathbf{r})\, d\mathbf{r} = 1. \tag{3}$$
Hereafter, we omit the bounds in integrals as long as the integration is performed on the whole domain $A$, as in the normalization identity in Eq. (3). Note that the distributions are normalized to 1, but the definitions and properties we show in what follows still hold if the normalization constant is different (provided it is the same for all functions we consider). Furthermore, while we consider functions $f$ defined on an infinite domain $A$ for our purposes here, in which case they have to be non-negative everywhere, note that majorization can also be applied to partially negative functions if they are defined on a finite domain [2]. As we mentioned in the introduction, a core element of the theory of majorization is the level-function, which we now rigorously define.

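Definition 1. The level-function $m_f$ of a distribution $f : A \to \mathbb{R}_+$ is defined, for $t \geq 0$, as
$$m_f(t) = \nu\left(\{\mathbf{r} \in A : f(\mathbf{r}) > t\}\right), \tag{4}$$
where $\nu$ stands for the Lebesgue measure.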
Basically, the level-function $m_f(t)$ is the size of the part of the domain $A$ on which $f$ takes values higher than $t$. Figure 1 provides an example of the level-function of a two-dimensional distribution. As we are going to show, all information needed to compare two distributions using a majorization relation (from here onwards, we write majorization for continuous majorization) is enclosed in their level-functions. When $f$ and $g$ are level-equivalent, meaning they have the same level-functions $m_f(t) = m_g(t)$ for all $t$, we use the notation $f \equiv g$. While level-equivalent distributions can have very different shapes, they nonetheless are comparable in various ways. In order to see this, consider any function $\phi$ defined on $\mathbb{R}_+$ and obeying some general conditions (see [34]), which can be turned into a functional $\varphi$ acting on the probability distribution $f$ as
$$\varphi(f) = \int \phi\left(f(\mathbf{r})\right) d\mathbf{r}. \tag{5}$$
Since the integral is carried out over the whole domain, $\varphi$ is invariant when parts of the domain are "swapped". It then only depends on the "size" of the domain associated to each value taken by the distribution $f$. This is precisely the information carried by the level-function, which simply corresponds for each value of $t$ to the choice $\phi(x) = \Theta(x - t)$, where $\Theta$ is the Heaviside step function. Another example is the choice $\phi(x) = -x \ln x$, which gives
$$\varphi(f) = -\int f(\mathbf{r}) \ln f(\mathbf{r})\, d\mathbf{r}, \tag{6}$$
which is nothing else but Shannon's differential entropy of the probability distribution $f$. Therefore, two level-equivalent distributions have the same Shannon entropy; they are comparable in their randomness content. For a given level-function $m_f$, one can build an infinite number of level-equivalent distributions whose level-functions are given by $m_f$. Among the set of distributions sharing the same level-function, the so-called decreasing rearrangement plays a prominent role in majorization theory. It is defined as follows [2].

Definition 2. The decreasing rearrangement f ↓ of a function f defined on a domain A is the unique function defined on the same domain A that is radial-decreasing and level-equivalent to f .
An n-dimensional distribution $f$ is radial if it can be written as a function that only depends on the norm of its argument, i.e., $f(\mathbf{r}) = f_R(\|\mathbf{r}\|)$. It is radial-decreasing if, in addition, $f_R(r_1) \geq f_R(r_2)$ when $r_1 < r_2$. Note here that the term decreasing is used instead of non-increasing when referring to such rearrangements in the literature. The decreasing rearrangement $f^\downarrow$ of a distribution $f$ takes its maximum value at the origin and decreases monotonically away from it. It only depends on the level-function $m_f$ of $f$. Moreover, since it is unique, $f^\downarrow$ is the same for any distribution that is level-equivalent to $f$. Examples of decreasing rearrangements are pictured in the lower part of Fig. 2.
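The notions of level-function, level-equivalence and decreasing rearrangement can be checked numerically on a simple one-dimensional example. The sketch below is only an illustration under stated assumptions: it samples two densities on a uniform grid of the half-line $\mathbb{R}_+$ (an exponential density and its copy shifted by one unit, both chosen here for convenience), estimates the level-function by counting grid cells, and obtains the decreasing rearrangement by sorting the samples. All names are hypothetical.

```python
import math

DX = 0.002                                    # grid spacing on the half-line R+
XS = [(k + 0.5) * DX for k in range(15000)]   # cell midpoints covering [0, 30]


def f(x):
    """Exponential density on R+ (already decreasing)."""
    return math.exp(-x)


def g(x):
    """Same density shifted right by 1, hence level-equivalent to f."""
    return math.exp(-(x - 1.0)) if x >= 1.0 else 0.0


def level_function(values, t, dx=DX):
    """Lebesgue measure of the region where the sampled function exceeds t."""
    return dx * sum(1 for v in values if v > t)


def decreasing_rearrangement(values):
    """On R+, sorting the samples in decreasing order gives the rearrangement."""
    return sorted(values, reverse=True)


def shannon_entropy(values, dx=DX):
    """Riemann-sum estimate of -integral of f ln f."""
    return -dx * sum(v * math.log(v) for v in values if v > 0.0)


F_VALS = [f(x) for x in XS]
G_VALS = [g(x) for x in XS]
```

With these samples, `level_function(F_VALS, t)` and `level_function(G_VALS, t)` agree for any threshold, the two entropies returned by `shannon_entropy` coincide, and `decreasing_rearrangement(G_VALS)` reproduces `F_VALS` up to the zeros pushed to the end of the list.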
We are now in position to define the majorization relation between two distributions.

Definition 3 (Continuous majorization [35,36]). Let $f$ and $g$ be two probability distributions defined on the same domain $A$. The distribution $f$ majorizes the distribution $g$, written $f \succ g$, if and only if
$$\int_{\|\mathbf{r}\| \leq s} f^\downarrow(\mathbf{r})\, d\mathbf{r} \;\geq\; \int_{\|\mathbf{r}\| \leq s} g^\downarrow(\mathbf{r})\, d\mathbf{r}, \qquad \forall s \geq 0, \tag{7}$$
with equality when $s$ tends to infinity.

The equality when $s$ tends to infinity imposes that $f$ and $g$ be normalized to the same value. When the inequalities in Eq. (7) are reversed, we say that $f$ is majorized by $g$, written $f \prec g$. If both $f \succ g$ and $f \prec g$ hold, then $f$ and $g$ are level-equivalent and the inequalities in Eq. (7) become equalities. If neither $f \succ g$ nor $f \prec g$ holds, we say that $f$ and $g$ are incomparable.
It is useful in practice to name the objects appearing on both sides of the inequalities in Eq. (7), mainly for ease of notation. The cumulative integral of a distribution $f$ is the function
$$S_f(s) = \int_{\|\mathbf{r}\| \leq s} f^\downarrow(\mathbf{r})\, d\mathbf{r}. \tag{8}$$
It can be understood as the highest value that can be obtained by integrating any function that is level-equivalent to $f$ over a ball of radius $s$. Equivalently, it is the highest value of the integral of $f$ over any part of the domain (possibly made of noncontiguous regions) that has a volume equal to that of a ball of radius $s$. The majorization relation $f \succ g$ then holds if and only if $S_f(s) \geq S_g(s)$ for all $s \geq 0$, with equality when $s$ tends to infinity.
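On a uniform grid, the cumulative integral can be approximated by sorting the samples in decreasing order and accumulating them, which gives a direct numerical test of Definition 3. The following sketch is an illustration only: the helper names `cumulative_integrals` and `majorizes` and the two Gaussian test densities are assumptions made for the example, and the comparison is carried out up to a numerical tolerance.

```python
import math


def gaussian(x, sigma):
    """Centered normal density with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def cumulative_integrals(values, cell):
    """Discrete analogue of S_f: running integral of the decreasing rearrangement,
    i.e. the largest mass that can be captured in a region of growing volume."""
    total, out = 0.0, []
    for v in sorted(values, reverse=True):
        total += v * cell
        out.append(total)
    return out


def majorizes(f_vals, g_vals, cell, tol=1e-9):
    """True if S_f >= S_g at every sampled volume (up to tol) and the totals agree."""
    s_f = cumulative_integrals(f_vals, cell)
    s_g = cumulative_integrals(g_vals, cell)
    same_norm = abs(s_f[-1] - s_g[-1]) <= tol
    return same_norm and all(a >= b - tol for a, b in zip(s_f, s_g))


DX = 0.01
XS = [-12.0 + (k + 0.5) * DX for k in range(2400)]   # cell midpoints on [-12, 12]
NARROW = [gaussian(x, 0.5) for x in XS]
WIDE = [gaussian(x, 1.0) for x in XS]
```

Here `majorizes(NARROW, WIDE, DX)` holds while `majorizes(WIDE, NARROW, DX)` does not, in line with the intuition that the narrower distribution is the more ordered one.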
Since Definition 3 refers to the decreasing rearrangements $f^\downarrow$ and $g^\downarrow$, which are defined based on the level-functions $m_f$ and $m_g$, it is clear that the majorization relation $f \succ g$ solely depends on the level-functions of $f$ and $g$. Unlike in the discrete case, the decreasing rearrangement of a function is often hard to compute, which makes Definition 3 difficult to use. There are, however, equivalent statements that are less cumbersome. We introduce two such statements hereafter as Propositions 1 and 2. We point the interested reader to References [2,34] for proofs and details of these propositions.

Figure 2: In the first row, $f_1$ and $f_2$ are two examples of one-dimensional probability distributions respectively defined on $\mathbb{R}$ and $\mathbb{R}_+$. They both have the same level-functions $m_{f_1}$ and $m_{f_2}$, which are pictured in the second row. They, however, have different decreasing rearrangements $f_1^\downarrow$ and $f_2^\downarrow$, as shown on the last row. The red segments represent the domain that corresponds to a given value of $t$ and therefore have the same length in the different plots. Note that for a distribution defined on $\mathbb{R}_+$ such as $f_2$, the decreasing rearrangement $f_2^\downarrow$ is simply the inverse function of the level-function $m_{f_2}$.

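Proposition 1. Let $f$ and $g$ be two probability distributions defined on the same domain $A$. We have that $f \succ g$ if and only if
$$\int \left[f(\mathbf{r}) - t\right]_+ d\mathbf{r} \;\geq\; \int \left[g(\mathbf{r}) - t\right]_+ d\mathbf{r}, \qquad \forall t \geq 0, \tag{9}$$
with equality when $t = 0$.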
The notation $[\cdot]_+$ is such that $[z]_+ = z$ if $z \geq 0$ and $[z]_+ = 0$ otherwise. Note that the function $x \mapsto [x - t]_+$ acting on $f(\mathbf{r})$ and $g(\mathbf{r})$ in Eq. (9) can be viewed as a special case of the function $\phi(x)$ used in Eq. (5). Hence, Proposition 1 is again solely characterized by the level-functions of $f$ and $g$.
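The criterion of Proposition 1 avoids rearrangements altogether and is easy to evaluate on sampled distributions. The sketch below is a minimal illustration, assuming two uniform densities on $\mathbb{R}_+$ chosen for convenience (height 1 on $[0,1]$ and height 1/2 on $[0,2]$); the function names are hypothetical, and testing a finite list of thresholds can only refute a majorization relation or make it plausible, not prove it.

```python
def positive_part_integral(values, t, cell):
    """Riemann-sum estimate of the integral of [f - t]_+ over the grid."""
    return cell * sum(v - t for v in values if v > t)


def majorizes_prop1(f_vals, g_vals, cell, thresholds, tol=1e-9):
    """Check the inequalities of Proposition 1 on a finite list of thresholds,
    together with the equality at t = 0 (equal normalization)."""
    equal_mass = abs(positive_part_integral(f_vals, 0.0, cell)
                     - positive_part_integral(g_vals, 0.0, cell)) <= tol
    return equal_mass and all(
        positive_part_integral(f_vals, t, cell)
        >= positive_part_integral(g_vals, t, cell) - tol
        for t in thresholds)


DX = 0.001
XS = [(k + 0.5) * DX for k in range(3000)]        # cell midpoints on [0, 3]
BOX1 = [1.0 if x < 1.0 else 0.0 for x in XS]      # uniform density on [0, 1]
BOX2 = [0.5 if x < 2.0 else 0.0 for x in XS]      # uniform density on [0, 2]
THRESHOLDS = [0.05 * k for k in range(21)]        # t = 0, 0.05, ..., 1
```

For these two densities the integrals are $[1 - t]_+$ and $[1 - 2t]_+$, so the narrow box majorizes the wide one, which `majorizes_prop1(BOX1, BOX2, DX, THRESHOLDS)` confirms.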
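Proposition 1 is useful to prove a property that we will later need, namely that if $f \succ g_1$ and $f \succ g_2$, then $f$ majorizes any convex combination of $g_1$ and $g_2$, i.e.,
$$f \succ \lambda g_1 + (1 - \lambda)\, g_2, \tag{10}$$
with $0 \leq \lambda \leq 1$. Using the fact that the function $x \mapsto [x - t]_+$ is convex, we have $[\lambda g_1(\mathbf{r}) + (1-\lambda) g_2(\mathbf{r}) - t]_+ \leq \lambda\, [g_1(\mathbf{r}) - t]_+ + (1-\lambda)\, [g_2(\mathbf{r}) - t]_+$. Equation (10) then follows from integrating this inequality over $\mathbf{r}$ and using Proposition 1.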

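Proposition 2. Consider two probability distributions $f$ and $g$ defined on the same domain $A$. We have that $f \succ g$ if and only if
$$\int \phi\left(f(\mathbf{r})\right) d\mathbf{r} \;\geq\; \int \phi\left(g(\mathbf{r})\right) d\mathbf{r} \tag{11}$$
holds for all continuous convex functions $\phi : \mathbb{R}_+ \to \mathbb{R}$ for which the integrals exist on both sides (see [2], p. 607).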
Proposition 2 is particularly useful as Eq. (11) implies inequalities on quantities that are functionals of f and g written in the form of Eq. (5).
Thus, for any such functional $\varphi$ built from a convex function $\phi$, the majorization relation $f \succ g$ implies that $\varphi(f) \geq \varphi(g)$. One such functional is (up to a sign) Shannon's differential entropy, Eq. (6), which is traditionally denoted $h(f)$. As a consequence, if $f \succ g$, then $h(f) \leq h(g)$, which is consistent with the idea that $f$ is more ordered than $g$, so it has a lower entropy.
As a matter of fact, the inequalities (11) hold for an even more general set of functionals beyond the special form (5), namely the so-called Schur-convex functions (we call them functions here, instead of functionals). This is actually the definition of the set of Schur-convex functions [37], which have been defined for discrete probability distributions through a discrete majorization relation but can be equivalently defined in the continuous case as follows [2]:

Definition 4 (Schur-convex functions). A real-valued function $\varphi$ defined on the set of probability distributions on $A$ is called Schur-convex if, for any pair of probability distributions $f$ and $g$ defined on $A$, we have
$$f \succ g \;\Rightarrow\; \varphi(f) \geq \varphi(g). \tag{12}$$
The function $\varphi$ is said to be Schur-concave if the opposite inequality holds.
It is easy to see that Schur-convex functions take the same value for level-equivalent distributions. Furthermore, any real-valued function $\varphi$ defined on the set of probability distributions on $A$ that is convex and takes the same values for level-equivalent distributions in $A$ is necessarily Schur-convex [38]. For instance, any functional $\varphi$ of the form of (5) is Schur-convex, provided that $\phi$ is convex. The opposite, however, is not necessarily true. This is illustrated, for instance, with the Rényi entropy [39],
$$h_\alpha(f) = \frac{1}{1-\alpha} \ln \int f(\mathbf{r})^\alpha\, d\mathbf{r}, \tag{13}$$
where the parameter $\alpha$ can be chosen such that $0 < \alpha < 1$ or $\alpha > 1$. Indeed, Rényi entropies can be proven to be concave only for $\alpha \in [0, 1)$, while they are always Schur-concave [40]. In the limit where $\alpha \to 1$, the Rényi entropy reduces to Shannon's differential entropy, that is, $\lim_{\alpha \to 1} h_\alpha(f) = h(f)$. Regardless of the value of $\alpha$, the Rényi entropies can thus always be associated to some measure of disorder, in the sense that if $f \succ g$, then $h_\alpha(f) \leq h_\alpha(g)$, as a consequence of the Schur-concavity of $h_\alpha$. More generally, all Schur-concave functions $\varphi$ can be understood in view of Definition 4 as generalized measures of disorder that are consistent with majorization theory: if $f$ is more ordered than $g$ as expressed by $f \succ g$, then it has a lower measure of disorder, $\varphi(f) \leq \varphi(g)$.
To be complete, we mention another characterization of continuous majorization that is based on semidoubly stochastic operators, in analogy with the characterization of discrete majorization between infinite probability vectors in terms of semidoubly stochastic matrices.

Proposition 3. Let $f$ and $g$ be two probability distributions defined on the same domain $A$. If
$$g(\mathbf{r}) = \int B(\mathbf{r}, \mathbf{s})\, f(\mathbf{s})\, d\mathbf{s}, \tag{14}$$
where $B : A \times A \to \mathbb{R}$ is the kernel of some semidoubly stochastic operator, i.e., $B(\mathbf{r}, \mathbf{s}) \geq 0$, $\forall \mathbf{r}, \mathbf{s}$, $\int B(\mathbf{r}, \mathbf{s})\, d\mathbf{r} = 1$, $\forall \mathbf{s}$, and $\int B(\mathbf{r}, \mathbf{s})\, d\mathbf{s} \leq 1$, $\forall \mathbf{r}$, then $f \succ g$.
In contrast with Propositions 1 and 2, Proposition 3 only provides a sufficient condition for majorization. Note that, for probability distributions defined over a finite-size domain $A$, condition (14) has been proven to be necessary as well [41]. However, to our knowledge, there is no proof of the equivalence between the existence of a semidoubly stochastic operator and a majorization relation for distributions defined over an infinite-size continuous domain (although it is plausible that it holds when the domain is $\mathbb{R}^n$). This is a current topic of research in mathematics, see Refs. [42,43].
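A convolution with a probability density is a simple instance of the operators appearing in Proposition 3, since a translation-invariant kernel $B(\mathbf{r}, \mathbf{s}) = K(\mathbf{r} - \mathbf{s})$ integrates to one in both arguments. The sketch below illustrates this on a grid; the uniform input density, the Gaussian kernel and its width are all assumptions made for the example, and the names are hypothetical. It also evaluates Rényi entropies of the input and output, which should not decrease under the blurring.

```python
import math

DX = 0.05
XS = [-10.0 + (k + 0.5) * DX for k in range(400)]   # cell midpoints on [-10, 10]


def kernel(r, s, width=0.5):
    """Translation-invariant Gaussian kernel B(r, s); it integrates to 1 over r
    for fixed s and over s for fixed r."""
    return math.exp(-(r - s) ** 2 / (2 * width ** 2)) / (width * math.sqrt(2 * math.pi))


def apply_kernel(f_vals):
    """g(r) = integral of B(r, s) f(s) ds, discretized on the grid."""
    return [DX * sum(kernel(r, s) * fs for s, fs in zip(XS, f_vals) if fs > 0.0)
            for r in XS]


def cumulative_integrals(values):
    """Running integral of the decreasing rearrangement of the samples."""
    total, out = 0.0, []
    for v in sorted(values, reverse=True):
        total += v * DX
        out.append(total)
    return out


def renyi_entropy(values, alpha):
    """h_alpha = ln(integral of f^alpha) / (1 - alpha), for alpha != 1."""
    return math.log(DX * sum(v ** alpha for v in values)) / (1 - alpha)


# Input density: uniform on [-1, 1]; output: its blurred version.
F_VALS = [0.5 if abs(x) < 1.0 else 0.0 for x in XS]
G_VALS = apply_kernel(F_VALS)
```

Comparing `cumulative_integrals(F_VALS)` with `cumulative_integrals(G_VALS)` entry by entry shows that the input majorizes the blurred output, and `renyi_entropy` is larger for `G_VALS` than for `F_VALS` at the sampled values of $\alpha$.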
In order to prove our main result in Sec. 3, we will actually use another sufficient condition for majorization (see Lemma 2), which gives a very clear interpretation of the meaning of $f \succ g$.

Phase-space majorization
We are now in position to apply majorization theory in quantum phase space and formulate our main conjecture in proper mathematical terms. We consider a single-mode bosonic system modelled by a quantum harmonic oscillator, e.g., a mode of the electromagnetic field. For the sake of simplicity, we take the convention $\hbar = 1$, so the observables $\hat{x}$ and $\hat{p}$ obey the commutation relation $[\hat{x},\hat{p}] = i$. By defining the annihilation operator $\hat{a} = (\hat{x} + i\hat{p})/\sqrt{2}$, the creation operator $\hat{a}^\dagger = (\hat{x} - i\hat{p})/\sqrt{2}$, and the number operator $\hat{n} = \hat{a}^\dagger \hat{a}$, the Hamiltonian of the system takes the simple form $\hat{H} = \hat{n} + 1/2$. Its eigenstates are the Fock states, denoted as $|n\rangle$ for $n = 0, 1, \ldots$, and associated with the wave functions [22]
$$\psi_n(x) = \pi^{-1/4}\, 2^{-n/2}\, (n!)^{-1/2}\, H_n(x)\, e^{-x^2/2}, \tag{15}$$
where $H_n$ are Hermite polynomials. Any state $\hat{\rho}$ of the harmonic oscillator can be associated with its Wigner function [22]
$$W(x,p) = \frac{1}{\pi} \int e^{2ipy}\, \langle x - y|\hat{\rho}|x + y\rangle\, dy, \tag{16}$$
which reduces to Eq. (1) if $\hat{\rho}$ denotes a pure state.
In particular, the Fock states $|n\rangle$ are associated with the Wigner functions [22]
$$W_n(x,p) = \frac{(-1)^n}{\pi}\, e^{-x^2 - p^2}\, L_n\!\left(2x^2 + 2p^2\right), \tag{17}$$
where $L_n$ are Laguerre polynomials. Note that the functions $W_n(x,p)$ exhibit a rotational symmetry, making Fock states phase-invariant. In fact, any phase-invariant state can be expressed as a mixture of Fock states. As we shall see, the Fock state associated with $n = 0$ (i.e., the vacuum state) plays a key role with regard to majorization. It admits the Wigner function
$$W_0(x,p) = \frac{1}{\pi}\, e^{-x^2 - p^2}, \tag{18}$$
which we write in short as $W_0(r) = \exp(-r^2)/\pi$, using the non-negative parameter $r$ such that $r^2 = x^2 + p^2$ [note the slight abuse of notation, as $W_0$ is used both as a function of $r$ and of $(x,p)$ hereafter]. According to Hudson's theorem, it is the only pure state admitting a positive Wigner function (up to symplectic transformations).
From now on, let us restrict to Wigner-positive states and denote the set of Wigner functions that are positive everywhere as $\mathcal{W}_+$. Clearly, all Wigner functions in $\mathcal{W}_+$ are genuine probability distributions, so that a question that arises naturally is whether the majorization relation introduced in Section 2 has a meaning when applied to (positive) Wigner functions. A first clue that this may be the case follows from the Wigner entropy [26] of a Wigner-positive quantum state, which is the Shannon differential entropy of the Wigner function associated with the state, denoted as $h(W)$. In Ref. [26], it is conjectured (and proved in some special cases) that
$$h(W) \geq h(W_0) = \ln \pi + 1, \qquad \forall\, W \in \mathcal{W}_+. \tag{19}$$
Since $h(W)$ can be understood as a special case of a functional $\varphi$ of the Wigner function $W$, as in Eqs. (5) and (6), it is tempting to conjecture that Eq. (19) is a consequence of a fundamental majorization relation, following the lines of Section 2, where $f \succ g$ was shown to imply $h(f) \leq h(g)$. This is what we do now.
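The Fock-state Wigner functions and the vacuum value of the Wigner entropy can be checked with a few lines of code. The sketch below is an illustration with $\hbar = 1$: it assumes the standard radial form $W_n(r) = (-1)^n e^{-r^2} L_n(2r^2)/\pi$ and the usual three-term recurrence for Laguerre polynomials, and it integrates over the plane with a midpoint rule in the radial variable. The function names are hypothetical.

```python
import math


def laguerre(n, z):
    """Laguerre polynomial L_n(z) via the standard three-term recurrence
    (k + 1) L_{k+1} = (2k + 1 - z) L_k - k L_{k-1}."""
    prev, cur = 1.0, 1.0 - z
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 1 - z) * cur - k * prev) / (k + 1)
    return cur


def wigner_fock(n, r):
    """Radial Wigner function of the Fock state |n> with hbar = 1."""
    return (-1) ** n / math.pi * math.exp(-r * r) * laguerre(n, 2 * r * r)


def radial_integral(func, r_max=8.0, steps=20000):
    """Integrate a radial function over the plane: integral of func(r) 2 pi r dr."""
    dr = r_max / steps
    return sum(func((k + 0.5) * dr) * 2 * math.pi * (k + 0.5) * dr * dr
               for k in range(steps))


def wigner_entropy_vacuum():
    """Shannon differential entropy of W_0, to be compared with ln(pi) + 1."""
    def integrand(r):
        w = wigner_fock(0, r)
        return -w * math.log(w)
    return radial_integral(integrand)
```

The value returned by `wigner_entropy_vacuum()` can be compared with $\ln \pi + 1 \approx 2.1447$, while `wigner_fock(1, 0.0)` is negative, so the single-photon state lies outside the set of Wigner-positive states.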

Majorization conjecture
Let us denote as $\mathcal{W}_+^{\mathrm{pure}}$ the set of Wigner functions of pure Wigner-positive states, which is a subset of $\mathcal{W}_+$. As a consequence of Hudson's theorem, $\mathcal{W}_+^{\mathrm{pure}}$ only contains Gaussian pure states and actually contains all of them. Remarkably, the Wigner functions of all Gaussian pure states are level-equivalent since they are all related by symplectic transformations in phase space. Indeed, a symplectic transformation is an affine map on the vector $\mathbf{r} = (x, p)^T$, namely $\mathbf{r} \to \mathbf{r}' = S\,\mathbf{r} + \mathbf{d}$, where $S$ is a symplectic matrix and $\mathbf{d}$ is a displacement vector, see [25] for more details. By denoting the Wigner function before and after the symplectic transformation as $W(\mathbf{r})$ and $W'(\mathbf{r}')$, respectively, we have $W'(\mathbf{r}') = W(\mathbf{r})/|\det S|$. By doing the change of variables $\mathbf{r}' \to \mathbf{r}$, the level-functions corresponding to $W'$ and $W$ can then be shown to coincide as
$$m_{W'}(t) = \int \Theta\left(W'(\mathbf{r}') - t\right) d\mathbf{r}' = \int \Theta\!\left(\frac{W(\mathbf{r})}{|\det S|} - t\right) |\det S|\, d\mathbf{r} = m_W(t), \tag{20}$$
where we have used the fact that $\det S = 1$ for a symplectic transformation. Hence, all Gaussian pure states have a Wigner function that is level-equivalent to $W_0$, making them all equivalent to $W_0$ from the point of view of majorization, namely
$$W \equiv W_0, \qquad \forall\, W \in \mathcal{W}_+^{\mathrm{pure}}. \tag{21}$$
With this in mind, our main conjecture can be restated as follows:
$$W_0 \succ W, \qquad \forall\, W \in \mathcal{W}_+. \tag{22}$$
This expresses that, in the sense of majorization theory, the most fundamental (Wigner-positive) state is the vacuum state, i.e., the ground state of the Hamiltonian of the harmonic oscillator. Note that conjecture (22) goes beyond the scope of quantum optical states and applies to the phase space associated with any canonical pair $(x, p)$. Furthermore, it is unrelated to the Hamiltonian of the system: the (positive) Wigner function of any state of the system must always be majorized by function (18). As discussed in Sec. 4, function (18) can therefore be associated with the lowest-uncertainty state, even if it entails the lowest energy for the harmonic oscillator only.
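The invariance of the level-function under symplectic transformations can be observed numerically by comparing the vacuum with a squeezed vacuum. The sketch below is an illustration under stated assumptions: it uses the squeezing map $(x, p) \to (s\,x, p/s)$ with a hypothetical squeezing factor `s`, estimates the area of the region where the Wigner function exceeds $t$ by counting grid cells, and compares it with the closed form $-\pi \ln(\pi t)$ obtained from $W_0(r) = e^{-r^2}/\pi$.

```python
import math


def wigner_squeezed_vacuum(x, p, s):
    """Vacuum Wigner function after the symplectic map (x, p) -> (s x, p / s);
    s = 1 gives the vacuum itself. The map has unit determinant."""
    return math.exp(-(x / s) ** 2 - (s * p) ** 2) / math.pi


def level_function(s, t, half_width=6.0, n=500):
    """Area of the phase-space region where the Wigner function exceeds t,
    estimated by counting grid cells of a square grid."""
    h = 2 * half_width / n
    count = 0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            p = -half_width + (j + 0.5) * h
            if wigner_squeezed_vacuum(x, p, s) > t:
                count += 1
    return count * h * h


def level_function_exact(t):
    """Closed form for the vacuum: m(t) = -pi ln(pi t) for 0 < t <= 1/pi."""
    return -math.pi * math.log(math.pi * t)
```

Up to the discretization error of the grid, `level_function(1.0, t)` and `level_function(2.0, t)` return the same area: the disk of the vacuum is deformed into an ellipse of equal area.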

Restricted proof
In this Section, we make a first step towards proving conjecture (22) by considering a particular subset of Wigner-positive states, namely phase-invariant states that are restricted to two photons at most,
$$\hat{\rho} = (1 - p_1 - p_2)\, |0\rangle\langle 0| + p_1\, |1\rangle\langle 1| + p_2\, |2\rangle\langle 2|, \tag{23}$$
where $p_1, p_2 \geq 0$ and $p_1 + p_2 \leq 1$. These states form a convex set given by the area whose outer boundaries are the $p_1$-axis, the $p_2$-axis and the line satisfying $p_1 + p_2 = 1$, as pictured in Fig. 3. We will prove conjecture (22) for the set of Wigner-positive states of the form (23), denoted as $\mathcal{W}_+^{\mathrm{restr}}$, which has previously been studied in [26,28]. The set $\mathcal{W}_+^{\mathrm{restr}}$ is convex as well, since any convex mixture of Wigner-positive states is Wigner-positive, but the boundary of this set is nonetheless non-trivial, see Fig. 3. At the same time, this set is simple enough to enable a fully analytical proof of the conjecture (22).
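Whether a given pair $(p_1, p_2)$ yields a Wigner-positive state can be explored numerically. The sketch below is an illustration that assumes the mixture has the form $(1 - p_1 - p_2)|0\rangle\langle 0| + p_1 |1\rangle\langle 1| + p_2 |2\rangle\langle 2|$ and uses the radial Fock-state Wigner functions with $\hbar = 1$; scanning a finite set of radii can reveal negativity but does not by itself prove positivity. The function names and the tolerance are hypothetical choices.

```python
import math


def wigner_mixture(p1, p2, r):
    """Radial Wigner function (hbar = 1) of the assumed mixture
    (1 - p1 - p2)|0><0| + p1|1><1| + p2|2><2|,
    built from W_n(r) = (-1)^n / pi * exp(-r^2) * L_n(2 r^2) for n = 0, 1, 2."""
    z = 2 * r * r
    l0, l1, l2 = 1.0, 1.0 - z, 1.0 - 2 * z + z * z / 2
    return math.exp(-r * r) / math.pi * ((1 - p1 - p2) * l0 - p1 * l1 + p2 * l2)


def is_wigner_positive(p1, p2, r_max=6.0, steps=6000, tol=1e-12):
    """Scan the radial profile; a sample below -tol shows that the state
    is not Wigner-positive."""
    return all(wigner_mixture(p1, p2, k * r_max / steps) >= -tol
               for k in range(steps + 1))
```

For example, `is_wigner_positive(0.0, 0.5)` and `is_wigner_positive(0.5, 0.25)` return `True`, whereas `is_wigner_positive(1.0, 0.0)` and `is_wigner_positive(0.0, 0.6)` return `False`.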
As shown in [26], the boundary of the set $\mathcal{W}_+^{\mathrm{restr}}$ comprises the extremal states $\rho_a$, $\rho_b$, $\rho_c$ and $\rho_d$, represented by the corresponding letters in Fig. 3, as well as the segment of an ellipse connecting $\rho_c$ to $\rho_d$. Thus, any state in $\mathcal{W}_+^{\mathrm{restr}}$ can be written as a convex mixture of these extremal states.
Note that $\rho_a = |0\rangle\langle 0|$, which lies at the origin in Fig. 3, is simply the vacuum state, which will be proven to majorize every other state. The expressions of the Wigner functions of the first four extremal states (as a function of the parameter $r$) read as follows [26]:
$$W_0(r) = \frac{1}{\pi}\, e^{-r^2}, \quad W_b(r) = \frac{1}{\pi}\, e^{-r^2}\, r^2, \quad W_c(r) = \frac{1}{\pi}\, e^{-r^2} \left(r^2 - 1\right)^2, \quad W_d(r) = \frac{1}{2\pi}\, e^{-r^2}\, r^4. \tag{24}$$
In addition to these, there is a continuum of extremal states located on the segment of an ellipse connecting point c to point d in Fig. 3. Using a parameter $t \in [0, 1]$, the Wigner functions of these states, denoted as $V_t(r)$, can be parametrized following Ref. [26] [Eq. (25)].
We are now going to prove that Eq. (22) holds for all states contained in the convex set $\mathcal{W}_+^{\mathrm{restr}}$. To do so, it is sufficient to prove that the Wigner functions of all the extremal states are majorized by the Wigner function of the vacuum $W_0$. As a consequence of Eq. (10), this will indeed automatically imply that the same majorization relation holds for all convex mixtures of extremal states, hence for all states in $\mathcal{W}_+^{\mathrm{restr}}$. In order to prove our result, we begin by showing that a majorization relation on radial functions in $\mathbb{R}^n$ (here, we only need $n = 2$) is equivalent to a majorization relation on specific functions defined on the non-negative real line. This is the content of the following lemma, which we prove in Appendix A.1.
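Lemma 1. If $f$ and $g$ are two n-dimensional radial distributions defined on $\mathbb{R}^n$ such that $f(\mathbf{r}) = f_R(\|\mathbf{r}\|)$ and $g(\mathbf{r}) = g_R(\|\mathbf{r}\|)$ with $f_R$ and $g_R$ defined on $\mathbb{R}_+$, then $f \succ g$ is equivalent to $\tilde{f} \succ \tilde{g}$, where $\tilde{f}$ and $\tilde{g}$ are one-dimensional distributions on $\mathbb{R}_+$ built from $f_R$ and $g_R$.

Lemma 1 implies that a majorization relation between any two Wigner functions picked from $W_0$, $W_b$, $W_c$, $W_d$ and $V_t$ is equivalent to a majorization relation between the corresponding one-dimensional functions picked from $f_0$, $f_b$, $f_c$, $f_d$, and $g_t$, which are defined on $\mathbb{R}_+$ as
$$f_0(x) = e^{-x}, \quad f_b(x) = x\, e^{-x}, \quad f_c(x) = (x - 1)^2\, e^{-x}, \quad f_d(x) = \tfrac{1}{2}\, x^2\, e^{-x}, \tag{26}$$
together with the corresponding expression (27) for $g_t$. Thus, we need to prove now that $f_0$ majorizes $f_b$, $f_c$, $f_d$, and $g_t$. Our proof relies on the following lemma, which we prove in Appendix A.2 for completeness, as we could not find it in the literature.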

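Lemma 2. Consider two probability distributions $f$ and $g$ defined on the same domain $A$. If there exists a collection of level-equivalent distributions $f^{(\alpha)}$ on $A$ depending on the parameter $\alpha \in \Omega$, with $f^{(\alpha)} \equiv f$ for all $\alpha \in \Omega$, such that
$$g(\mathbf{r}) = \int_\Omega k(\alpha)\, f^{(\alpha)}(\mathbf{r})\, d\alpha,$$
where $k : \Omega \to \mathbb{R}_+$ is a probability measure on $\Omega$, then $f \succ g$.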
Lemma 2 enables us to prove that $f \succ g$ provided that we can build $g$ as some convex mixture of distributions that are level-equivalent to $f$. Note that it is very similar in spirit to the characterization of discrete majorization in terms of convex mixtures of permutations [2]. However, while the latter gives a necessary and sufficient condition, Lemma 2 only provides a sufficient condition for majorization. Although it is unknown, to our knowledge, whether it could be promoted to an equivalence (similarly to Proposition 3), a sufficient condition is all we need to prove the results of our paper.

Case of f b and f d
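Let us first prove that $f_0$ majorizes $f_b$ and $f_d$.
In order to make use of Lemma 2, we are going to build an appropriate collection of functions that are level-equivalent to $f_0$. One simple way to generate level-equivalent functions is to shift the original function to the right. Starting from $f_0$, we define the functions $f_0^{(\alpha)}$ labelled by the non-negative shift parameter $\alpha$ as
$$f_0^{(\alpha)}(x) = f_0(x - \alpha)\, \Theta(x - \alpha),$$
where $\Theta(z)$ represents the Heaviside step function. We obviously have that $f_0^{(\alpha)} \equiv f_0$ for all $\alpha \geq 0$. Now, define the probability densities $k_b(\alpha) = \exp(-\alpha)$ and $k_d(\alpha) = \alpha \exp(-\alpha)$, with $\alpha \in \mathbb{R}_+$. It is trivial to verify that $k_{b(d)}(\alpha) \geq 0$ for all $\alpha \in \mathbb{R}_+$ and $\int k_{b(d)}(\alpha)\, d\alpha = 1$. Furthermore, it can easily be shown that
$$f_b(x) = \int_0^\infty k_b(\alpha)\, f_0^{(\alpha)}(x)\, d\alpha, \qquad f_d(x) = \int_0^\infty k_d(\alpha)\, f_0^{(\alpha)}(x)\, d\alpha.$$
By Lemma 2, this implies that $f_0 \succ f_b$ and $f_0 \succ f_d$.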
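The mixture construction behind Lemma 2 can be verified by direct quadrature. The sketch below is an illustration which assumes $f_0(x) = e^{-x}$ on $\mathbb{R}_+$; under that assumption, mixing the shifted copies $f_0^{(\alpha)}$ with the densities $k_b$ and $k_d$ defined above gives $x\,e^{-x}$ and $x^2 e^{-x}/2$, respectively. These closed forms, as well as the function names, are assumptions of the example.

```python
import math


def f0(x):
    """Assumed one-dimensional vacuum profile on R+: f_0(x) = exp(-x)."""
    return math.exp(-x) if x >= 0 else 0.0


def f0_shifted(x, alpha):
    """f_0 shifted right by alpha: f_0(x - alpha) * Theta(x - alpha)."""
    return f0(x - alpha)


def k_b(alpha):
    """Mixing density k_b(alpha) = exp(-alpha)."""
    return math.exp(-alpha)


def k_d(alpha):
    """Mixing density k_d(alpha) = alpha * exp(-alpha)."""
    return alpha * math.exp(-alpha)


def mixture(k, x, steps=2000):
    """Midpoint-rule estimate of the integral of k(alpha) f_0^(alpha)(x) over alpha.
    The integrand vanishes for alpha > x, so the integral stops at alpha = x."""
    if x <= 0:
        return 0.0
    d_alpha = x / steps
    return sum(k((j + 0.5) * d_alpha) * f0_shifted(x, (j + 0.5) * d_alpha) * d_alpha
               for j in range(steps))
```

Comparing `mixture(k_b, x)` with `x * math.exp(-x)` and `mixture(k_d, x)` with `x ** 2 * math.exp(-x) / 2` at a few points confirms the two identities numerically.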

Case of f c and g t
The same method is not directly applicable to prove that $f_0$ majorizes $f_c$ and $g_t$ because the latter functions are non-zero at the origin. The trick, however, is to exploit the fact that $f_c$ looks like a rescaled version of $f_d$ in the domain $[1, \infty)$. We can then "split" $f_c$ into two parts and prove the majorization relation separately for each part. This is possible as a consequence of the following lemma, which we prove in Appendix A.3.

Lemma 3. Consider four functions $f_1$, $f_2$, $g_1$, and $g_2$ defined on the same domain $A$ and such that $f_1$ and $f_2$ do not both take non-zero values at the same element of $A$, and similarly $g_1$ and $g_2$ do not both take non-zero values at the same element of $A$. If the functions satisfy $f_1 \succ g_1$ and $f_2 \succ g_2$, then $(f_1 + f_2) \succ (g_1 + g_2)$.

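In light of Lemma 3, define the two functions
$$f_c^-(x) = f_c(x)\,\Theta(1-x)$$
and
$$f_c^+(x) = f_c(x)\,\Theta(x-1).$$
Obviously, we have $f_c^- + f_c^+ = f_c$. In order to prove that $f_0 \succ f_c$ by using Lemma 3, we also need to "split" $f_0$ into two parts $f_0^-$ and $f_0^+$ such that $f_0^- + f_0^+ = f_0$. Moreover, in order to be able to apply majorization on each part, $f_0^-$ must have the same normalization as $f_c^-$, and similarly for $f_0^+$ and $f_c^+$. Define $x^* = 1 - \ln 2$, and note that
$$\int_0^{x^*} f_0(x)\, dx = \int_0^1 f_c(x)\, dx = 1 - 2\exp(-1).$$
With this in mind, the functions $f_0^-$ and $f_0^+$ on $\mathbb{R}_+$ are
$$f_0^-(x) = f_0(x)\,\Theta(x^*-x)$$
and
$$f_0^+(x) = f_0(x)\,\Theta(x-x^*).$$
It follows from these definitions that, for $x \ge 0$, $f_0^+(x + x^*)$ and $f_c^+(x + 1)$ are proportional to $f_0(x)$ and $f_d(x)$, respectively, with the same proportionality factor of $2\exp(-1)$. Since we have already shown that $f_0 \succ f_d$, and since shifting a function yields a level-equivalent function, it follows that $f_0^+ \succ f_c^+$.
Next, we note that $f_0^-$ and $f_c^-$ are both monotonically decreasing functions, so they coincide with their decreasing rearrangements, namely $(f_0^-)^\downarrow = f_0^-$ and $(f_c^-)^\downarrow = f_c^-$. Therefore, their cumulative integrals are simply given by
$$S_{f_0^-}(s) = \int_0^s f_0^-(x)\, dx, \qquad S_{f_c^-}(s) = \int_0^s f_c^-(x)\, dx. \qquad (38)$$
In order to prove the majorization relation, we will now show that $S_{f_0^-}(s) \ge S_{f_c^-}(s)$ for all $s \in \mathbb{R}_+$. Since $f_0^-$ and $f_c^-$ are both monotonically decreasing functions with the same normalization, since $f_0^-(x) = 0$ for all $x > x^*$, and since $x^* < 1$, it is sufficient to show that $f_0^-(x) \ge f_c^-(x)$ for all $x \in [0, x^*]$, which can be checked directly from the expressions of $f_0$ and $f_c$. Using Definition 3 of a majorization relation, we conclude that $f_0^- \succ f_c^-$, so that Lemma 3 gives $f_0 \succ f_c$.
Finally, the same "splitting" technique can be used to prove that $f_0 \succ g_t$ for all values of $t \in [0, 1]$, which of course includes $f_0 \succ f_d$ and $f_0 \succ f_c$ as limiting cases for $t = 0$ and $1$, respectively. We point the interested reader to Appendix B for such a proof.
In summary, we have thus shown that all functions (i.e., $f_b$, $f_c$, $f_d$, and $g_t$ for all $t \in [0, 1]$) are majorized by $f_0$. Using Lemma 1, this translates into the fact that the Wigner functions of all extremal states (i.e., $W_b$, $W_c$, $W_d$, and $V_t$ for all $t \in [0, 1]$) are majorized by $W_0$. Hence, any convex mixture of these extremal Wigner functions is also majorized by $W_0$, since majorization is preserved under convex mixtures of the majorized functions.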
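The "splitting" argument can be checked numerically with a short Python sketch. It assumes the explicit forms $f_0(x) = e^{-x}$ and $f_c(x) = (x-1)^2 e^{-x}$, which are not restated in this section, and it takes the split points to be $x^* = 1 - \ln 2$ for $f_0$ and $x = 1$ for $f_c$. The function names and grid sizes are illustrative choices.

```python
import numpy as np

x_star = 1.0 - np.log(2.0)

def f0(x):
    return np.exp(-x)                      # assumed form of f_0

def fc(x):
    return (x - 1.0) ** 2 * np.exp(-x)     # assumed form of f_c

def cumulative(f, upper, s, n=4001):
    """Integral of f over [0, min(s, upper)], i.e. the cumulative integral
    of the truncated (monotonically decreasing) part f * Theta(upper - x)."""
    xs = np.linspace(0.0, min(s, upper), n)
    ys = f(xs)
    return float(np.sum((ys[1:] + ys[:-1]) * np.diff(xs)) / 2.0)

# Both "minus" parts should carry the same weight 1 - 2/e.
norm_f0_minus = cumulative(f0, x_star, np.inf)
norm_fc_minus = cumulative(fc, 1.0, np.inf)

# Difference of cumulative integrals on a grid of s values.
s_grid = np.linspace(0.0, 1.5, 301)
gap = np.array([cumulative(f0, x_star, s) - cumulative(fc, 1.0, s)
                for s in s_grid])

print(norm_f0_minus, norm_fc_minus, 1.0 - 2.0 / np.e)
print(gap.min())  # non-negative up to quadrature error
```

The gap vanishes at $s = 0$ and again for $s \ge 1$, where both cumulative integrals have reached their common normalization, and it is positive in between.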

Discussion and conclusion
In the present work, we have shown that continuous majorization theory proves to be an elegant and powerful tool for exploring the information-theoretic properties of Wigner functions representing quantum states in phase space. While it only applies to states admitting a positive Wigner function, continuous majorization should nevertheless pave the way to the proof of various entropic inequalities of interest in quantum physics. This is so because, as explained in Sec. 2, a continuous majorization relation implies an infinite set of entropic inequalities for the Wigner functions involved. For instance, an interesting application concerns the entropic uncertainty relation due to Białynicki-Birula and Mycielski [44], which can be viewed as the entropic counterpart of the Heisenberg uncertainty relation $\sigma_x \sigma_p \ge 1/2$, where $\sigma_x$ and $\sigma_p$ denote the standard deviations of the $x$-distribution $\rho_x$ and $p$-distribution $\rho_p$ of a quantum state. The entropic uncertainty relation reads $h(\rho_x) + h(\rho_p) \ge \ln \pi + 1$, where the right-hand side is simply the Shannon (differential) entropy of the Wigner function of the vacuum, which is also equal to the sum of the entropies of its marginals. This inequality is, however, not fully satisfying as it is not saturated for all Gaussian pure states [23]. For Wigner-positive states, the bound on the Wigner entropy $h(W) \ge h(W_0) = \ln \pi + 1$ has been conjectured (and proven in some cases) in [26]. Since $h(W) = h(\rho_x) + h(\rho_p) - I(W)$, where $I(W) \ge 0$ is the mutual information between $x$ and $p$, this bound on the Wigner entropy yields a stronger entropic uncertainty relation in the presence of $x$-$p$ correlations. Remarkably, our conjecture (22) that $W_0 \succ W$ for all Wigner-positive states in $\mathcal{W}_+$ then immediately implies this strong bound on the Wigner entropy.¹
It actually also implies similar bounds on all Rényi entropies $h_\alpha(W)$ as well as on all concave functionals $\phi(W)$ as defined in Eq. (5). This clearly illustrates how majorization theory can serve as an efficient tool towards the strengthening of entropic uncertainty relations.
Figure 5: Cumulative integral $S_W$ as a function of the area parameter $a = \pi s^2$ for positive Wigner functions $W$. The red curve represents the cumulative integral for $W_0$, while each blue curve represents the cumulative integral for a random Wigner-positive state. In total, 1000 random instances of Wigner-positive states are uniformly sampled over the set of mixed states containing up to 2 photons (more details on the numerics are given in Appendix C). The red curve lies above all blue curves, in agreement with conjecture (22).
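The vacuum value $h(W_0) = \ln \pi + 1$ and the direction of the entropic bounds can be illustrated on a grid with the Python sketch below. It assumes the convention $\hbar = 1$ with $W_0(x,p) = e^{-(x^2+p^2)}/\pi$, and it uses as a test state the equal mixture of the vacuum and the single-photon state, whose Wigner function is taken to be $(x^2+p^2)\,e^{-(x^2+p^2)}/\pi$. Both expressions, as well as the grid parameters, are assumptions made for illustration only.

```python
import numpy as np

L, n_L = 12.0, 601                       # assumed grid length and resolution
xs = np.linspace(-L / 2, L / 2, n_L)
dA = (xs[1] - xs[0]) ** 2                # area element of the square grid
X, P = np.meshgrid(xs, xs)
R2 = X**2 + P**2

W0 = np.exp(-R2) / np.pi                 # vacuum (assumed convention hbar = 1)
Wb = R2 * np.exp(-R2) / np.pi            # assumed: equal mixture of |0> and |1>

def shannon_entropy(W):
    """Differential Shannon entropy -int W ln W, skipping points where W = 0."""
    w = W[W > 0]
    return float(-np.sum(w * np.log(w)) * dA)

def renyi_entropy(W, alpha):
    """Renyi entropy ln(int W^alpha) / (1 - alpha), for alpha != 1."""
    return float(np.log(np.sum(W**alpha) * dA) / (1.0 - alpha))

h0, hb = shannon_entropy(W0), shannon_entropy(Wb)
print(h0, np.log(np.pi) + 1.0)           # both close to 2.1447
print(hb >= h0, renyi_entropy(Wb, 2.0) >= renyi_entropy(W0, 2.0))
```

For this particular test state, both the Shannon entropy and the Rényi entropy of order 2 exceed their vacuum values, as expected from the Schur-concavity of these functionals.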
In this paper, we have proven the fundamental majorization relation $W_0 \succ W$ restricted to states in the set $\mathcal{W}^{\mathrm{restr}}_+$ of phase-invariant states with two photons at most. We believe the same techniques laid out here could be exploited to prove our majorization relation for all states in $\mathcal{W}_+$. We choose, however, to leave this investigation for future work, but we provide numerical evidence supporting our conjecture. In Figure 5, we plot the cumulative integral of the Wigner function $W_0$ and compare it with the cumulative integrals of the Wigner functions $W$ of some randomly chosen (hence, non-Gaussian) Wigner-positive states. We see that $S_{W_0}(s) \ge S_W(s)$ for all $s \ge 0$, which confirms that $W_0 \succ W$ for these states in view of Definition 3. In Figure 6, we show that the Rényi entropy $h_\alpha$ of the Wigner function $W_0$ is smaller, for several values of the parameter $\alpha$, than its value for the same Wigner functions $W$. This is again consistent with $W_0 \succ W$ in view of Definition 4 and the fact that $h_\alpha$ is Schur-concave. The interested reader can find the details on our numerical method in Appendix C.
Interestingly, our conjecture (22) bears some resemblance to the so-called generalized Wehrl's conjecture [45]. It has indeed been proven by Lieb and Solovej [46,47] that any concave function of the Husimi Q-function of a state is lower bounded by the same function applied to the Husimi Q-function of the vacuum state (or any coherent state). This is actually equivalent to stating that the Husimi function of the vacuum majorizes the Husimi function of any other state. Intriguingly, this is not proved in [47] with continuous majorization, but using discrete majorization for finite-dimensional spin-coherent states followed by a limiting argument. Following the same line of thought as in [26], our conjecture (22) actually implies the generalized Wehrl's conjecture (since any Husimi function is also the Wigner function of another physical state), but the converse is not true, as it is easy to produce positive Wigner functions that do not coincide with the Husimi function of a physical state.
Beyond proving our majorization relation (22), the next step would of course be to extend it to all Wigner-positive states in arbitrary dimensions. We conjecture that any $N$-mode Wigner-positive state has a Wigner function that is majorized by that of any $N$-mode pure Gaussian state. A more challenging direction of research would then be to account for partly negative Wigner functions. One way of doing so could be to apply a majorization relation to a carefully chosen non-negative distribution that characterizes any Wigner function. Another way could be to extend the notion of majorization to partly negative functions defined on $\mathbb{R}^n$ (this is known to be possible for functions defined on a finite domain [2]), which would imply proper inequalities involving the entropies of the marginals of any Wigner function. However, we do not expect a straightforward extension of our majorization conjecture to exist for Wigner-negative states. Indeed, consider the negative volume of a Wigner function defined as $\mathrm{Vol}_-(W) = -\iint [W(x, p)]_-\, dx\, dp$. It is easily seen that the function $\varphi(x) = -[x]_- = -\min(x, 0)$ is convex, so that $\mathrm{Vol}_-(\cdot)$ is Schur-convex. As a consequence, we understand that any Wigner function with non-zero negative volume cannot be majorized by a Wigner function with zero negative volume, such as the one of a Gaussian pure state. Overall, we anticipate that the application of continuous majorization theory to Wigner functions will prove very fruitful for elucidating the properties of quantum states in phase space.
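To make the negative volume concrete, the following Python sketch evaluates $\mathrm{Vol}_-$ on a grid for the vacuum and for the single-photon Fock state. It assumes the standard expressions (with $\hbar = 1$) $W_0 = e^{-r^2}/\pi$ and $W_1 = (2r^2 - 1)\,e^{-r^2}/\pi$ with $r^2 = x^2 + p^2$, which are not given in this section. The grid parameters are arbitrary illustrative choices.

```python
import numpy as np

L, n_L = 12.0, 1201                      # assumed grid length and resolution
xs = np.linspace(-L / 2, L / 2, n_L)
dA = (xs[1] - xs[0]) ** 2                # area element of the square grid
X, P = np.meshgrid(xs, xs)
R2 = X**2 + P**2

W0 = np.exp(-R2) / np.pi                 # vacuum (assumed, hbar = 1)
W1 = (2.0 * R2 - 1.0) * np.exp(-R2) / np.pi   # single-photon state (assumed)

def negative_volume(W):
    """Vol_-(W) = - integral of min(W, 0) over phase space (Riemann sum)."""
    return float(-np.sum(np.minimum(W, 0.0)) * dA)

vol0 = negative_volume(W0)
vol1 = negative_volume(W1)
print(vol0)                              # zero for a positive Wigner function
print(vol1, 2.0 / np.sqrt(np.e) - 1.0)   # numerical vs. analytical value
```

Under these assumptions the single-photon state has $\mathrm{Vol}_-(W_1) = 2/\sqrt{e} - 1$, obtained by integrating $W_1$ over the disc $r^2 < 1/2$ where it is negative, whereas the vacuum has zero negative volume.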
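since $t \ge 0$. The same applies to $g_1$ and $g_2$. Let $f = f_1 + f_2$ and $g = g_1 + g_2$. We have:
$$\int_A [f - t]_+\, dx = \int_A [f_1 - t]_+\, dx + \int_A [f_2 - t]_+\, dx \ge \int_A [g_1 - t]_+\, dx + \int_A [g_2 - t]_+\, dx = \int_A [g - t]_+\, dx,$$
where the inequality follows from $f_1 \succ g_1$ and $f_2 \succ g_2$.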

B Proof of the majorization relation for the states located on the ellipse
In this Appendix, we prove that the Wigner functions of the extremal Wigner-positive states located on the ellipse represented in Fig. 3 are majorized by the Wigner function of the vacuum state, by showing that $f_0 \succ g_t$ for all $t$ such that $0 \le t \le 1$. Note that the proof is very similar to the proof of $f_0 \succ f_c$. The function $g_t(x)$ defined in Eq. (27) has one zero at $x = a_t$. We "split" $g_t$ into two different functions $g_t^-$ and $g_t^+$.
Figure 7: Logarithmic plot of $P_N$ as a function of $N$, where $P_N$ is the probability to get a Wigner-positive state when uniformly sampling the set of density operators containing up to $N$ photons. For each value of $N$, we estimate $P_N$ from a set of $10^5$ random states.
check whether it is positive. If it is the case, the next step is to compare the cumulative integral of $W$ to that of the vacuum. The cumulative integral of $W$ can be approximated in the following manner. We first convert the $n_L \times n_L$ matrix $W$ into a $1 \times n_L^2$ vector $v$. We then sort the vector in decreasing order, which gives us $v^\downarrow$, and compute its cumulative sum $S$:
$$S_i = \frac{L^2}{(n_L - 1)^2} \sum_{j=1}^{i} v^\downarrow_j.$$
The constant $L^2/(n_L - 1)^2$ accounts for normalization and corresponds to the surface element associated with a point of the square grid (both ends of the length-$L$ segment are included). Similarly to $v$, $S$ is a $1 \times n_L^2$ vector. Note that the cumulative sum $S$ is not simply the discretization of the cumulative integral $S_W(s)$ as defined in (8).
In the cumulative integral $S_W(s)$, the value of the parameter $s$ defines an area equal to $\pi s^2$, whereas the $i$th component of the cumulative sum $S$ corresponds to an area of $i \times L^2/(n_L - 1)^2$. Therefore, $S$ and $S_W$ are related as:
$$S_W(s) \approx S_i \quad \text{with} \quad i \approx \pi s^2\, (n_L - 1)^2 / L^2.$$
Finally, checking numerically that a positive Wigner function $W_1$ majorizes another positive Wigner function $W_2$ amounts to checking that the difference of their respective cumulative sums is non-negative:
$$S_i^{(1)} - S_i^{(2)} \ge 0 \quad \text{for all } i,$$
where $S^{(1)}$ and $S^{(2)}$ denote the cumulative sums of $W_1$ and $W_2$, respectively.
[Figure caption] Colors are associated with $N$ as follows: red to $N = 0$, blue to $N = 1$, orange to $N = 2$, green to $N = 3$, purple to $N = 4$, brown to $N = 5$, pink to $N = 6$, gray to $N = 7$, olive to $N = 8$, cyan to $N = 9$, black to $N = 10$. We observe that cumulative integrals are in general smaller for high $N$ than for small $N$. Also, notice that every sampled state satisfies the conjecture since the cumulative integral of vacuum ($N = 0$, red line) is the highest curve.
For simplicity, we introduce the area parameter $a = \pi s^2$ and, with a slight abuse of notation, designate by $S_W(a)$ the function $S_W(s = \sqrt{a/\pi})$.
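The numerical test described in this appendix can be sketched in a few lines of Python. The code below is only an illustration under stated assumptions: the function names, the grid values ($L = 12$, $n_L = 401$), and the round-off tolerance are hypothetical choices, and the two example Wigner functions (the vacuum $e^{-r^2}/\pi$ and the mixture $r^2 e^{-r^2}/\pi$ with $r^2 = x^2 + p^2$) are assumed expressions that are not restated here.

```python
import numpy as np

def cumulative_sum(W, L):
    """Sorted cumulative sum S of a Wigner function sampled on an
    n_L x n_L square grid of side length L (both ends included)."""
    n_L = W.shape[0]
    v_desc = np.sort(W.ravel())[::-1]            # v sorted in decreasing order
    return np.cumsum(v_desc) * L**2 / (n_L - 1) ** 2

def majorizes(W1, W2, L, tol=1e-9):
    """True if S[W1] - S[W2] >= -tol for every component.
    The tolerance (an assumption) only absorbs floating-point round-off."""
    return bool(np.all(cumulative_sum(W1, L) - cumulative_sum(W2, L) >= -tol))

# Example on an assumed grid, with assumed explicit Wigner functions.
L, n_L = 12.0, 401
xs = np.linspace(-L / 2, L / 2, n_L)
X, P = np.meshgrid(xs, xs)
R2 = X**2 + P**2
W0 = np.exp(-R2) / np.pi                 # vacuum
Wb = R2 * np.exp(-R2) / np.pi            # equal mixture of |0> and |1>

print(majorizes(W0, Wb, L), majorizes(Wb, W0, L))
```

Sorting the flattened grid plays the role of the decreasing rearrangement, so no knowledge of the level sets of $W$ is needed. The price is that the grid must be fine and wide enough for both cumulative sums to converge to 1.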

C.3 Further numerical evidence
Let us first make an observation on the fraction of Wigner-positive states among the set of quantum states. We call $P_N$ the probability to get a Wigner-positive state when randomly sampling the set of density operators containing up to $N$ photons (we use an idea similar to that of [49]). As one can see in Figure 7, the probability $P_N$ decreases exponentially as a function of $N$. Consequently, it becomes increasingly difficult to randomly generate Wigner-positive states as we expand the size $N$ of the Fock space. For this reason, we have limited our numerical exploration to $N = 10$. Figure 8 then illustrates the validity of the conjecture for random Wigner-positive states drawn according to the previously described technique up to $N = 10$.
At this point, it is interesting to mention another technique to generate random Wigner-positive states in a deterministic fashion, i.e., with probability 1 that the random state is Wigner-positive. In [26], we highlighted a particular set of Wigner-positive states built with a balanced beam-splitter. Let us consider a state $\hat{\sigma}$ built from two single-mode states $\hat{\rho}_A$ and $\hat{\rho}_B$ as follows:
$$\hat{\sigma} = \mathrm{Tr}_2\!\left[\hat{U}_{1/2}\,(\hat{\rho}_A \otimes \hat{\rho}_B)\,\hat{U}_{1/2}^\dagger\right], \qquad (66)$$
where $\hat{U}_{1/2}$ is the two-mode unitary corresponding to a beam-splitter of transmissivity 1/2. Such a state $\hat{\sigma}$ is Wigner-positive for any choice of $\hat{\rho}_A$ and $\hat{\rho}_B$ (see [26]). We can thus use Eq. (66) with random $\hat{\rho}_A$ and $\hat{\rho}_B$ to generate random states that will be Wigner-positive with certainty. It should be noted, however, that this technique does not span the entire Wigner-positive set since there exist Wigner-positive states that cannot be expressed in the form of (66), see [26]. In (66), notice that if $\hat{\rho}_A$ and $\hat{\rho}_B$ are mixed states, $\hat{\sigma}$ can be decomposed as a mixture of beam-splitter states built from pure states. To test our conjecture over the set of beam-splitter states, it is thus sufficient to limit our study to $\hat{\rho}_A$, $\hat{\rho}_B$ being pure states (see majorization property (10)). Let us consider pure states $|\psi_i\rangle$ which are finite superpositions of the first Fock states:
$$|\psi_i\rangle = \sum_{n=0}^{N_i} c^{(i)}_n\, |n\rangle,$$
where $i \in \{A, B\}$. With $|\psi_A\rangle$ and $|\psi_B\rangle$ containing respectively up to $N_A$ and $N_B$ photons, the state $\hat{\sigma}$ belongs to the span of Fock states up to $N = N_A + N_B$. We have tested the conjecture for various choices of $(N_A, N_B)$ such that $N_A + N_B \le 10$, and have found that each instance satisfies the conjecture. This is illustrated in Figure 9.