Explicit asymptotic secret key rate of continuous-variable quantum key distribution with an arbitrary modulation

We establish an analytical lower bound on the asymptotic secret key rate of continuous-variable quantum key distribution with an arbitrary modulation of coherent states. Previously, such bounds were only available for protocols with a Gaussian modulation, and numerical bounds existed in the case of simple phase-shift-keying modulations. The latter bounds were obtained as a solution of convex optimization problems and our new analytical bound matches the results of Ghorai et al. (2019), up to numerical precision. The more relevant case of quadrature amplitude modulation (QAM) could not be analyzed with the previous techniques, due to their large number of coherent states. Our bound shows that relatively small constellation sizes, with say 64 states, are essentially sufficient to obtain a performance close to a true Gaussian modulation and are therefore an attractive solution for large-scale deployment of continuous-variable quantum key distribution. We also derive similar bounds when the modulation consists of arbitrary states, not necessarily pure.


Introduction and main results
Quantum key distribution (QKD) allows two distant parties with access to a quantum channel and an authenticated classical channel to share a secret key that can later encrypt classical messages [34,38]. While the first protocols such as the celebrated Bennett-Brassard 84 protocol [1] all relied on the exchange of discrete variables (DV) encoded for instance on the polarization of single photons, more recent protocols increasingly rely on a continuous-variable (CV) encoding in the quadratures of the quantized electromagnetic field, which benefits from state-of-the-art techniques in coherent optical telecommunication. This is particularly interesting since we are still at the early stages of a possible large-scale deployment of QKD, a deployment that would be greatly facilitated if the required technologies for QKD were fully compatible with standard telecom equipment. One can argue that CV QKD satisfies this description since the quantum part of the protocol consists in the exchange of coherent states modulated in phase-space and their measurement with coherent detection. Roughly speaking, the main difference with classical coherent optical communication is that CV QKD works in the quantum regime with attenuated coherent states and low-noise detectors.
CV QKD comes with some difficulties, however. In particular, security proofs for CV QKD are more complex since one cannot avoid a description in the full infinite-dimensional Fock space, while DV QKD protocols can more conveniently be described with Hilbert spaces of small dimension, making their theoretical analysis simpler. The crux of the problem is that one needs to be able to gather some statistics in the protocol (typically characterizing the level of correlations between the states sent by the first party, Alice, and the data obtained by the second party, Bob) and to infer how much information was obtained by a potential adversary controlling the quantum channel. In a DV protocol, the quantum channel acts on a low-dimensional quantum system and can therefore be relatively well constrained by measuring simple quantities like the quantum bit error rate. For a CV protocol on the other hand, the quantum channel acts on the full Fock space and is usually more difficult to characterize from easily accessible statistics.
At the moment, the only CV QKD protocols with a reasonably well-understood security proof are those where Alice prepares coherent states with a Gaussian modulation (see footnote 1). This means that for each use of the channel, she draws a random complex variable α from a Gaussian distribution and sends the coherent state $|\alpha\rangle = e^{-|\alpha|^2/2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}} |n\rangle$ to Bob. If Bob's measurement is a heterodyne detection, this corresponds to the no-switching protocol [43]. The phase-space symmetries of this protocol allow one to apply the Gaussian de Finetti theorem which asserts that Gaussian attacks are asymptotically optimal [23,24]. In other words, forgetting for the moment about finite-size effects, one can simply assume that the unknown channel between Alice and Bob is the Gaussian channel compatible with the statistics observed by Alice and Bob.
Unfortunately, a Gaussian modulation is merely a theoretical idealization since in practice modulators have a finite range and precision, meaning that the true number of states possibly available is finite. For instance, if the modulator has 8 bits of precision, we get $2^8 = 256$ values per quadrature and $2^{16} = 65536$ possible coherent states. While this number certainly looks large, is it really the case that a CV QKD protocol with this many states automatically inherits the security guarantees derived for a Gaussian modulation? Ref. [20] looked at this specific question and found that, modulo some mild additional assumptions, it seems likely that the asymptotic secret key rate would be close to that of the Gaussian modulation for constellations of size greater than 5000. The approach there is to show that if the constellation is sufficiently close to the Gaussian one, then it is possible to exploit continuity bounds on the secret key rate together with the established security proofs for the Gaussian modulation in order to get reasonable numerical bounds for the secret key rate, when the constellation is large enough. This method, however, does not seem well-suited to address the case of significantly smaller constellation sizes.
Footnote 1: Another CV QKD protocol with a full security proof relies on the exchange of squeezed states, combined with a homodyne measurement for Bob (that is, Bob measures only one of the two quadrature operators). This protocol is however significantly less practical than protocols with coherent states [3,9].
At the other end of the spectrum, it is tempting to drastically reduce the number of coherent states in the constellation to simplify as much as possible the hardware requirements of the protocols as well as the reconciliation procedure (where Alice and Bob extract a common raw key from their correlated data). Protocols with 2, 3 or 4 coherent states have been considered in the literature and are part of the general class of M-PSK (phase-shift keying) protocols where Alice sends coherent states of the form $|\alpha_k\rangle = |\alpha e^{2\pi i k/M}\rangle$ for some α > 0 [2,16,17,25,28,30,33,39,47]. While M = 2 or 3 appear to be too small to yield good performance, the 4-PSK (also known as quadrature phase-shift keying, QPSK) modulation scheme has attracted some interest since it performs reasonably well, although quite far from a Gaussian modulation. Until the recent works of Refs [12,27], all the security proofs for the QPSK protocol were restricted to the class of Gaussian attacks (meaning that the quantum channel is assumed to be Gaussian); it is believed that such attacks are not optimal for these protocols. The strategy in both Refs [12,27] consists in expressing the asymptotic secret key rate as a convex optimization problem, and more precisely a semidefinite program (SDP). The main difference between the two papers is that Ref. [12] considers a linear objective function, while Ref. [27] relies on a tighter nonlinear objective function.
While the latter case is expected to give a better bound (at the price of being much more computationally intensive), the results cannot be directly compared since the models and assumptions for the error correction part of the protocol are very different (see Section 10.2 for a discussion of this point). In both cases, a truncated version of the relevant SDP is solved numerically: this means that the operators are described in a truncated Fock space, spanned by Fock states with fewer than $N_{\max}$ photons, typically between 10 and 20 photons. Very recently, Ref. [41] showed how to get rid of this truncation by introducing extra constraints in the SDP, namely constraints on the fourth moments of the data obtained by Alice and Bob. While the approaches of [12,27,41] can in principle be adapted to arbitrary modulation schemes, they are numerically intensive and it is unlikely that they can be easily applied beyond moderately small PSK modulations. In fact, Ref. [33], which only looks at the simpler case of Gaussian (hence likely non-optimal) attacks, comments that several hours of CPU time are needed to get an accurate bound on the secret key rate.

Results and open questions.
A pressing open question in the field is therefore to obtain reasonably tight bounds for the asymptotic secret key rate of CV QKD with arbitrary modulation schemes, that can be easily computed, without relying on intensive computational methods. Without this, it seems rather hopeless to try to address the next important challenge which will concern the non-asymptotic regime. We solve this problem here: we give an explicit analytical formula for the asymptotic secret key rate of any CV QKD protocol. While we focus more on the case of heterodyne detection, our bounds work just as well for protocols with homodyne detection [15]. Our formula matches the numerical bound from Ref. [12] in the case of M -PSK modulation of coherent states (except in the regime of very low loss combined with high noise, which is not relevant for experiments) and recovers the known values in the case of a Gaussian modulation. Our results show that relatively small constellations of size 64, say, are essentially enough to get a performance close to the Gaussian modulation scheme. A major advantage of the quadrature amplitude modulation such as 64-QAM over QPSK (in addition to the much better secret key rate) is that it allows for implementations with large modulation variance, and therefore bypasses the need to work with an extremely low signal-to-noise ratio (SNR).
Another advantage of our method is that our analytical formula allows one to address the issue of imperfect state preparation. More precisely, in a given protocol, Alice will never be able to prepare the exact states from the theoretical constellation, and will inevitably make some preparation errors. Quantifying their impact on the security is not trivial if one only has access to numerical bounds, but this becomes possible with analytical bounds by analyzing their dependence on the constellation. We show in Section 9 how to modify our bound if Alice sends some (potentially mixed) state $\tau_k$ instead of $|\alpha_k\rangle$. The same bounds also apply to the case of a modulation of single-mode squeezed states, although such protocols are less appealing from a practical point of view.
Yet another advantage of easily computable bounds is that they will allow for a better optimization of the constellation. While the PSK modulation does not offer much freedom since the only parameters are the number of states and the amplitude α of the coherent states, more complex constellations can have many adjustable parameters: the coherent states can lie on a grid, but not necessarily, and one can also freely choose the probabilities associated to each state. In this paper, we focus on simple QAM with equidistant coherent states, and only compare two possible choices for the probability distribution (discrete Gaussian vs binomial). While the precise form of the constellation does not seem to impact the performance too much for a 64-QAM or larger constellations, we expect that smaller constellations will need to be more carefully designed in order to optimize the secret key rate. Such optimizations should include considerations about error correction, and are also beyond the scope of this paper.
A natural open question concerns the case of the QPSK modulation. For this specific choice of constellation, our results (which coincide with Ref. [12]) appear much more pessimistic than those of Ref. [27]. This is due in part to the different choice of objective function and it would be very interesting to understand whether an analytical bound much tighter than ours could be derived explicitly. For larger constellations, our bound is necessarily almost tight since it is very close to the (tight) bound corresponding to a Gaussian modulation (see Section 11).
While we focus on one-way QKD protocols here for simplicity, we note that similar questions are relevant for measurement-device-independent protocols [35]. In that case, both Alice and Bob are expected to send states with a possibly very fine, but discrete, constellation approaching a Gaussian modulation. It would be interesting to understand how to extend our results to this scenario.
The asymptotic secret key rate is an interesting figure of merit that is useful to easily compare various protocols, either DV or CV, under some given experimental conditions. However, it is not quite sufficient to assess the security of a given protocol. What is needed is in fact a composable security proof valid against general attacks, in the finite-size regime. Obtaining such a security proof has turned out to be quite challenging in the case of the Gaussian modulation with a proof based on a Gaussian de Finetti theorem [23] while the asymptotic secret key rate formula was established more than 10 years earlier [10,32]. Similarly, we do not give a full composable security proof here, but show that probably the two most impacting finite-size effects (see discussion in Section 10.1), namely the parameter estimation procedure and the error reconciliation procedure (see discussion in Section 10.2), should not be significantly more difficult to handle than they are in the case of Gaussian modulation.
Structure of the paper. We describe the general form of CV QKD protocols with coherent states in Section 2. We explain in Section 3 how to compute the asymptotic secret key rate given by the Devetak-Winter bound thanks to an equivalent entanglement-based version of the protocol. In Section 4, we define our main lower bound on the Devetak-Winter bound as the solution of a semidefinite program. We study this SDP in Section 5 and establish an analytical lower bound on its value. This bound is our main technical contribution. In Sections 6 and 7, we show how to recover the known bound for protocols with a Gaussian modulation and the known numerical bound for protocols with an M-PSK modulation. We discuss in Section 8 the choice of more complex modulation schemes, namely QAM. We show in Section 9 how to generalize our bound for protocols where Alice sends arbitrary states instead of coherent states. We address some important finite-size effects in Section 10, notably parameter estimation and the reconciliation procedure. Finally, we discuss some numerical results in Section 11.
CV QKD protocols with an arbitrary modulation of coherent states
Modulation schemes. We consider the following Prepare-and-Measure (PM) protocol where Alice sends coherent states chosen from a discrete modulation to Bob, who measures them with coherent (heterodyne) detection. A heterodyne detection refers here to a double-homodyne detection, where Bob splits the signal on a balanced beamsplitter and measures the $\hat{x}$ quadrature of the first output mode and the $\hat{p}$ quadrature of the second output mode. The modulation scheme is defined by a set of coherent states $\{|\alpha_k\rangle\}$, called the constellation, where a state $|\alpha_k\rangle$ is chosen with probability $p_k$. This information can be summarized by a density matrix τ given by the weighted mixture of coherent states, and corresponding to the average state sent by Alice:
$$\tau := \sum_k p_k |\alpha_k\rangle\langle\alpha_k|. \tag{1}$$
Note that for any finite constellation, this state faithfully describes the modulation scheme since the coherent states $|\alpha_k\rangle$ are linearly independent (this will no longer be the case in general if Alice sends mixed states, e.g. thermal states). An important parameter is the variance of the modulation. In this paper, we define the quadrature operators by $\hat{x} := \hat{a} + \hat{a}^\dagger$ and $\hat{p} := -i(\hat{a} - \hat{a}^\dagger)$, where $\hat{a}$ and $\hat{a}^\dagger$ (resp. $\hat{b}$, $\hat{b}^\dagger$) are the annihilation and creation operators on Alice's system (resp. Bob's system), and get the commutation relation $[\hat{x}, \hat{p}] = 2i$. The covariance matrix $\Gamma_\tau$ of the state τ is defined by
$$\Gamma_\tau := \begin{pmatrix} \langle \hat{x}^2 \rangle_\tau & \frac{1}{2}\langle \{\hat{x}, \hat{p}\} \rangle_\tau \\ \frac{1}{2}\langle \{\hat{x}, \hat{p}\} \rangle_\tau & \langle \hat{p}^2 \rangle_\tau \end{pmatrix}, \qquad \langle \cdot \rangle_\tau := \mathrm{tr}(\tau\, \cdot),$$
where we assumed without loss of generality that the first moment of the displacement operator vanishes (this can always be enforced by a suitable translation in phase-space).
We have for instance $\frac{1}{2}(\langle \hat{x}^2 \rangle_\tau + \langle \hat{p}^2 \rangle_\tau) = \mathrm{tr}(\tau(1 + 2\hat{a}^\dagger\hat{a})) = 1 + 2\langle n \rangle$ (the $\hat{a}^2$ and $\hat{a}^{\dagger 2}$ terms of $\hat{x}^2$ and $\hat{p}^2$ cancel in the sum), where the average photon number $\langle n \rangle$ in the modulation is defined as
$$\langle n \rangle := \mathrm{tr}(\tau\, \hat{a}^\dagger\hat{a}) = \sum_k p_k |\alpha_k|^2.$$
It is also customary to refer to $2\langle n \rangle$ as the modulation variance $V_A$ so that $\frac{1}{2}(\langle \hat{x}^2 \rangle_\tau + \langle \hat{p}^2 \rangle_\tau) = 1 + V_A$. There are two main modulation schemes usually discussed in the literature: the Gaussian modulation and the M-PSK modulation. In the case of a Gaussian modulation of variance $1 + 2\langle n \rangle$, the value of α is an arbitrary complex number chosen according to a Gaussian probability distribution, and the associated density matrix $\tau_G$ is a thermal state:
$$\tau_G = \frac{1}{1 + \langle n \rangle} \sum_{n=0}^{\infty} \left( \frac{\langle n \rangle}{1 + \langle n \rangle} \right)^n |n\rangle\langle n|,$$
where $|n\rangle := \frac{(\hat{a}^\dagger)^n}{\sqrt{n!}} |0\rangle$ is the Fock state with n photons. In the M-PSK modulation case, Alice chooses uniformly at random a coherent state from the set $\{|\alpha e^{2\pi i k/M}\rangle\}_{0 \le k \le M-1}$ where the modulation variance corresponds to $V_A = 2\alpha^2$. The corresponding mixture is
$$\tau_{M\text{-PSK}} = \frac{1}{M} \sum_{k=0}^{M-1} |\alpha e^{2\pi i k/M}\rangle\langle \alpha e^{2\pi i k/M}|.$$
Note that the case M = 4, also referred to as quadrature phase-shift keying (QPSK), has been widely studied in the context of CV QKD. The Gaussian and M-PSK modulation schemes are discussed in more detail in Sections 6 and 7, respectively.
In coherent optical communications, it is known that increasing the value of M beyond 10, say, is not beneficial and that it is more efficient to switch instead to a different modulation scheme altogether. One such example is quadrature amplitude modulation (QAM) where the constellation typically consists of M points distributed over a square grid (see Figure 1). It is typical to consider M to be a power of 4, and we will indeed consider 4-QAM (which corresponds to QPSK), 16-QAM, 64-QAM, 256-QAM and 1024-QAM in this paper. Given that our proof technique will work better when a modulation scheme is closer to the Gaussian modulation, it is crucial that the M points of the QAM are not chosen with a uniform probability distribution.
Rather, we will consider probabilistic constellation shaping [11,18] where each coordinate of the coherent state $|\alpha_k\rangle$ is chosen independently according to either a binomial or a Gaussian distribution (see Section 8 for details). More complex constellations are also possible.
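To make the two families of constellations concrete, the following sketch builds an M-PSK constellation and a square QAM grid with binomial weights on each coordinate, and evaluates the average photon number and the modulation variance. The function names, the grid spacing parameter and the choice of a Binomial(side - 1, 1/2) weight per coordinate are illustrative assumptions; the precise shaping used in the paper is specified in its Section 8, which is not reproduced here.

```python
import cmath
import math
from typing import List, Tuple

Constellation = Tuple[List[complex], List[float]]


def psk_constellation(m: int, alpha: float) -> Constellation:
    """Points alpha * exp(2 pi i k / M), k = 0..M-1, with uniform probabilities."""
    points = [alpha * cmath.exp(2j * math.pi * k / m) for k in range(m)]
    probs = [1.0 / m] * m
    return points, probs


def binomial_qam_constellation(side: int, spacing: float) -> Constellation:
    """side x side grid; each coordinate is weighted by Binomial(side - 1, 1/2).

    The binomial shaping and the spacing parameter are assumptions made for
    illustration only.
    """
    weights = [math.comb(side - 1, j) / 2 ** (side - 1) for j in range(side)]
    # Coordinates are centred on zero so that the first moment vanishes.
    coords = [spacing * (j - (side - 1) / 2) for j in range(side)]
    points, probs = [], []
    for x, wx in zip(coords, weights):
        for y, wy in zip(coords, weights):
            points.append(complex(x, y))
            probs.append(wx * wy)
    return points, probs


def mean_photon_number(points: List[complex], probs: List[float]) -> float:
    """<n> = sum_k p_k |alpha_k|^2."""
    return sum(p * abs(a) ** 2 for a, p in zip(points, probs))


def modulation_variance(points: List[complex], probs: List[float]) -> float:
    """V_A = 2 <n>, in units where [x, p] = 2i."""
    return 2.0 * mean_photon_number(points, probs)


if __name__ == "__main__":
    qpsk = psk_constellation(4, 0.5)
    qam64 = binomial_qam_constellation(8, 1.0)
    print("QPSK V_A:", modulation_variance(*qpsk))
    print("64-QAM <n>:", mean_photon_number(*qam64))
```

For the PSK case the result reduces to V_A = 2α², as stated in the text; for the binomial grid the variance of each coordinate is spacing² (side - 1)/4.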
The Prepare-and-Measure (PM) CV QKD protocol. Any QKD protocol consists of two main parts: a quantum part where Alice and Bob exchange quantum states and obtain correlated variables, and a classical post-processing procedure aiming at extracting two identical secret keys out of the correlated data. We have already described the first part. Alice and Bob repeat the following a large number of times: Alice chooses an index k with probability $p_k$ and sends the corresponding coherent state $|\alpha_k\rangle$ to Bob through an untrusted quantum channel; Bob measures each incoming state with heterodyne detection, obtaining a complex number β. At the end of this first phase, Alice and Bob both hold a string of complex numbers. The goal of the second phase of the protocol is to use classical post-processing to transform these two strings into identical secret keys. It requires four steps: (i) Bob discretizes his variables by choosing an appropriate binning of the complex plane; (ii) in the reconciliation step, he sends some side information to Alice via the classical authenticated channel in order to help her guess Bob's string (exploiting the side information together with her knowledge of the states she has sent); (iii) Alice and Bob perform parameter estimation in order to bound how much information was possibly obtained by a malicious eavesdropper; and (iv) they perform privacy amplification in order to obtain a shorter shared bit string completely unknown to the adversary. All these steps must be carefully analyzed for a full security proof, but since our goal is the asymptotic regime, we will mainly comment on the reconciliation procedure and the parameter estimation step in Section 10.

Entanglement-Based protocol and Devetak-Winter bound
In order to analyze the security of a PM protocol as defined in the previous section, the standard technique consists in defining an equivalent entanglement-based (EB) version of the protocol, which only differs from the practical protocol in Alice's lab. Since both protocols are indistinguishable from the perspective of Bob and the adversary, they share the same security.
The EB version of the protocol is as follows: Alice prepares a bipartite state $|\Phi\rangle_{AA'}$, which is a purification of τ, and measures the first mode A in a basis that projects the second mode A' onto the coherent states corresponding to the modulation scheme of the PM protocol. In this version, the second mode A' is sent through the quantum channel $\mathcal{N}_{A' \to B}$ (controlled by the adversary), and Bob obtains the output mode B. We denote by $\rho_{AB} = (\mathrm{id}_A \otimes \mathcal{N}_{A' \to B})(|\Phi\rangle\langle\Phi|_{AA'})$ the state shared by Alice and Bob after each use of the channel, where $\mathrm{id}_A$ stands for the identity channel acting on system A. In the present paper, we study so-called collective attacks in the asymptotic regime, and therefore assume that the channel is always the same (but unknown) during the protocol, which means that Alice and Bob share a large number of copies of the state $\rho_{AB}$. We note that collective attacks are usually optimal among all possible attacks in the asymptotic limit [36], and it therefore makes sense to consider these attacks here.
The well-known Devetak-Winter bound gives the achievable secret key rate K (per channel use) in this setup [6]:
$$K = I(X;Y) - \sup_{\mathcal{N}: A' \to B} \chi(Y;E), \tag{2}$$
where I(X;Y) is the mutual information between Alice and Bob's classical variables X and Y (which are complex variables in a protocol with heterodyne measurement, and real variables for homodyne measurement) and χ(Y;E) is the Holevo information between Y and the quantum register E of the adversary, with the supremum computed over all choices of channels $\mathcal{N}: A' \to B$ compatible with the statistics obtained by Alice and Bob during the parameter estimation phase of the PM protocol. The register E of the adversary is introduced via the isometric representation of the quantum channel, $U_{A' \to BE}$, which allows one to write a purification $\rho_{ABE}$ of $\rho_{AB}$, and the state obtained after Bob's measurement:
$$\rho_{ABE} = (\mathrm{id}_A \otimes U_{A' \to BE})(|\Phi\rangle\langle\Phi|_{AA'}), \qquad \rho_{AYE} = (\mathrm{id}_A \otimes \mathcal{M}_{B \to Y} \otimes \mathrm{id}_E)(\rho_{ABE}),$$
where the map $\mathcal{M}: B \to Y$ describes the (trusted) Gaussian measurement performed by Bob. In the case of a heterodyne measurement, it is given by
$$\mathcal{M}(\rho) = \frac{1}{\pi} \int_{\mathbb{C}} \langle\beta|\rho|\beta\rangle\, |\beta_{\mathrm{cl}}\rangle\langle\beta_{\mathrm{cl}}|\, d^2\beta,$$
where $\{|\beta_{\mathrm{cl}}\rangle\}$ is an infinite orthonormal family of states storing the value of the measurement outcome. The Holevo information χ(Y;E) is computed for the state $\rho_{AYE}$, and the supremum can also be computed over such states that are compatible with the statistics obtained in the parameter estimation step.
In the finite-size regime, it is not quite possible for Alice and Bob to perfectly extract all their mutual information, and it is customary to replace I(X;Y) by βI(X;Y) where the reconciliation efficiency β is a parameter that quantifies how much extra information Bob needs to send to Alice through the authenticated classical channel for her to correctly infer the value of Y. Modern techniques usually allow one to get β ≥ 0.95. In any case, the value of βI(X;Y) can be observed during a given protocol.
Bounding the value of $\sup_{\mathcal{N}: A' \to B} \chi(Y;E)$ is more complicated, however, since it involves an optimization over a family of infinite-dimensional quantum channels. A very useful tool in this setting is the extremality property of Gaussian states, which essentially asserts that the supremum of χ(Y;E) in Eqn. (2) is upper bounded by the value of χ(Y;E) computed for the Gaussian state $\rho^G_{AYE}$ with the same covariance matrix as $\rho_{AYE}$ [10,32]. In other words, it is bounded by a function that only depends on the covariance matrix of $\rho_{AYE}$, and even on the covariance matrix of $\rho_{AB}$ since the map $\mathcal{M}_{B \to Y}$ is fixed by the protocol and $\rho_{ABE}$ is an arbitrary purification of $\rho_{AB}$. The covariance matrix of $\rho_{AB}$ is defined as
$$\Gamma_{ij} := \frac{1}{2} \langle \{\hat{r}_i, \hat{r}_j\} \rangle_\rho, \qquad \hat{r} := (\hat{x}_A, \hat{p}_A, \hat{x}_B, \hat{p}_B),$$
where we assume again without loss of generality that the first moment of the displacement operator vanishes. Symmetry arguments (see e.g. Appendix D of Ref. [22]) show that Γ can be safely replaced by Γ' when computing the secret key rate, with
$$\Gamma' := \begin{pmatrix} V\, \mathbb{1}_2 & Z\, \sigma_Z \\ Z\, \sigma_Z & W\, \mathbb{1}_2 \end{pmatrix},$$
where the real numbers V, W, Z are given by
$$V := \frac{1}{2}(\langle \hat{x}_A^2 \rangle + \langle \hat{p}_A^2 \rangle), \qquad W := \frac{1}{2}(\langle \hat{x}_B^2 \rangle + \langle \hat{p}_B^2 \rangle), \qquad Z := \frac{1}{2}(\langle \hat{x}_A \hat{x}_B \rangle - \langle \hat{p}_A \hat{p}_B \rangle),$$
and $\sigma_Z$ is the Pauli matrix diag(1, −1). The Holevo information χ(Y;E) computed for the Gaussian state with covariance matrix Γ' is given by
$$\chi(Y;E) = g\!\left(\frac{\nu_1 - 1}{2}\right) + g\!\left(\frac{\nu_2 - 1}{2}\right) - g\!\left(\frac{\nu_3 - 1}{2}\right),$$
where $g(x) := (x+1)\log_2(x+1) - x\log_2(x)$, $\nu_1$ and $\nu_2$ are the symplectic eigenvalues of Γ' and $\nu_3$ depends on the choice of measurement setting (homodyne or heterodyne). The value of $\nu_3$ is given by $\nu_3 = V - \frac{Z^2}{1 + W}$ in the heterodyne case and by $\nu_3 = \sqrt{V\left(V - \frac{Z^2}{W}\right)}$ in the homodyne case [45].
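The Gaussian Holevo bound described above can be evaluated numerically in a few lines. The sketch below assumes the standard closed forms for a covariance matrix of the form [[V 1, Z σ_Z], [Z σ_Z, W 1]]: the symplectic eigenvalues are obtained from Δ = V² + W² − 2Z² and det Γ' = (VW − Z²)², and ν₃ is taken as V − Z²/(1 + W) for heterodyne detection or sqrt(V(V − Z²/W)) for homodyne detection. The function names are hypothetical.

```python
import math
from typing import Tuple


def g(x: float) -> float:
    """g(x) = (x + 1) log2(x + 1) - x log2(x), extended by g(x) = 0 for x <= 0."""
    if x <= 0.0:
        return 0.0
    return (x + 1.0) * math.log2(x + 1.0) - x * math.log2(x)


def symplectic_eigenvalues(v: float, w: float, z: float) -> Tuple[float, float]:
    """Symplectic eigenvalues (nu1 >= nu2) of [[V 1, Z sigma_Z], [Z sigma_Z, W 1]]."""
    delta = v * v + w * w - 2.0 * z * z          # det A + det B + 2 det C
    det = (v * w - z * z) ** 2                   # determinant of the 4x4 matrix
    disc = math.sqrt(max(delta * delta - 4.0 * det, 0.0))
    nu1 = math.sqrt((delta + disc) / 2.0)
    nu2 = math.sqrt(max((delta - disc) / 2.0, 0.0))
    return nu1, nu2


def holevo_information(v: float, w: float, z: float, heterodyne: bool = True) -> float:
    """Holevo information chi(Y;E) of the Gaussian state with parameters (V, W, Z)."""
    nu1, nu2 = symplectic_eigenvalues(v, w, z)
    if heterodyne:
        nu3 = v - z * z / (1.0 + w)
    else:
        nu3 = math.sqrt(v * (v - z * z / w))
    return g((nu1 - 1.0) / 2.0) + g((nu2 - 1.0) / 2.0) - g((nu3 - 1.0) / 2.0)


if __name__ == "__main__":
    # Pure two-mode squeezed vacuum (no channel): Eve holds no information.
    print(holevo_information(3.0, 3.0, math.sqrt(8.0)))
```

Because χ is decreasing in Z for fixed V and W, a lower bound on Z translates into an upper bound on Eve's information, which is why the rest of the analysis focuses on bounding Z from below.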
We note that both V and W correspond to the expectations of local observables, namely $1 + 2\hat{a}^\dagger\hat{a}$ and $1 + 2\hat{b}^\dagger\hat{b}$. In particular, V is simply a parameter of the protocol, which is independent of the quantum channel between Alice and Bob. It is customary in the literature to write it as $V = V_A + 1$, where $V_A$ stands for the modulation variance. In general, this parameter can be optimized so as to maximize the secret key rate in a given experiment. For protocols with a Gaussian modulation, it is known that the optimal value of $V_A$ becomes larger and larger as the reconciliation efficiency β gets closer and closer to 1. For discrete modulation schemes, such as the QPSK modulation, the optimal value of $V_A$ is much lower, and can even be significantly lower than the shot noise with current security proofs [12,27]. The expectation W is not fixed by the protocol, but can be measured locally by Bob who performs a heterodyne detection. The remaining quantity, Z := tr(ρ C) with $C := \hat{a}\hat{b} + \hat{a}^\dagger\hat{b}^\dagger$, will be the central object in the present work. If it could be measured directly in the protocol, then Alice and Bob would know the covariance matrix Γ' and immediately get a bound on Eve's information. In particular, in any EB protocol, it is sufficient for Alice and Bob to both perform coherent measurements (homodyne or heterodyne) to obtain the covariance matrix. The security of such protocols is therefore well understood. Unfortunately, these EB protocols are much less practical than PM protocols with a discrete modulation of coherent states, since they require the preparation of entangled states. For PM protocols, the state $\rho_{AB}$ does not actually exist in the lab. It is simply a convenient mathematical object, allowing us to discuss the security of the protocol. Consequently, it is in general impossible to infer what value Z Alice and Bob would obtain if they really had access to $\rho_{AB}$. It is therefore necessary to find some indirect approach in order to get some bounds on Z = tr(ρ C).
Protocols with a Gaussian modulation (of Gaussian states) are an exception: in this case, one can easily compute this covariance matrix, and in particular the value of Z = tr(ρ C) from the data observed in the PM protocol [14]. The reason for this is that the measurement performed by Alice in the EB protocol is a Gaussian measurement, and therefore the observed statistics are sufficient to infer the covariance matrix. This is no longer the case for schemes with a discrete modulation: in that case, Alice performs a non-Gaussian measurement on the mode A of ρ AB and this is in general insufficient to deduce the value of tr(ρ C), except by restricting the class of considered attacks [25,26]. The main result of Ref. [12] was to show that even if the exact value of tr(ρ C) cannot be recovered, it is still possible to obtain some bounds on this quantity by expressing it as the objective function of a semidefinite program.
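As a short consistency check, the operator form of the covariance term can be recovered from the quadrature definitions $\hat{x} = \hat{a} + \hat{a}^\dagger$, $\hat{p} = -i(\hat{a} - \hat{a}^\dagger)$. The derivation below assumes the usual symmetrized definition $Z = \frac{1}{2}(\langle \hat{x}_A\hat{x}_B\rangle - \langle \hat{p}_A\hat{p}_B\rangle)$; under that assumption the objective operator is $C = \hat{a}\hat{b} + \hat{a}^\dagger\hat{b}^\dagger$.

```latex
\begin{align*}
\hat{x}_A \hat{x}_B &= (\hat{a} + \hat{a}^\dagger)(\hat{b} + \hat{b}^\dagger)
  = \hat{a}\hat{b} + \hat{a}\hat{b}^\dagger + \hat{a}^\dagger\hat{b} + \hat{a}^\dagger\hat{b}^\dagger, \\
\hat{p}_A \hat{p}_B &= (-i)^2 (\hat{a} - \hat{a}^\dagger)(\hat{b} - \hat{b}^\dagger)
  = -\hat{a}\hat{b} + \hat{a}\hat{b}^\dagger + \hat{a}^\dagger\hat{b} - \hat{a}^\dagger\hat{b}^\dagger, \\
\tfrac{1}{2}\left(\hat{x}_A \hat{x}_B - \hat{p}_A \hat{p}_B\right)
  &= \hat{a}\hat{b} + \hat{a}^\dagger\hat{b}^\dagger = C,
\qquad\text{so that}\qquad Z = \mathrm{tr}(\rho\, C).
\end{align*}
```

The cross terms $\hat{a}\hat{b}^\dagger$ and $\hat{a}^\dagger\hat{b}$ cancel, which is why only the two-mode-squeezing-like correlations enter Z.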

Definition of the SDP and explicit solution
Our first goal is to specify the SDP we want to solve. As mentioned, the objective function is simply tr(ρ C) where $\rho_{AB}$ is the state shared by Alice and Bob, before they measure it, in the EB version of the protocol. In order to get the tightest possible bounds on the value of tr(ρ C), we need to impose some constraints on the possible states $\rho_{AB}$ that should be considered. These constraints have two origins: a first constraint merely says that $\rho_{AB}$ is obtained by applying some channel $\mathcal{N}_{A' \to B}$ to $|\Phi\rangle_{AA'}$; the other constraints come from observations made during the parameter estimation phase of the PM protocol.
The first constraint turns out to be
$$\mathrm{tr}_B(\rho_{AB}) = \bar{\tau}, \tag{5}$$
which results from the fact that $\mathrm{tr}_{A'}(|\Phi\rangle\langle\Phi|_{AA'}) = \bar{\tau}$, where we define $\bar{\tau}$ to be the complex conjugate of τ in the Fock basis. The choice of $\bar{\tau}$ may appear arbitrary at the moment, but will become clearer once we explain how to choose the purification $|\Phi\rangle$. For the remaining constraints, we recall that Alice sends coherent states $|\alpha_k\rangle$ to Bob, and that they can gather information about the statistics corresponding to each such coherent state. Obviously, these statistics will need to be estimated properly during the protocol and one should endeavor to reduce the number of independent quantities that need to be estimated, since this number will greatly impact the key rate when taking finite-size effects into account. The results that are readily available in the PM protocol are the first and second moments of the state received by Bob when Alice has sent $|\alpha_k\rangle$:
$$\beta_k := \mathrm{tr}(\rho_k \hat{b}), \qquad \mathrm{tr}(\rho_k \hat{b}^\dagger\hat{b}),$$
where $\rho_k := \mathcal{N}(|\alpha_k\rangle\langle\alpha_k|)$, as well as the second moment of Bob's state
$$n_B := \sum_k p_k\, \mathrm{tr}(\rho_k \hat{b}^\dagger\hat{b}).$$
Indeed, let us assume that a random sample of the measurement results of Bob when Alice sent the state $|\alpha_k\rangle$ are $\beta_{k,1}, \ldots, \beta_{k,N}$, then we expect that
$$\frac{1}{N} \sum_{i=1}^{N} \beta_{k,i} \xrightarrow[N \to \infty]{} \beta_k,$$
and similarly for the second moments. Recall that we consider collective attacks here, which means that the state $\rho_k$ is always the same (but unknown). Bounding the speed of convergence of these empirical values is not completely trivial since we do not want to assume anything about the distribution of the $\beta_{k,i}$ but techniques similar to those developed in Ref. [22] can probably solve this issue. In any case, we do not worry about this specific difficulty here since we focus on asymptotic results and therefore assume that Alice and Bob are able to perform the parameter estimation step. As mentioned, we ultimately wish to aggregate such values and only keep a few numbers, much fewer than M. Let us first relate these values to the bipartite state $\rho_{AB}$. Without loss of generality, let us write
$$|\Phi\rangle_{AA'} = \sum_k \sqrt{p_k}\, |\psi_k\rangle_A |\alpha_k\rangle_{A'},$$
where the $\{|\psi_k\rangle\}$ form an orthonormal basis (that we will carefully choose later).
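With this notation, we obtain
$$p_k\, \rho_k = \mathrm{tr}_A\big( (|\psi_k\rangle\langle\psi_k| \otimes \mathbb{1}_B)\, \rho_{AB} \big).$$
The second moment constraint is the easier one to deal with: we simply consider the operator $\mathbb{1}_A \otimes \hat{b}^\dagger\hat{b}$ and impose
$$\mathrm{tr}\big( \rho\, (\mathbb{1}_A \otimes \hat{b}^\dagger\hat{b}) \big) = n_B, \tag{6}$$
where the right-hand side can be measured in the protocol. In order to define the first moment constraints, we need to introduce an operator that will play a central role in our analysis:
$$\hat{a}_\tau := \tau^{1/2}\, \hat{a}\, \tau^{-1/2}.$$
We will rely on two first-moment constraints:
$$\mathrm{tr}(\rho\, C_1) = 2 c_1, \qquad \mathrm{tr}(\rho\, C_2) = 2 c_2, \tag{8}$$
with operators $C_1$ and $C_2$ defined by
$$C_1 := \sum_k \overline{\langle\alpha_k|\hat{a}_\tau|\alpha_k\rangle}\, |\psi_k\rangle\langle\psi_k| \otimes \hat{b} + \mathrm{h.c.}, \qquad C_2 := \sum_k \bar{\alpha}_k\, |\psi_k\rangle\langle\psi_k| \otimes \hat{b} + \mathrm{h.c.}$$
The correlation coefficients $c_1$ and $c_2$ can be estimated experimentally by
$$c_1 = \mathrm{Re} \sum_k p_k\, \overline{\langle\alpha_k|\hat{a}_\tau|\alpha_k\rangle}\, \beta_k, \qquad c_2 = \mathrm{Re} \sum_k p_k\, \bar{\alpha}_k\, \beta_k.$$
Here, h.c. stands for Hermitian conjugate, and we use $\bar{\cdot}$ to denote the complex conjugation (with respect to the Fock basis). If we introduce the vectors $\alpha := (\alpha_k)_k$, $\alpha_\tau := (\langle\alpha_k|\hat{a}_\tau|\alpha_k\rangle)_k$ and $\beta = (\beta_k)_k$, then the values of $c_1$ and $c_2$ are simply the following inner products:
$$c_1 = \mathrm{Re}\, (\alpha_\tau|\beta), \qquad c_2 = \mathrm{Re}\, (\alpha|\beta),$$
where we define the weighted inner product $(x|y) := \sum_k p_k\, \bar{x}_k\, y_k$. Of course, the specific form of the operator $C_1$ may look somewhat mysterious at this point since it is not clear why the operator $\hat{a}_\tau = \tau^{1/2}\hat{a}\tau^{-1/2}$ should play any role at all in the problem, and why $c_1$ should be a meaningful quantity to estimate during the protocol. The story goes in the other direction: the constraints that should be monitored during the PM protocol are clearly functions of the $\beta_k$'s, since they are the only observable values in the PM protocol. The simplest such constraints are linear functions in the moments of $\beta_k$ and since our proofs will ultimately rely on the extremality properties of the Gaussian states, it makes sense to focus on the first and second moments. The relevant second moment is the variance of $\beta_k$, but there is no obvious candidate for the first moment conditions. Our strategy was therefore to optimize the first moment conditions by leaving them as general as possible and only later pick the relevant ones. This is exactly how we arrived at the definitions of $C_1$ and $C_2$.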
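The estimation of these aggregated quantities from raw data can be sketched as follows. The code assumes that Bob's outcomes are grouped by the index k of the state Alice sent, that c₁ and c₂ are the real parts of the weighted inner products (α_τ|β) and (α|β), and that heterodyne outcomes carry one extra unit of vacuum noise in their second moment (E|β|² = ⟨b†b⟩ + 1, as for samples of the Husimi Q function). That last offset is an assumption about normalization, exposed as the hypothetical parameter `vacuum_offset`; the vector `alpha_tau` of values ⟨α_k|a_τ|α_k⟩ must be supplied by the caller, since computing it requires the operator a_τ.

```python
from typing import Optional, Sequence, Tuple


def estimate_parameters(
    alphas: Sequence[complex],
    probs: Sequence[float],
    samples: Sequence[Sequence[complex]],
    alpha_tau: Optional[Sequence[complex]] = None,
    vacuum_offset: float = 1.0,
) -> Tuple[Optional[float], float, float]:
    """Return empirical estimates (c1, c2, n_B).

    samples[k] holds Bob's complex outcomes for the rounds where Alice sent
    the k-th state. c1 is None when alpha_tau is not provided.
    """
    # Empirical first moments beta_k, one per constellation point.
    betas = [sum(s) / len(s) for s in samples]

    def weighted_inner(x: Sequence[complex], y: Sequence[complex]) -> complex:
        # (x|y) = sum_k p_k conj(x_k) y_k
        return sum(p * xk.conjugate() * yk for p, xk, yk in zip(probs, x, y))

    c2 = weighted_inner(alphas, betas).real
    c1 = weighted_inner(alpha_tau, betas).real if alpha_tau is not None else None

    # Average second moment of the outcomes, minus the assumed vacuum unit.
    second_moment = sum(
        p * sum(abs(b) ** 2 for b in s) / len(s) for p, s in zip(probs, samples)
    )
    n_b = second_moment - vacuum_offset
    return c1, c2, n_b
```

Only three real numbers survive the aggregation, independently of the constellation size M, which is what keeps the parameter estimation step manageable.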
The constraints of Eqn. (5), (6) and (8) are the only ones we will impose in addition to ρ 0. Since the secret key rate is minimized when the value of Z = tr(ρ C) is minimal 11 , we finally state our main SDP: Our main technical contribution is to provide the following bounds for the interval of possible values for tr(ρ C) under these constraints: where we recall that n = k p k |α k | 2 is the average photon number in the modulation and we define the quantity The Cauchy-Schwarz inequality, |(α|β)| 2 ≤ (α|α)(β|β), implies that the term n B − of the interval in the covariance matrix Γ and computing the associated Holevo bound yields an analytical lower bound on the asymptotic secret key rate of the CV QKD protocol 12 .
We note that an important feature of Z * is that it only involves 3 quantities that need to be determined experimentally. In particular, there is no need for the precise knowledge of all the β k , which would make any finite-size analysis very challenging. At the same time, c 1 is an additional quantity that was not present in previous works, for instance in the definition of the SDP in Ref. [12]. While this difference does not appear in simulations of a Gaussian quantum channel since the ratio between c 1 and c 2 is fixed in that case, it does play a role in a real experiment, and will also impact the finite-size secret key rate since an additional parameter needs to be estimated.
As we discuss in more detail in Section 6, a simple calculation shows that $a_{\tau_G} = \sqrt{\frac{1+\langle n\rangle}{\langle n\rangle}}\,\hat{a}$ and therefore $w = 0$ in the Gaussian case, recovering the well-known result that the covariance term is completely determined, and hence does not depend on the excess noise, for a Gaussian modulation. In particular, there are only two independent experimental quantities to monitor in that case, $c_1$ and $n_B$.
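The rescaling property can be checked numerically in a truncated Fock space. The following is a minimal sketch under stated assumptions: the truncation dimension `dim`, the value `mean_n` and all helper names are chosen for illustration, and the thermal state is the standard one with eigenvalues $\langle n\rangle^k/(1+\langle n\rangle)^{k+1}$ in the Fock basis.

```python
import numpy as np

def annihilation(dim):
    """Annihilation operator restricted to the Fock states |0>, ..., |dim-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def thermal_eigenvalues(mean_n, dim):
    """Fock-basis eigenvalues n^k / (1+n)^(k+1) of a thermal state (truncated)."""
    k = np.arange(dim)
    return mean_n**k / (1.0 + mean_n) ** (k + 1)

def a_tau_thermal(mean_n, dim):
    """tau^{1/2} a tau^{-1/2} for a thermal tau, which is diagonal in the Fock basis."""
    lam = thermal_eigenvalues(mean_n, dim)
    return np.diag(np.sqrt(lam)) @ annihilation(dim) @ np.diag(1.0 / np.sqrt(lam))

dim, mean_n = 30, 0.7  # illustrative values, not taken from the paper
scale = np.sqrt((1.0 + mean_n) / mean_n)
deviation = np.max(np.abs(a_tau_thermal(mean_n, dim) - scale * annihilation(dim)))
print(deviation)  # expected to be at the level of floating-point rounding
```

Since the conjugated operator is a multiple of the annihilation operator, every coherent state is an eigenstate of it, which is the reason $w$ vanishes in this case.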
Expected bound for a Gaussian quantum channel. The bound of Eqn. (11) can be readily used in any experimental implementation of the protocol, but it is also useful to estimate it for a typical experimental setup. In particular, since most experiments are implemented in fiber, it is typical to model the expected quantum channel between Alice and Bob as a phase-insensitive Gaussian channel characterized by a transmittance $T$ and an excess noise $\xi$. This means that if the input state is a coherent state $|\alpha\rangle$, then the output state is a displaced thermal state centered at $\sqrt{T}\alpha$ with a variance given by $1 + T\xi$. In other words, the random variable $\beta_k$ can be modeled as $\beta_k = \sqrt{T}\,\alpha_k + \gamma_k$, where $\gamma_k$ is a Gaussian random variable corresponding to the shot noise (of variance 1 with our choice of units) and to the excess noise (of variance $T\xi$). In this case, one can readily compute the expected values of $c_1$, $c_2$ and $n_B$ (see Section 5 for details): $c_1 = \sqrt{T}\,\mathrm{tr}(\tau^{1/2} a \tau^{1/2} a^\dagger)$, $c_2 = \sqrt{T}\,\langle n\rangle$ and $n_B = T\langle n\rangle + \frac{T\xi}{2}$, which yields a minimum value $Z^*(T, \xi) = \min \mathrm{tr}(\rho C)$ equal to $Z^*(T,\xi) = 2\sqrt{T}\,\mathrm{tr}(\tau^{1/2} a \tau^{1/2} a^\dagger) - \sqrt{2T\xi w}$ (Eqn. (14)). The linear dependence in $\sqrt{T}$ is expected, and we note that the correction term, scaling like $\sqrt{\xi}$, heavily impacts the value of the covariance for nonzero excess noise, unless $w$ is very small. As we will later see, while $w$ is rather large and leads to rather poor performance in the case of a QPSK modulation with only four coherent states, this is no longer the case for larger constellations, for instance with a 64-QAM of 64 coherent states. Eqn. (14) is generalized to the case of a modulation of arbitrary states in Eqns (??) and (??).
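To get a feeling for these quantities, one can evaluate them numerically for a small constellation. The sketch below is an illustration under explicit assumptions: it works in a truncated Fock space, it takes $w = \sum_k p_k\,(\langle\alpha_k|a_\tau^\dagger a_\tau|\alpha_k\rangle - |\langle\alpha_k|a_\tau|\alpha_k\rangle|^2)$ with $a_\tau = \tau^{1/2} a \tau^{-1/2}$ (pseudo-inverse on the support of $\tau$), and it assumes the closed form $Z^*(T,\xi) = 2\sqrt{T}\,\mathrm{tr}(\tau^{1/2} a \tau^{1/2} a^\dagger) - \sqrt{2T\xi w}$. The function names, the QPSK amplitude and the channel parameters are hypothetical.

```python
import numpy as np

def coherent(alpha, dim):
    """Fock-basis amplitudes of the coherent state |alpha>, truncated to dim levels."""
    c = np.zeros(dim, dtype=complex)
    c[0] = np.exp(-abs(alpha) ** 2 / 2)
    for n in range(1, dim):
        c[n] = c[n - 1] * alpha / np.sqrt(n)
    return c

def modulation_quantities(alphas, probs, dim=40, tol=1e-10):
    """Return (s, w, n_mean) with s = tr(tau^{1/2} a tau^{1/2} a^dagger)."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)
    states = [coherent(al, dim) for al in alphas]
    tau = sum(p * np.outer(v, v.conj()) for p, v in zip(probs, states))
    lam, U = np.linalg.eigh(tau)
    keep = lam > tol                                # support of tau (rank M)
    U, lam = U[:, keep], lam[keep]
    sqrt_tau = (U * np.sqrt(lam)) @ U.conj().T
    inv_sqrt_tau = (U / np.sqrt(lam)) @ U.conj().T  # pseudo-inverse square root
    a_tau = sqrt_tau @ a @ inv_sqrt_tau
    s = np.trace(sqrt_tau @ a @ sqrt_tau @ a.conj().T).real
    w = 0.0
    for p, v in zip(probs, states):
        image = a_tau @ v
        w += p * (np.vdot(image, image).real - abs(np.vdot(v, image)) ** 2)
    n_mean = sum(p * abs(al) ** 2 for p, al in zip(probs, alphas))
    return s, w, n_mean

def z_star(T, xi, s, w):
    """Assumed closed form 2 sqrt(T) s - sqrt(2 T xi w) for a Gaussian channel."""
    return 2 * np.sqrt(T) * s - np.sqrt(2 * T * xi * max(w, 0.0))

amp = 0.6  # hypothetical QPSK amplitude
qpsk = [amp * np.exp(1j * (np.pi / 4 + k * np.pi / 2)) for k in range(4)]
s, w, n_mean = modulation_quantities(qpsk, [0.25] * 4)
print(s, w, z_star(0.1, 0.02, s, w))
```

By the Cauchy-Schwarz inequality, `s` cannot exceed the Gaussian value $\sqrt{\langle n\rangle^2 + \langle n\rangle}$, so comparing the two gives a quick indication of how far a constellation is from the Gaussian case at zero excess noise.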

Analytical study of the SDP
In this section, we detail how to obtain a lower bound on the value of the primal SDP of Eqn. (10). In fact, although it is primarily the minimum of the objective function that is relevant for CV QKD, we can more generally aim to find the whole interval of values for tr(ρ C) compatible with the constraints. We start by explaining how to choose a convenient purification of τ and how to model Alice's measurement in the entanglement-based version of the protocol and then proceed to obtain our main result.

Purification of τ
Before proceeding with the change of variables, let us discuss the choice of the purification |Φ for the modulation state τ . We choose By writing the spectral decomposition of τ : we immediately obtain where |φ k is obtained by conjugating the coefficients of |φ k in the Fock basis. Note that we can also write 13 |Φ = (τ 1/2 ⊗ 1) ∞ n=0 |n |n . Consideringτ −1/2 to be the square-root of the Moore-Penrose pseudo-inverse ofτ , equal to the inverse ofτ on its support and to zero elsewhere (recall thatτ = M k=1 p k |ᾱ k ᾱ k | is an operator of rank M since any finite set of coherent states forms an independent family), we have that where Π = M k=1 |φ k φ k | is the orthogonal projector onto the M -dimensional subspace spanned by the (conjugated) coherent states |ᾱ k of the modulation (equivalently, Π is the projector onto the support ofτ ). Note indeed that the |φ k (as well as the |φ k ) are orthogonal since they appear in the spectral decomposition of τ . This means that (τ −1/2 ⊗ 1)|Φ is an M -dimensional maximally entangled state. We define the state |ψ k by 14 Note that From this, we conclude that the family {|ψ k } forms an orthonormal basis for the relevant subspace, and moreover, we obtain 15 An interpretation of the states |ψ k is that they define the projective measurement that Alice should perform in the entanglement-based version of the protocol in order to recover the Prepare-and-Measure protocol: if Alice measures her state and obtains the result indexed by k, then the second mode of |Φ , the one which is sent through the quantum channel to Bob, collapses to |α k .
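The orthonormality of the family $\{|\psi_k\rangle\}$ can be verified numerically. The sketch below assumes the expression $|\psi_k\rangle = \sqrt{p_k}\,\bar\tau^{-1/2}|\bar\alpha_k\rangle$, which is the form used later in the derivation, and works in a truncated Fock space; the helper names and the QPSK amplitude are illustrative assumptions.

```python
import numpy as np

def coherent(alpha, dim):
    """Fock-basis amplitudes of the coherent state |alpha>, truncated to dim levels."""
    c = np.zeros(dim, dtype=complex)
    c[0] = np.exp(-abs(alpha) ** 2 / 2)
    for n in range(1, dim):
        c[n] = c[n - 1] * alpha / np.sqrt(n)
    return c

def alice_measurement_states(alphas, probs, dim=40, tol=1e-10):
    """Columns are |psi_k> = sqrt(p_k) taubar^{-1/2} |conj(alpha_k)>."""
    probs = np.asarray(probs, dtype=float)
    # Column k holds the conjugated coherent state |conj(alpha_k)>.
    conj_states = np.array([coherent(np.conj(al), dim) for al in alphas]).T
    tau_bar = (conj_states * probs) @ conj_states.conj().T
    lam, U = np.linalg.eigh(tau_bar)
    keep = lam > tol  # restrict to the support of taubar
    inv_sqrt = (U[:, keep] / np.sqrt(lam[keep])) @ U[:, keep].conj().T
    return inv_sqrt @ (conj_states * np.sqrt(probs))

amp = 0.8  # hypothetical QPSK amplitude
qpsk = [amp * np.exp(1j * (np.pi / 4 + k * np.pi / 2)) for k in range(4)]
psi = alice_measurement_states(qpsk, [0.25] * 4)
gram = psi.conj().T @ psi
print(np.round(np.abs(gram), 6))  # close to the 4 x 4 identity matrix
```

The Gram matrix is the identity because the $M$ coherent states are linearly independent, so the vectors $\sqrt{p_k}|\bar\alpha_k\rangle$ span exactly the support of $\bar\tau$.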

The Sum-Of-Squares
Now that we have defined the states |ψ k , we are ready to analyze the SDP of Eqn. (10), which we recall here for convenience: In order to get explicit bounds on tr(ρ C) for feasible points of this program, we exploit a standard technique called sum-of-squares. It consists in exhibiting some clever nonnegative operator (namely KK † below) such that we can bound the value of tr ρ(C − KK † ) from the constraints of the program. In that case, we immediately get tr(ρ C) = tr(ρ(C − KK † )) + tr(ρKK † ) ≥ tr(ρ(C − KK † )), where we used that tr(ρKK † ) ≥ 0. Finding an operator K that will give a good bound on the value of the SDP is nontrivial, and the problem is even more complicated here because the relevant operators live in an infinite-dimensional Hilbert space. In a previous version of this manuscript (Ref. [5]), we attacked this problem by first performing a change of variables consisting in displacing Bob's system by −tα k (for an optimized value of t) when the state prepared by Alice is |α k . The advantage of this procedure was that the new state held by Bob has a very low average photon number and is therefore close to the vacuum state (and equal to it when there is no excess noise). It was then possible to guess what 15 To see this, we can simply compute the overlap between this state and the definition (τ 1/2 ⊗ 1) n |n |n : where we used that α k | n |n |n = |ᾱ k andτ −1/2τ 1/2 = Π. would be a good parameterized sum-of-squares. In the present version of the manuscript, we bypass this change-of-variable altogether and directly define the relevant operators: where the scalars t, {y k } k , x and z will be optimized later. The proof ends up being much shorter, involving fewer algebraic operations, but may seem a bit magical. From KK † 0, we infer that tr(ρKK † ) ≥ 0. Expanding this expression, we find Let us consider each of these four terms individually and take their expectation with respect to the state ρ.

The first term
is a quadratic form in x, and is minimal for the choice We then get where we exploited in the second line the fact that the operators A and P all act on the first subsystem and tr B (ρ) =τ . Recalling the definition of the operator aτ :=τ 1/2 aτ −1/2 , the first term becomes tr(τ aΠa † ) = tr(τ a † τ aτ ).
We will optimize the choice of P in order to maximize the fraction Re(tr(τ AP )) 2 tr(τP † P ) . By definition of P , we have: We write y k = x k e iθ k with x k ≥ 0 and choose θ k so that y k ψ k |τ a|ψ k = x k | ψ k |τ a|ψ k | is nonnegative. We obtain where we exploited that |ψ k = √ p kτ −1/2 |ᾱ k in the second equality. We finally obtain for our choice of x and P that where we exploited the invariance of the expression under complex conjugation. This equals z 2 w with w defined in Eqn. (12).
2. We turn to the second term of Eqn. (18). By definition, and therefore In the second line, c.c. stands for complex conjugate. One can simplify the second term further and write it as a function of τ . From This expression is real since it equals the trace of the Hermitian matrixτ 1/4 aτ 1/2 a †τ 1/4 . In particular, it is invariant under complex conjugation, and we finally get the following expression for the second term of Eqn. (18): 3. The third term of Eqn. (18) can be computed directly: where we introduced the Kraus operators {E r } r of the channel N : A → B, which satisfy r E † r E r = 1 in the second line and wrote the bipartite state ρ as With our choice of {y k } k , this simplifies to We also observe that tr(τ AP ) = tr(τ P † P ), which implies that our optimized value of x equals 1. Recalling that ψ k |τ a|ψ k = p k ᾱ k |aτ |ᾱ k , this gives where we exploited the constraint tr(ρ C 1 ) = 2c 1 in the last equality.

The Gaussian modulation
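In this section, we show that the formula from Eqn. (14) gives the standard value for a Gaussian modulation [10]. Let us consider a modulation such that $\tau_G$ has $\langle n\rangle$ photons on average: $\tau_G = \frac{1}{1+\langle n\rangle}\sum_{k=0}^{\infty}\left(\frac{\langle n\rangle}{1+\langle n\rangle}\right)^{k} |k\rangle\langle k|$. The operator $a_{\tau_G} = \tau_G^{1/2}\, a\, \tau_G^{-1/2}$ is then $a_{\tau_G} = \sqrt{\frac{1+\langle n\rangle}{\langle n\rangle}}\, a$, and we observe that it is simply a rescaling of the original annihilation operator. In particular, coherent states are eigenstates of $a_{\tau_G}$ and we obtain $\langle\alpha|a_{\tau_G}^\dagger a_{\tau_G}|\alpha\rangle - |\langle\alpha|a_{\tau_G}|\alpha\rangle|^2 = 0$ for every coherent state $|\alpha\rangle$, which shows that $w$ vanishes for a Gaussian modulation. This shows that $\mathrm{tr}(\rho C) = 2c_1$ with $c_1 = \mathrm{Re}(\alpha_\tau|\beta) = \sqrt{1 + \frac{1}{\langle n\rangle}}\,\mathrm{Re}(\alpha|\beta)$. In particular, if the transmittance of the channel is $T$, meaning that $\beta = \sqrt{T}\alpha$, we get $\mathrm{Re}(\alpha|\beta) = \sqrt{T}\langle n\rangle$ and recover the standard value for a Gaussian modulation, $\mathrm{tr}(\rho C) = 2\sqrt{T}\sqrt{\langle n\rangle^2 + \langle n\rangle}$.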

Interpretation of w.
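What is remarkable in the case of a Gaussian modulation is that the quantity $w$ vanishes. Note that $w$ is the expectation, over the choice of $k$, of $\langle\alpha_k|\hat a_\tau^\dagger \hat a_\tau|\alpha_k\rangle - |\langle\alpha_k|\hat a_\tau|\alpha_k\rangle|^2$, and it vanishes here because each such term vanishes. This results from the fact that any coherent state $|\alpha\rangle$ is an eigenstate of the operator $\hat a_\tau$, which is simply a rescaled version of the annihilation operator in the case of a Gaussian modulation. For other modulation schemes, the operator $\hat a_\tau$ will be slightly different and therefore $|\alpha_k\rangle$ will in general no longer be an eigenstate. Let us write without loss of generality $\hat a_\tau|\alpha_k\rangle = u_k|\alpha_k\rangle + v_k|\alpha_k^\perp\rangle$, where $|\alpha_k^\perp\rangle$ is a normalized state orthogonal to $|\alpha_k\rangle$ and $u_k$, $v_k$ are complex numbers. We get $w = \sum_k p_k |v_k|^2 = \sum_k p_k \langle\alpha_k|\hat a_\tau^\dagger\, \Pi_k^\perp\, \hat a_\tau|\alpha_k\rangle$, where $\Pi_k^\perp = 1 - |\alpha_k\rangle\langle\alpha_k|$ is the projector onto the subspace orthogonal to $|\alpha_k\rangle$. In other words, $w$ quantifies how much weight from a random input state is mapped by $\hat a_\tau$ to an orthogonal subspace.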

The M -PSK modulation
The goal of this section is to provide an explicit expression for the value of Z * of Eqn. (14) corresponding to the case of a lossy and noisy Gaussian channel: The state τ takes the following form for an M -PSK modulation consisting of the states |αe ikθ for θ = 2π/M and α > 0: This expression for ν k involves an unnecessary infinite sum and can be simplified. Let us introduce µ j which is obtained by applying a discrete Fourier transform where we used that e ijnθ = e ij(n modM )θ . Applying an inverse Fourier transform gives: We now wish to compute tr(τ 1/2 aτ 1/2 a † ). It is straightforward to check that: where indices are taken modulo M . This gives where the last equality results from the orthogonality of the {|φ k } family. The operator a τ = τ 1/2 aτ −1/2 takes a simple form: We can finally compute w: Putting these results together, we obtain the following value for Z * (T, ξ) for a general M -PSK modulation: We compare in Fig. 2 our analytical bound with the numerical bound obtained in Ref. [12]. We observe that they match up to numerical precision, except in the regime of very low-loss and large excess noise. While this regime is not very relevant for experiments, it would still be interesting to understand how to improve our numerical bound in that case. The question is whether there exists a better ansatz than that of Eqn. (??) more suited to this specific regime.
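The spectrum of $\tau$ for an $M$-PSK modulation can be cross-checked numerically. The sketch below assumes that the eigenvalues take the form $\nu_k = e^{-\alpha^2}\sum_{n\ge 0} \alpha^{2(k+nM)}/(k+nM)!$ for $k = 0, \ldots, M-1$, which follows from grouping the Fock states by photon number modulo $M$, and compares them with a direct diagonalization in a truncated Fock space. Function names, the truncation and the parameter values are illustrative assumptions.

```python
import numpy as np

def psk_eigenvalues(alpha, M, terms=60):
    """nu_k = exp(-alpha^2) * sum_n alpha^(2(k+nM)) / (k+nM)!, for k = 0..M-1."""
    nu = np.zeros(M)
    term = np.exp(-alpha**2)  # Poisson weight of the Fock state |0>
    for n in range(terms):
        nu[n % M] += term     # photon number n contributes to the class n mod M
        term *= alpha**2 / (n + 1)
    return nu

def psk_state(alpha, M, dim=60):
    """Average state of the M-PSK modulation in a truncated Fock space."""
    tau = np.zeros((dim, dim), dtype=complex)
    for k in range(M):
        amp = alpha * np.exp(2j * np.pi * k / M)
        c = np.zeros(dim, dtype=complex)
        c[0] = np.exp(-alpha**2 / 2)
        for n in range(1, dim):
            c[n] = c[n - 1] * amp / np.sqrt(n)
        tau += np.outer(c, c.conj()) / M
    return tau

alpha, M = 0.9, 4  # illustrative values
analytic = np.sort(psk_eigenvalues(alpha, M))
numeric = np.sort(np.linalg.eigvalsh(psk_state(alpha, M)))[-M:]
print(analytic, numeric)
```

The remaining eigenvalues of the truncated matrix are numerically zero, in agreement with the fact that $\tau$ has rank $M$.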
As we will see in Section 11, the performance of the M -PSK protocols when using the above formula is essentially optimal for M = 4. In fact, the increase in performance when going to M = 5 is very small and M = 6 already reaches the asymptotic limit M → ∞. Of course, it is quite possible that this is only an artefact of our reliance on the extremality of Gaussian states and that the approach of [27] may show that larger values of M are indeed useful.

General constellations
The conclusion of the previous sections is that the bound we obtain for the SDP is indeed tight in the two extreme cases where the constellation is either very small (as in $M$-PSK) or infinitely large (as in the Gaussian case). For constellations that fall in between, such as the general QAM that we will discuss now, it is not possible to compare our results to any numerical data (since none is available), but it is tempting to conjecture that our bound will likely be close to optimal. The main lesson one can draw from the formula obtained in Eqn. (14) for $Z$ is that the key rate will increase when the modulation scheme gets closer to a Gaussian distribution, and this is mainly quantified by the value of $w$. There exist many choices of constellations that can be used to approximate a Gaussian distribution. For instance, the Gaussian quadrature rule is designed to match the first moments of the Gaussian distribution and works well for large constellations. The binomial (or random walk) distribution works much better for small constellations [21,46] and provides a natural candidate for CV QKD applications. The normalized random walk distribution contains $m$ points for each quadrature, which are equally spaced between $-\sqrt{m-1}$ and $\sqrt{m-1}$, with associated probabilities corresponding to the binomial distribution. We choose a variance per coordinate equal to $\alpha^2/2$, which translates into $\mathrm{tr}(\tau \hat x^2) = \mathrm{tr}(\tau \hat p^2) = 2\alpha^2 = V_A$ with our convention that $[\hat x, \hat p] = 2i$. The $M = m^2$ coherent states $|\alpha_{k,\ell}\rangle$ of the modulation are of the form $\alpha_{k,\ell} = \frac{\alpha\sqrt{2}}{\sqrt{m-1}}\left(k - \frac{m-1}{2}\right) + i\,\frac{\alpha\sqrt{2}}{\sqrt{m-1}}\left(\ell - \frac{m-1}{2}\right)$ for $k, \ell \in \{0, \ldots, m-1\}$, chosen with probability $p_{k,\ell} = \frac{1}{2^{2(m-1)}}\binom{m-1}{k}\binom{m-1}{\ell}$. Another simple distribution is the discrete Gaussian distribution, where the coherent states are centered at $m^2$ possible equidistant points of the form $\alpha = x + ip$, with a respective probability given by $p_{x,p} \propto \exp(-\nu(x^2 + p^2))$. This distribution is characterized by $\nu > 0$ and by the spacing between the possible values of $x$ (or $p$). This spacing is, however, constrained once we fix the overall variance to $\alpha^2/2$ per coordinate. 
We are then left with a single parameter $\nu$ that can be optimized to maximize the secret key rate. As we will discuss in more detail in Section 11, the two modulation schemes yield very close performance for QAM of size 64 or above, once the parameters of the discrete Gaussian distribution have been optimized. For simplicity, it is therefore more convenient to use the binomial distribution, which comes without an extra optimization step. However, for smaller constellations, like 16-QAM, it seems that the discrete Gaussian distribution gives better results, and it would be interesting to find out whether other distributions are even better.
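The binomial (random walk) constellation described above is easy to generate. The following minimal sketch builds the $m^2$ points and their probabilities from the textual description (equally spaced points between $-\sqrt{m-1}$ and $\sqrt{m-1}$ per quadrature, binomial weights, rescaled to a variance $\alpha^2/2$ per coordinate); the function name `binomial_qam` and the numerical values are hypothetical.

```python
import math
import numpy as np

def binomial_qam(m, alpha):
    """Points and probabilities of an m x m binomial QAM with variance alpha^2/2 per coordinate."""
    k = np.arange(m)
    # Binomial weights C(m-1, k) / 2^(m-1) for each quadrature.
    weights = np.array([math.comb(m - 1, j) for j in k]) / 2.0 ** (m - 1)
    # Unit-variance grid between -sqrt(m-1) and sqrt(m-1), rescaled by alpha / sqrt(2).
    coords = (alpha / math.sqrt(2)) * (2 * k - (m - 1)) / math.sqrt(m - 1)
    points = (coords[:, None] + 1j * coords[None, :]).ravel()
    probs = (weights[:, None] * weights[None, :]).ravel()
    return points, probs

points, probs = binomial_qam(8, 1.2)  # a 64-QAM with an illustrative amplitude
mean_photon_number = float(np.sum(probs * np.abs(points) ** 2))
variance_x = float(np.sum(probs * points.real ** 2))
print(len(points), mean_photon_number, variance_x)
```

With this normalization, the average photon number $\sum_k p_k|\alpha_k|^2$ of the constellation equals $\alpha^2$, independently of $m$.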

Modulation of arbitrary states
Our approach extends to the case where Alice sends arbitrary states $\tau_k$, with probability $p_k$, for instance squeezed states [3] or thermal states [8,42]. Besides possible applications such as CV QKD in the microwave regime [44], it is important to be able to analyse the security of the protocol when the state preparation is imperfect, since Alice can never prepare the intended states with infinite precision. As an example, a modulation of thermal states consists in sending, with probability $p_k$, a displaced thermal state $\tau_k$ with $n_{\mathrm{th}}$ photons centered around $\alpha_k$. The state $\tau_k$ is given by $\tau_k = D_{\alpha_k}\,\rho_{\mathrm{th}}\,D_{\alpha_k}^\dagger$, where $\rho_{\mathrm{th}}$ is a thermal state centered at the origin of phase space and $D_{\alpha_k} := \exp(\alpha_k \hat b^\dagger - \bar\alpha_k \hat b)$ is the operator describing a displacement by $\alpha_k$.
In this section, we will therefore consider the most general setting where Alice picks some index k with probability p k and sends some state τ k , which is arbitrary. The security analysis relies on the same idea as before, that is computing the covariance matrix of the state ρ AB shared by Alice and Bob in the entanglement-based (EB) version of the protocol, and the covariance term can again be bounded with an SDP similar to Eqn. (10).
The modulation is still characterized by its average state and we will keep the same purification as before to analyze the EB version of the protocol: We need to replace the rank-one projector |ψ k ψ k | defined in Eqn. (16) by a positive semidefinite operator These operators yield a resolution of the identity on the support ofτ , the complex conjugate of τ (also equal to the transpose τ T with respect to the Fock basis): where Π is the projector onto the support ofτ . Sinceτ = tr A (|Φ Φ|) corresponds to the reduced state on the system A, we can interpret the family {P k } as the POVM elements of a general measurement performed by Alice on A: whenever she obtains the measurement outcome k, the state of system A collapses to τ k .
Recall that the first-moment values that can be measured in the PM protocol are with α τ = (tr(τ k a τ )) k . These can be expressed as the expectation values of ρ for the observables C 1 and C 2 defined by with z k := tr(τ k aτ ).
We also introduce the operators G 1 , G 2 acting on the system A: and observe that We can now give the relevant SDP when we consider a modulation of arbitrary states: s.t.
Our goal is again to exhibit operators K ± and exploit the operator inequalities K ± K † ± 0 to bound the value of the SDP. We need some additional notations: where {|k } is an orthonormal basis of a reference system R, storing Alice's measurement result. The operators A and B should not be confused with the registers A and B. We recall that the operator D β describes a displacement by β.
We then proceed exactly as in Section 5 and define Considering K ± K † ± 0 results in the sum-of-squares inequality: We take the expectation with respect to the state ρ and consider each term individually.
1. For the first term, we have Their expectation with respect to ρ gives Putting everything together, we get tr(ρ · (1)) = z 2 w, where we define 2. For the second term, we have and the expectation with respect to ρ gives In particular, we can recognize the objective function of the SDP: 3. For the third term of Eqn. (38), we note that The expectation with respect to ρ gives 4. Finally, for the fourth term, we have The expectation with respect to ρ gives tr(ρ · (4)) = 1 By considering the four terms of Eqn. (38), we find that where we used the substitution p k = tr(τ P k P ) in the second equality. Overall, this implies the two inequalities where we optimized the variable z exactly as in Section 5. We note a potential problem in the case where w vanishes: it would then appear that by fixing t arbitrarily, we could obtain any bound about on tr(ρ C). This is not possible, however, since w only vanishes for a Gaussian modulation of coherent states and in this case the second term of the right-hand side also vanishes. More generally, this term vanishes whenever the measurement performed by Alice is projective, in the sense that P k P = δ k, P k , corresponding for instance to an arbitrary modulation of coherent states (or pure squeezed states). Here, we simply choose the value of t that minimizes the term under the square-root (but note that this may be suboptimal in general), namely . This establishes our final bounds: .
It would be interesting to understand whether this lower bound is tight for a Gaussian modulation of thermal states.

Finite-size effects
In this section, we quickly discuss two of the main finite-size effects that will need to be included in a future full composable security proof against general attacks. Another important effect concerns the optimality of collective attacks among general attacks. At the moment, this point still needs to be clarified, and we leave it for future work. Note, however, that the correction term due to this last effect is typically dependent on the proof techniques and we have observed in the past that better techniques can significantly reduce this term. For instance for DV QKD, the first techniques were based on the exponential de Finetti theorem [36], then on a de Finetti reduction [4], then on an entropic uncertainty principle [40] and finally on the entropy accumulation theorem [7]. It is therefore tempting to believe that a similar phenomenon will occur with CV QKD, and this has indeed been the case for protocols with a Gaussian modulation of coherent states where both an exponential de Finetti theorem [37] and a Gaussian de Finetti reduction [23] are known. For these reasons, it makes sense to focus on the two finite-size effects that will likely remain the dominating terms in any future full security proof of CV QKD, namely parameter estimation and reconciliation efficiency.

Parameter estimation
One of the novelties of our proof, when compared to the case of a Gaussian modulation, is the need for experimentally estimating three parameters, $c_1$, $c_2$ and $n_B$, in order to get an upper bound on the Holevo information $\chi(Y; E)_\rho$ appearing in the Devetak-Winter bound. Let us denote by $f(c_1, c_2, n_B)$ this upper bound, which is given explicitly in Eqn. (3), where we compute the symplectic eigenvalues for the covariance matrix $\Gamma = \begin{pmatrix} V\,\mathbb{1}_2 & Z^*\sigma_Z \\ Z^*\sigma_Z & W\,\mathbb{1}_2 \end{pmatrix}$, with $V$ given by the modulation scheme, $W$ computed from the value of $n_B$ and $Z^*$ computed from the values of $c_1$, $c_2$, $n_B$ by the formula given in Eqn. (13). We note that the function $f$ depends implicitly on the modulation scheme, for example via the value of $w$ appearing in the expression of $Z^*$.
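For a covariance matrix of this form, the Holevo bound can be evaluated in closed form. The sketch below is an illustration that assumes the standard expressions for heterodyne detection with reverse reconciliation, namely $\chi = g\!\left(\frac{\nu_1 - 1}{2}\right) + g\!\left(\frac{\nu_2 - 1}{2}\right) - g\!\left(\frac{\nu_3 - 1}{2}\right)$ with $g(x) = (x+1)\log_2(x+1) - x\log_2 x$, where $\nu_{1,2}$ are the symplectic eigenvalues of $\Gamma$ and $\nu_3 = V - Z^2/(W+1)$. Since Eqn. (3) is not reproduced in this section, these expressions and the function names should be read as assumptions.

```python
import math

def g(x):
    """Entropy function g(x) = (x+1) log2(x+1) - x log2(x), with g(0) = 0."""
    if x < 1e-15:  # also absorbs tiny negative values caused by rounding
        return 0.0
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)

def holevo_heterodyne(V, W, Z):
    """Holevo bound for Gamma = [[V 1, Z sigma_Z], [Z sigma_Z, W 1]] (shot-noise units)."""
    delta = V * V + W * W - 2 * Z * Z
    det = (V * W - Z * Z) ** 2
    disc = math.sqrt(max(delta * delta - 4 * det, 0.0))
    nu1 = math.sqrt((delta + disc) / 2)
    nu2 = math.sqrt(max((delta - disc) / 2, 0.0))
    nu3 = V - Z * Z / (W + 1)  # Alice's mode conditioned on Bob's heterodyne outcome
    return g((nu1 - 1) / 2) + g((nu2 - 1) / 2) - g((nu3 - 1) / 2)

# Pure-loss channel with T = 0.5 and a Gaussian modulation of variance V_A = 4:
# V = 5, W = 1 + T * V_A = 3, Z = sqrt(T * (V^2 - 1)) = sqrt(12).
chi = holevo_heterodyne(5.0, 3.0, math.sqrt(12.0))
print(chi)
```

A useful sanity check is the two-mode squeezed vacuum ($W = V$, $Z = \sqrt{V^2 - 1}$), for which all three symplectic eigenvalues equal 1 and the bound vanishes, as expected for a channel that leaks nothing to the eavesdropper.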
Since $n_B$ is the average photon number in Bob's system, it corresponds to the variance (up to a shift and a factor 2) of his quadrature measurements, when the distribution is centered: with our convention $[\hat x, \hat p] = 2i$, we have $n_B = \frac{1}{4}\langle \hat x_B^2 + \hat p_B^2\rangle - \frac{1}{2}$. One can then compute an observed value $n_B^{\mathrm{obs}}$ corresponding to the empirical average of $n_B$ evaluated on the samples that are used for parameter estimation. In order to estimate $c_1$ and $c_2$, one can for instance form a vector of average observed values $\beta^{\mathrm{obs}} = (\beta_k^{\mathrm{obs}})_k$, where $\beta_k^{\mathrm{obs}}$ is the average observed outcome for the observable $\hat b = \frac{1}{2}(\hat x_B + i\hat p_B)$ when Alice has sent the state $|\alpha_k\rangle$, and then compute $c_1^{\mathrm{obs}} := \mathrm{Re}(\alpha_\tau|\beta^{\mathrm{obs}})$ and $c_2^{\mathrm{obs}} := \mathrm{Re}(\alpha|\beta^{\mathrm{obs}})$, where the $k$-th entries of the vectors $\alpha_\tau$ and $\alpha$ are given respectively by $\langle\alpha_k|a_\tau|\alpha_k\rangle$ and $\alpha_k$.
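The estimators $c_1^{\mathrm{obs}}$ and $c_2^{\mathrm{obs}}$ are plain weighted inner products, $(x|y) = \sum_k p_k \bar{x}_k y_k$, and can be computed as sketched below. The data are synthetic: `beta_obs` is taken to be the noiseless value $\sqrt{T}\alpha_k$, and the vector `alpha_tau` is filled with a hypothetical rescaling of $\alpha$ rather than with the true values $\langle\alpha_k|a_\tau|\alpha_k\rangle$, which depend on the constellation.

```python
import numpy as np

def weighted_inner(x, y, probs):
    """Weighted inner product (x|y) = sum_k p_k conj(x_k) y_k."""
    return np.sum(np.asarray(probs) * np.conj(x) * np.asarray(y))

def first_moment_estimates(alpha, alpha_tau, beta_obs, probs):
    """Return (c1_obs, c2_obs) = (Re(alpha_tau|beta_obs), Re(alpha|beta_obs))."""
    c1 = weighted_inner(alpha_tau, beta_obs, probs).real
    c2 = weighted_inner(alpha, beta_obs, probs).real
    return c1, c2

# Synthetic QPSK example (all numbers are placeholders for illustration).
amp, T = 0.6, 0.25
alpha = amp * np.exp(1j * (np.pi / 4 + np.arange(4) * np.pi / 2))
probs = np.full(4, 0.25)
alpha_tau = 1.5 * alpha          # hypothetical stand-in for <alpha_k|a_tau|alpha_k>
beta_obs = np.sqrt(T) * alpha    # noiseless channel of transmittance T
c1_obs, c2_obs = first_moment_estimates(alpha, alpha_tau, beta_obs, probs)
print(c1_obs, c2_obs)
```

In this noiseless example, `c2_obs` reduces to $\sqrt{T}\langle n\rangle$, the expected value of $c_2$ for a Gaussian channel.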
In the asymptotic setting, one can assume that the values of c 1 , c 2 and n B are known exactly, and therefore coincide with their observed values. This is not the case in the finitesize setting, and one would in general compute a confidence region for the triple (c 1 , c 2 , n B ) compatible with the observed values (c obs 1 , c obs 2 , n obs B ). One can check numerically that the function f (c 1 , c 2 , n B ) is increasing with n B and decreasing with either c 1 or c 2 , when the other 2 variables are fixed. This implies that there is no need for computing the whole confidence region, but it is in fact sufficient to compute "worst-case estimates" for c 1 , c 2 and n B , in the sense that In these expressions, the variables c 1 , c 2 and n B refer to their respective values for the modes that have not been used for parameter estimation, and that will be exploited for key extraction. The numbers c min 1 , c min 2 , n max B are computed with Eqn. (43) below from observations made during the parameter estimation procedure and correspond to the worstcase estimators. The small parameter ε PE is an upper bound on the probability that the parameter estimation performed by Alice and Bob returns c min 1 for instance and that the value of c 1 is less than c min 1 for the remaining unobserved modes. Once these numbers are known, one can simply use the following upper bound on χ(Y ; E) in the Devetak-Winter bound: χ(Y ; E) ≤ f (c min 1 , c min 2 , n max B ), which holds, except with a small probability ε PE .
It is well known that such a parameter estimation is more subtle in the case of CV QKD because the random variables we aim at estimating are not trivially bounded by construction (contrary to the quantum bit error rate of BB84 for instance, which lies by definition between 0 and 1). This difficulty can be addressed with the tools developed in Ref. [22], but this is beyond the scope of the present manuscript. Here, we simply wish to give the expected asymptotic scaling of $c_1^{\min}$, $c_2^{\min}$ and $n_B^{\max}$, as a function of $n$, the number of quantum states exchanged on the quantum channel: $c_i^{\min} = c_i^{\mathrm{obs}} - O(1/\sqrt{n})$ and $n_B^{\max} = n_B^{\mathrm{obs}} + O(1/\sqrt{n})$, for $i \in \{1, 2\}$. The precise values of the hidden positive constants in the $O(\cdot)$ notation are not known at the moment, and determining them will require a thorough analysis.

Reconciliation efficiency
The information reconciliation step of the protocol is also more involved for CV QKD than for DV QKD. Without this step, or assuming it is achieved perfectly, the asymptotic secret key rate would read $K = I(X; Y) - \chi(Y; E)$, where $X$ and $Y$ denote the variables corresponding to Alice and Bob, and the raw key is given by Bob's variable (which is always the more favorable choice for CV QKD). Since the present paper focuses on the asymptotic regime, one could in principle ignore the reconciliation procedure, but this would lead to incorrect predictions in the case of CV QKD because an imperfect reconciliation significantly affects the performance: for instance, with perfect reconciliation and a Gaussian modulation, the secret key rate is strictly increasing with the variance of the modulation, while this is no longer the case as soon as the reconciliation is slightly imperfect. In a typical DV protocol, Alice and Bob hold correlated bit-strings $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ corresponding respectively to the input and output of $n$ uses of a binary symmetric channel, with crossover probability $p$. Bob then sends some side information to Alice via the authenticated classical channel to help her recover the value of $y$. In the asymptotic limit where $n$ tends to infinity, the channel coding theorem ensures that Alice and Bob can succeed at this task with high probability provided that Bob sends $H(Y|X) = H(X|Y) = nh(p)$ bits of side information, with the binary entropy defined as $h(p) := -p \log_2(p) - (1 - p) \log_2(1 - p)$. In practice, one cannot achieve this perfectly, and Bob will need to send slightly more information, namely $(1 + f(p))\,nh(p)$ bits, where $f(p)$ is typically a few percent.
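The amount of side information in the DV example is straightforward to evaluate. The snippet below is a small illustration of the formulas $h(p)$ and $(1 + f)\,nh(p)$; the function names and the numerical values (block length, error rate, overhead) are chosen for the example and are not taken from the paper.

```python
import math

def binary_entropy(p):
    """h(p) = -p log2(p) - (1-p) log2(1-p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def side_information_bits(n, p, f=0.0):
    """Number of bits (1 + f) * n * h(p) revealed during reconciliation."""
    return (1.0 + f) * n * binary_entropy(p)

n, p, f = 10**6, 0.02, 0.05  # illustrative block length, error rate and overhead
ideal = side_information_bits(n, p)
practical = side_information_bits(n, p, f)
print(ideal, practical)
```

The overhead `f` plays the same role for a DV protocol as the gap $1 - \beta$ does for the reconciliation efficiency $\beta$ used in CV QKD.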
For a CV QKD protocol, the relevant channel in practice 17 is the additive Gaussian white-noise (AWGN) channel: the strings held by Alice and Bob are (x 1 , . . . , x n ) ∈ C n and (y 1 , . . . , y n ) ∈ C n where x i is chosen according to the modulation scheme: it is equal to α k with probability p k . For each i, we expect where Re(z i ), Im(z i ) ∼ N (0, 1 + T ξ) is a Gaussian noise. The extra factor 1/2 in the square-root comes from the heterodyne detection which requires first splitting the incoming signal on a balanced beamsplitter before measuring each output mode with a homodyne detection. In the case of a Gaussian modulation, with Re(x i ), Im(x i ) ∼ N (0, V A ) two Gaussian random variables of variance V A , the mutual information between the random variables X and Y takes a simple expression Note that this is twice the standard formula 1 2 log 2 (1 + SNR) because we consider both the real and imaginary parts.
For the modulation schemes we consider in this paper, there is no closed-form expression for the mutual information I(X; Y ), although it is typically very close to the Gaussian version, provided the variance V A is small enough [46]. Note in particular, that for a 2 k -QAM, it is necessarily upper bounded by k, which is itself an upper bound on the entropy H(X), while log 2 (1 + SNR) grows to infinity with the signal-to-noise ratio. Assuming therefore that the gap between the two quantities is indeed negligible here, we still need to quantify how far we are from the key rate of Eqn. (44). There are two natural ways to write a version of the key rate taking into account the imperfect reconciliation efficiency: where β < 1 is the so-called reconciliation efficiency generally used in CV QKD and f > 0 is more relevant to DV QKD. In the second expression, we write Y to denote a discretized version of Y , since otherwise the conditional entropy is ill-defined. Provided that the reconciliation protocol fully exploits soft-information, meaning that the discretization is sufficiently precise, then high values of β between 95 and 98% are achievable [19,29,31] for a Gaussian modulation. Similarly, for a QPSK modulation, it is possible to easily reach 90% at arbitrarily low SNR. It is not clear, however, how to achieve similar numbers with a coarse graining corresponding to Bob simply keeping the sign of his variable in the QPSK case, as done in Ref. [27].
The reconciliation problem has not yet been studied in detail in the case of larger QAMs. Nevertheless, one can realistically assume that values around 95% can be achieved, given the closeness between this problem and the Gaussian case. For this reason, we will assume β = 0.95 in the numerical simulations of Section 11.

Numerical results
In this section, we perform some numerical simulations in the case of a typical Gaussian channel with transmittance T and excess noise ξ. The covariance matrix Γ takes the form and τ and W depend on the specific modulation scheme that is considered. We first compare in Figure 3 the secret key rates obtained for various sizes of the M -PSK modulation. The left panel shows that when the modulation variance (or equivalently, α) is optimized, then going beyond M = 5 is essentially useless. On the right panel, we see that the only advantage of increasing M is to allow for larger possible values of α. However, it is much better to consider QAM instead of increasing the number of states in the PSK modulation.
In Figure 4, we compare the binomial and the discrete Gaussian distributions discussed in Section 8 in the case of the 16-QAM and the 64-QAM. Note that the two distributions coincide by construction for the 4-QAM (or QPSK modulation). In both cases, the discrete Gaussian distribution outperforms the binomial distribution, but the difference is only significant for the 16-QAM. For a 64-QAM, both distributions yield essentially the same performance, which is close to that of a Gaussian modulation with the same variance. For the 16-QAM, however, the discrete Gaussian outperforms the binomial distribution when the value of the parameter $\nu$ in Eqn. (27) is optimized. This also suggests that there is still room for further improvement in the case of the 16-QAM (or maybe of the 32-QAM, which we have not discussed here mostly because it would break the independence of the real and imaginary parts of Alice's variables, and therefore potentially complicate the reconciliation procedure), and that additional work might lead to the discovery of better modulation schemes. Let us still insist on the fact that we assume here that $\beta$ is equal to 0.95, independently of the modulation scheme, but that reality is probably more complex. In other words, it is important to also consider the reconciliation procedure when optimizing the modulation scheme. Figure 5 shows the performance of the various QAM sizes as a function of the modulation variance $V_A$. Here we only plot the results for the binomial distribution, since this avoids an extra optimization over $\nu$. The main observation is that increasing the size of the constellation brings the performance close to that of the Gaussian modulation for larger and larger values of $V_A$. This allows one to work at higher SNR, and thus simplifies the experimental implementation and possibly the reconciliation as well. 
At the same time, for a fixed reconciliation efficiency and a given distance (50 km here), we see that the optimal modulation variance is V A ≈ 5 and that the 64-QAM is already essentially indistinguishable from the Gaussian modulation.
Finally, we want to understand the performance of the various modulation schemes in terms of tolerable excess noise: if the transmittance of the channel is fixed to $T = 10^{-0.02d}$, what is the maximum value of the excess noise $\xi$ such that the secret key rate is positive? Figure 6 shows the tolerable excess noise as a function of losses in the channel, when the modulation variance $V_A$ is optimized for each point. Again, we see that a 64-QAM already provides a performance close to the Gaussian modulation, and the 256-QAM is almost indistinguishable from the Gaussian modulation. The figures also confirm that our bound is quite bad for the QPSK modulation since the tolerable excess noise is at least an order of magnitude below what is achieved for larger QAM.
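The notion of tolerable excess noise can be illustrated on the Gaussian-modulation benchmark, for which all quantities have standard closed forms. The sketch below is not the QAM bound of this paper: it assumes a Gaussian modulation with heterodyne detection and reverse reconciliation, with $V = 1 + V_A$, $W = 1 + TV_A + T\xi$, $Z = \sqrt{T(V^2 - 1)}$, $I(X;Y) = \log_2\!\left(1 + \frac{TV_A}{2 + T\xi}\right)$ and key rate $\beta I(X;Y) - \chi(Y;E)$. It also assumes that the key rate decreases with $\xi$ and is negative at the upper end of the search interval. All function names are hypothetical.

```python
import math

def g(x):
    """g(x) = (x+1) log2(x+1) - x log2(x), with g(0) = 0."""
    return 0.0 if x < 1e-15 else (x + 1) * math.log2(x + 1) - x * math.log2(x)

def gaussian_key_rate(T, xi, V_A, beta=0.95):
    """Assumed asymptotic key rate beta * I(X;Y) - chi(Y;E) for a Gaussian modulation."""
    V = 1.0 + V_A
    W = 1.0 + T * V_A + T * xi
    Z2 = T * (V * V - 1.0)
    delta = V * V + W * W - 2 * Z2
    det = (V * W - Z2) ** 2
    disc = math.sqrt(max(delta * delta - 4 * det, 0.0))
    nu1 = math.sqrt((delta + disc) / 2)
    nu2 = math.sqrt(max((delta - disc) / 2, 0.0))
    nu3 = V - Z2 / (W + 1.0)
    chi = g((nu1 - 1) / 2) + g((nu2 - 1) / 2) - g((nu3 - 1) / 2)
    mutual_info = math.log2(1.0 + T * V_A / (2.0 + T * xi))
    return beta * mutual_info - chi

def transmittance(d):
    """T = 10^(-0.02 d), i.e. 0.2 dB of loss per unit of d."""
    return 10 ** (-0.02 * d)

def tolerable_excess_noise(d, V_A, beta=0.95, hi=1.0, iters=60):
    """Largest xi in [0, hi] with a positive key rate, found by bisection."""
    T = transmittance(d)
    if gaussian_key_rate(T, 0.0, V_A, beta) <= 0:
        return 0.0
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if gaussian_key_rate(T, mid, V_A, beta) > 0:
            lo = mid
        else:
            hi = mid
    return lo

print(transmittance(50), tolerable_excess_noise(50, 5.0))
```

To reproduce a curve like the one described for the QAM constellations, one would replace the value of `Z2` by the square of the bound $Z^*$ for the chosen constellation and additionally optimize over $V_A$ at each distance.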