Security of quantum key distribution with intensity correlations

The decoy-state method in quantum key distribution (QKD) is a popular technique to approximately achieve the performance of ideal single-photon sources by means of simpler and practical laser sources. In high-speed decoy-state QKD systems, however, intensity correlations between succeeding pulses leak information about the users' intensity settings, thus invalidating a key assumption of this approach. Here, we solve this pressing problem by developing a general technique to incorporate arbitrary intensity correlations to the security analysis of decoy-state QKD. This technique only requires to experimentally quantify two main parameters: the correlation range and the maximum relative deviation between the selected and the actually emitted intensities. As a side contribution, we provide a non-standard derivation of the asymptotic secret key rate formula from the non-asymptotic one, in so revealing a necessary condition for the significance of the former.


Introduction
Quantum key distribution [1,2,3] (QKD) is a technique that enables secure and remote delivery of cryptographic keys based on the laws of quantum mechanics. The interest of QKD is that, when combined with the one-time-pad encryption scheme [4], it allows for information-theoretically secure communication, unconcerned about the capabilities of future adversaries and the progress of classical or even quantum computers. For this reason, since its conception in 1984 [5], QKD has experienced a tremendous development both in theory and in practice, in so becoming a commercial technology that represents the most mature application of quantum information science. Nevertheless, various challenges must still be addressed in order to achieve the widespread adoption of QKD.
In real-life implementations, the information carrier of QKD is the quantum of light or photon, and due to the low transmissivity of single photons in typical optical channels -which, for instance, in the case of optical fibers decreases exponentially with the fiber length [6,7,8,9]-one major challenge consists of achieving high secret key generation rates at long distances. For this purpose, one natural approach is to increase the repetition rate of the laser source in the transmitter station. However, for repetition rates of the order of GHz, it has been shown that intensity correlations between succeeding pulses appear [10,11], potentially opening a security loophole.
To be precise, in the absence of ideal single-photon sources, most QKD protocols use simpler laser sources that operate emitting phase-randomised weak coherent pulses (PRWCPs). This is so because PRWCPs allow the QKD users to implement the so-called decoy-state method [12,13,14,15], a technique to tightly lower-bound the extractable secret key length of a QKD session. Importantly, standard decoy-state analyses rely on a fundamental assumption: for any given signal, its detection probability (or so-called "yield") conditioned on the emission of a certain number of photons does not depend on the intensity of the pulse, i.e., on its mean photon number. Nevertheless, this assumption fails in the presence of a side channel leaking information about the intensity setting [16], and for GHz (or higher frequency) repetition rates one such side channel is represented by intensity correlations. Intuitively, the eavesdropper (Eve) could exploit the correlations to gain information about previous intensity settings, which would allow her to make the n-photon yields dependent on them.
This being the case, and aiming to develop ultrafast decoy-state-based QKD systems, the question arises of how to account for arbitrary intensity correlations in the security analysis. In this regard, the existing security proofs are notably restricted. For instance, preliminary results presented in [17,18] deal with setting-choice-independent correlations, which neglect the possibility of information leakage and hence do not cover the major threat. On the other hand, the authors of [10] go beyond setting-choice-independent correlations by providing a post-processing hardware countermeasure whose application is limited to particular instances of nearest neighbors intensity correlations. Similarly, it is worth mentioning the progress reported in [19] to develop an intensity modulator (IM) that mitigates the effect of intensity correlations. Lastly, the recent work in [20] presents a general technique to accomodate various other device imperfections, but it does not incorporate the use of the decoy-state method.
In this work, we provide the missing security analysis for decoy-state QKD with arbitrary intensity correlations. Despite asymptotic, our approach is experimental-friendly in the following sense. In the first place, it only requires to upper bound two parameters presumably easy to quantify in an experiment (see for instance [11]): the correlation range and the maximum relative deviation between the selected intensity settings and the actually emitted intensities (which do not match in general due to the correlations). In the second place, it allows for improved secret key rates in case a specific correlations model is known to describe the IM.
As a side contribution, we use elementary results in statistical convergence to rigorously justify the asymptotic secret key rate formula from the non-asymptotic one. Crucially, the justification relies on a very natural necessary condition on the observables. Whenever not guaranteed by some special symmetry of the protocol (such as the delayed setting choice that enables a counterfactual scenario in the absence of information leakage), this condition must be taken as an assumption, in which case it restricts the capabilities of the adversary. In particular, the formula tolerates a restricted type of coherent attacks that we characterize in detail.
The structure of the paper goes as follows. There are six Results subsections. In the first two, Sec. 2.1 and Sec. 2.2, we present the general physical assumptions we impose on the intensity correlations and provide a method to quantify their effect in the parameter estimation procedure. This procedure is explained in full detail in Sec. 2.3 and Sec. 2.4. In Sec. 2.5 we establish the asymptotic key rate formula and discuss the necessary condition on which it relies. The last Results subsection, Sec. 2.6, is devoted to evaluate the rate-distance performance of our method for different values of the two parameters that characterize the correlations. In the Discussion section, Sec. 3, we summarize the con-tributions and limitations of our work, commenting on possible future directions. Lastly, Sec. 4.1, Sec. 4.2 and Sec. 4.3 are the Methods subsections, which include all the necessary technical derivations that support our results. The appendices attached at the end include a description of the channel model we use for the simulations, together with some complementary results.

Results
For illustration purposes, we consider a standard polarization encoding decoy-state BB84 protocol [15], although our results can be readily extended to any QKD protocol that relies on decoy-states. In each round k, with k = 1, . . . , N , the sender (Alice) selects a basis x k ∈ M = {X, Z} with probability q x k , a uniform raw key bit r k ∈ Z 2 = {0, 1}, and an intensity setting a k ∈ A = {µ, ν, ω} with probability p a k and µ > ν > ω ≥ 0. Note that the values of the probabilities q x k and p a k respectively depend on the basis and intensity settings only, but not on the particular round k. Then, she encodes the BB84 state defined by x k and r k in a PRWCP with intensity setting a k , and sends it to the receiver (Bob) through the quantum channel. Importantly, as explained below, the actual mean photon number of the pulse might not match the setting a k due to the intensity correlations. Furthermore, regarding the transmitter, we assume perfect phase randomization, no state preparation flaws and no side-channels for simplicity. Also, possible intensity correlations in the qubit encoding (which may arise, e.g., when using time-bin encoding) are neglected in this work.
Similarly, Bob selects a basis y k ∈ M with probability q y k (whose value, again, does not depend on the round k) and performs a positive operator-valued measure (POVM) on the incident pulse, given by {M y k ,s k B k } s k ∈{0,1,f } . Here, B k denotes Bob's k-th incoming pulse, s k stands for Bob's classical outcome and f stands for "no click". As usual, the basisindependent detection efficiency condition is assumed, such thatM Z,f B k =M X,f B k . Thus, we shall simply denote these two operators byM f B k . Note that this assumption could be removed by the use of measurement-device-independent (MDI) QKD [21] or twin-field QKD [22].

Characterizing the intensity correlations
Let us denote the record of intensity settings up to round k by a k = a 1 , a 2 , . . . , a k -where a j ∈ A for all j-and let α k denote the actual intensity delivered in round k. In full generality, α k is a continuous random variable whose probability density function is fixed by the record of settings a k . This function, which we denote as g a k (α k ), is referred to as the correlations model. Below, we list three assumptions about the intensity correlations on which our analysis relies. Assumption 1. As supported by GHz-clock QKD experiments [10,11], we shall consider that the correlations do not compromise the poissonian character of the photon number statistics of the source conditioned on the value of the actual intensity, α k . That is to say, for any given round k, and for all n k ∈ N, Assumption 2. For all possible records a k , we shall assume that Namely, for every round, we impose that . Thus, δ max defines the maximum relative deviation between a k and α k . Note that we are assuming here that the value of δ max does not depend on a k for simplicity, but such dependence could be easily incorporated in the analysis to obtain slightly tighter results. Also, we remark that a bound of the type of Eq. (2) has been quantified in a recent experiment reported in [11].
From Eq. (1) and Eq. (2), it follows that the photon number statistics of round k read for all n k ∈ N.
Assumption 3. We assume that the intensity correlations have a finite range, say ξ, meaning that g a k (α k ) is independent of those previous settings a j with k − j > ξ.
Beyond the assumptions presented here, we shall consider that g a k (α k ) is unknown, such that our results are model-independent.

Quantifying the effect of the intensity correlations
Here, we rely on the three assumptions introduced in Sec. 2.1 to account for the effect of intensity correlations in the decoy-state analysis. A key idea -originally presented in [16] to deal with Trojan horse attacks-is to pose a restriction on the maximum bias that Eve can induce between the n-photon yields associated to different intensity settings. For this purpose, we use a fundamental result presented in [23] and further developed in [20]. Since this result is a direct consequence of the Cauchy-Schwarz (CS) inequality in complex Hilbert spaces, we shall refer to it as the CS constraint. The reader is referred to the Methods Sec. 4.1 for a definition of this result, and below we present the relevant restrictions we derive with it.
Precisely, for any given round k, photon number n ∈ N, intensity setting c ∈ A and bit value r ∈ {0, 1}, we define the yield Y (k) n,c = p (k) (click|n, c, Z, Z) and the error probability H (k) n,c,r = p (k) (err|n, c, X, X, r), where the right-hand sides are shorthand for p(s k = f |n k = n, a k = c, x k = Z, y k = Z) and p(s k = f, s k = r k |n k = n, a k = c, x k = X, y k = X, r k = r), respectively. Then, a major result of this work is to show that, for any two distinct intensity settings a and b, their yields and error probabilities satisfy for all k, n and r, where , ξ stands for the correlation range and The full derivation of this result is given in the Methods Sec. 4.1.
Notably, Eq. (4) must be combined with a decoy-state method in order to estimate the numbers of counts and errors triggered by single photon emissions, which determine the secret key rate. For this purpose, in Sec. 2.3 we provide a decoy-state analysis that relies on assumptions 1 and 2 to deal with intensity correlations, and in Sec. 2.4 we present the resulting linear programs that fulfill the parameter estimation. In this regard, since the constraints of Eq. (4) are non-linear, first-order approximations with respect to some reference parameters are derived from them, which we refer to as the linearized CS constraints. Importantly, replacing the original functions by their linear approximations leads to looser but valid constraints too, thanks to the convexity of these functions.

Decoy-state method
The Z basis gain with intensity setting a is defined as if, in round k, both parties select the Z basis, Alice selects intensity setting a and a click occurs, and zero otherwise. Thus, where the yield Y n,a is defined in Sec. 2.2 and one can generically refer to the Z (k) a as the "expected gains of round k". Going back to Eq. (7), note that and in virtue of Eq. (3), the record-independent bounds p (k) (0|a) ∈ e −a + , e −a − and p (k) (n|a) ∈ e −a − a − n n! , e −a + a + n n! (n ≥ 1) (9) follow from the decreasing (increasing) character of e −x x n for n = 0 (n ≥ 1) in the interval x ∈ (0, 1). Explicitly using these intervals in Eq. (7), one obtains n,a (10) for all a ∈ A and k = 1, . . . , N . Further selecting a threshold photon number for the numerics, n cut , from Eq. (10) we have for all a ∈ A and k = 1, . . . , N , where in the second inequality we have used the fact that n,a e −a + a + n /n! ≤ 1 − ncut n=0 e −a + a + n /n!. Importantly, replacing Z by X everywhere, one obtains the corresponding analysis for the X basis gains and yields of round k.
On the other hand, similar constraints can be imposed on the error counts. To be precise, the number of X basis error counts with setting a is defined as where we defined H n,a,1 )/2. Now, making use of Eq. (9) as before and selecting a threshold photon number n cut , it follows that for all a ∈ A and k = 1, . . . , N . At this point, summing over k and dividing by N both in Eq. (11) and Eq. (13), one trivially obtains bounds for the average parameters y n,a, for a common threshold photon number n cut and for all a ∈ A.

Linear programs for parameter estimation
Even though Eq. (4) provides the relevant restrictions on the maximum bias that Eve can induce between different yields/error probabilities, it consists of a set of non-linear constraints unsuitable for parameter estimation via linear programming. As mentioned in Sec. 2.2 though, in virtue of the convexity/concavity of the functions that define the constraints, their first-order expansions around any given reference yield/error provide valid linear bounds as well. For instance, if we focus on the yields, we have that n,a ∈ (0, 1), irrespectively of which reference yields Y (k) n,a ∈ (0, 1) we select for the linear expansion. Also, note that the derivative functions G ± are well-defined for all Y (k) n,a ∈ (0, 1) because the G ± are smooth piecewise functions. In particular, n,a ∈ (0, 1), the linearized bounds are Identically, for any given reference n-photon error click probabilitiesH n,a,r ∈ (0, 1), the linearized versions of the corresponding constraints read If, in addition, we select reference parameters independent of r, sayH n,a for both r = 0 and r = 1, the linearized lower (upper) bound has the exact same slope and the exact same intercept for both r = 0 and r = 1. As a consequence, the relevant error probabilities entering the decoy-state analysis, H as we wanted to show. As a final comment, note that, for all practical purposes, one can restrict the reference parameters to be round-independent:Ỹ (k) n,a =ỹ n,a andH (k) n,a =h n,a for all k = 1, . . . , N . This being the case, summing over k and dividing by N in Eq. (16) and Eq. (18), one obtains respective inequalities for the average parameters y n,b, where, for conciseness, we define the intercepts and slopes Of course, the tightness of these linear bounds is subject to the adequacy of the selected reference yields, and thus it relies on a characterization of the quantum channel. Note, however, that aiming to further improve the results, one could incorporate more linearized CS constraints to the problem by using various reference yields for each pair (n, a), instead of just one. Also, we recall that simpler bounds not relying on any reference values can be derived by using the so-called trace distance argument [27], and this is what we do in Appendix B. Another alternative would be, of course, to solve a non-linear optimization problem given by the original CS constraints.
To finish with, we present the linear programs that allow to estimate the relevant singlephoton parameters, putting together the decoy-state constraints introduced in Sec. 2.3 and the above linearized CS constraints. In the first place, we address the average number of signal-setting single-photon counts, defined as Z 1,µ and a lower bound y L 1,µ,N on y 1,µ,N is reached by the following linear program: We recall that the c ± ab,n and the m ± ab,n are defined in Eq. (19). Needless to say, replacing Z by X everywhere one obtains the corresponding program for the average number of signal-setting single-photon counts in the X basis, X 1,µ,N , such that where the apostrophe here denotes that we refer to the X basis.
On the other hand, the average number of signal-setting single-photon error counts in the X basis is 1,µ , averaging over k it follows that and an upper bound h U 1,µ,N on h 1,µ,N is reached by the following linear program: Finally, we remark that, in virtue of the properties of linear optimization [29], y L 1,µ,N , y L 1,µ,N and h U 1,µ,N are linear in Z a,N , X a,N and E a,N , respectively, for all a ∈ A, which means that they provide the expectation of certain random variables respectively linear in Z a,N , X a,N and E a,N . In turn, this implies that the bounds reached by the linear programs can be written as for all N . This feature is crucial to justify the asymptotic approximation of the secret key rate presented next.

Asymptotic approximation of the secret key rate
The linear programs of Sec. 2.4 provide suitable bounds on the expectations of the relevant experimental averages, namely, the average number of signal-setting single-photon counts in the Z (X) basis after N transmission rounds, Z 1,µ,N (X 1,µ,N ), and the average number of signal-setting single-photon error counts in the X basis after N transmission rounds, E 1,µ,N . The bounds are of the form for all N , where Z However, the finite secret key rate relies on statistical bounds of the experimental averages themselves, say P Z 1,µ,N < Z L, 1 1,µ,N ≤ 1 , P X 1,µ,N < X L, 2 1,µ,N ≤ 2 and P E 1,µ,N > E U, 3 1,µ,N ≤ 3 (27) for given failure probabilities 1 , 2 and 3 (see the Methods Sec. 4.2 for a summarized derivation of the finite secret key rate). Crucially, in the absence of intensity correlations or side-channels possibly leaking the intensity setting information, commutativity allows to consider the so-called counterfactual setting, in which case the latter bounds -Eq. (27)are obtained from the former -Eq. (26)-via concentration inequalities for independent random variables, such as Chernoff's [24] or Hoeffding's [25]. Precisely, the independence of the relevant indicator variables attached to the detection events is enforced because, in the counterfactual setting, the intensities are randomly selected a posteriori (and thus decoupled) of the detection events. On the contrary, intensity correlations invalidate the counterfactual setting argument, in so invalidating the usage of the above concentration inequalities too. Nevertheless, propositions 1 and 2 in the Methods Sec. 4.3 establish that, as long as the variance of the experimental averages tends to zero as N tends to infinity, any violation of the equations -no matter how small-has an asymptotically null probability of occurring. This feature legitimizes the use of the bounds of Eq. (28) to asymptotically approximate the secret key rate, by plugging them into the finite secret key rate formula -Eq. (61) in the Methods Sec. 4.2-. This yields for large enough N , with secrecy parameter sec ≈ 2 S + PA + δ and correctness parameter cor (see the Methods Sec. 4.2 for the definitions of the security parameters S , PA and δ). Also, h(·) denotes the binary entropy function, f EC is the efficiency of the error correction protocol and E tol is a threshold bit error rate. Note that Eq. (29) assumes that Alice and Bob use the Z (X) basis events for key generation (parameter estimation).
Having reached this stage, one can remove both the dependence on N and on the security parameters by neglecting the Serfling deviation term (which scales as N −1/2 ) and the finite key term log 1/ cor 2 PA δ /N , in so obtaining the final asymptotic secret key rate formula Notably, as mentioned, the usefulness of this asymptotic approximation is subject to the non-trivial condition that the variance of the experimental averages vanishes as N tends to infinity. Precisely, for a sequence {X j } with successive averages as long as the latter is finite. Thus, in particular, Eq. (30) is an asymptotic approximation of the secret key rate provided that 1,a } a∈A (i.e., provided that the preconditions of propositions 1 and 2 in the Methods Sec. 4

.3 hold). For instance, if
Cov [X i , X j ] = 0 for all i, j with |i − j| > ζ -where ζ denotes a finite round differencethe condition holds despite the fact that the X j may be dependent and non-identically distributed.

Simulations
In the absence of real data, we fix the experimental inputs Z a,N /q 2 Z p a , X a,N /q 2 X p a and E a,N /q 2 X p a of the linear programs to their expected values according to a typical channel model. Importantly, although the security analysis contemplates intensity correlations, we adopt a standard channel and transmitter model without correlations for ease of comparison with prior work. In particular, let η det (η ch = 10 −αattL/10 ) denote the detection efficiency of Bob's detectors (transmittance of the channel), where α att (dB/km) is the attenuation coefficient of the channel and L (km) is the distance between Alice's and Bob's labs.
Also, let p d (δ A ) stand for the dark count probability of each of Bob's photo-detectors (polarization misalignment occurring in the channel). The model is [15,26] Z a,N for a ∈ A, where η = η det η ch and we define h η,a,δ A = e −ηa cos 2 δ A − e −ηa sin 2 δ A /2. Also, we introduce the variable E a,N (Z) , which is equivalent to E a,N but referred to the Z basis error clicks. Note that Eq. (32) accounts for the fact that multiple clicks are randomly assigned to a specific detection outcome. For simplicity, the tolerated bit error rate of the sifted key is set to E tol = E µ,N (Z) / Z µ,N . In addition, we remark that the channel model can also be used to select reference valuesỹ n,a,N andh n,a,N for the evaluation of the linearized CS constraints of Sec. 2.4. Nevertheless, one could also choose the reference values based on previous executions of the protocol instead. Note that, in a real QKD experiment, these reference values would be required for the parameter estimation to go through, and so the secret key rate would be sensitive to the selectedỹ n,a,N andh n,a,N . The reader is referred to Appendix A for the explicit formulas ofỹ n,a,N andh n,a,N that we use, matching the typical channel model under consideration. Alternatively, in Appendix B we provide a looser analysis based on the trace distance argument [27], which does not rely on the selection of any reference values.
Either way, plugging Eq. (32) into Eq. (21) and Eq. (24), we find that the asymptotic secret key rate formula K ∞ (presented in Eq. (30)) does not depend either on the probability of the decoy settings, p ν and p ω , or on the probability of selecting the X basis, q X , in such a way that setting p µ ≈ 1 and q Z ≈ 1 maximizes K ∞ with the typical channel model under consideration. This feature corroborates the intuition that, as N increases, one can devote larger and larger fractions of rounds to key generation without compromising the tightness of the parameter estimation. Lastly, regarding the experimental parameters, we list them below. We take η det = 0.65, p d = 7.2 × 10 −8 -both values matching the recent experiment reported in [6]-, a typical attenuation coefficient α att = 0.2 dB/km and a standard error correction efficiency of f EC = 1.16. Regarding the misalignment, we take δ A = 0.08 for illustration purposes. Also, we fix the weakest intensity setting to ω = 10 −4 for the numerics, and we numerically optimize µ and ν to maximize K ∞ for each value of the distance L. Lastly, three different correlation ranges are contemplated, ξ = 1, ξ = 2 and ξ = 5, each of which is combined with various values of the maximum relative deviation δ max (see assumptions 2 and 3 in Sec. 2.1 for the definitions of ξ and δ max ).
The rate-distance performance with the above considerations is shown in Fig. 1. As seen in the figure, intensity correlations strongly limit the maximum distance attainable for QKD, and the secret key rate is notably sensitive to the deviation parameter, δ max . On the contrary, as long as moderate values are considered for the correlation range ξ, the effect of this parameter on the secret key rate is softer. Finally, for completeness, in Appendix C we show that an enhancement of the secret key rate is possible by assuming deterministic intensity correlations, in contrast to the model-independent correlations considered so far.

Discussion
Aiming to enhance the performance of QKD systems, it is crucial to develop ultrafast clock rate QKD devices capable of delivering high secret key rates for widespread applications. However, even for GHz clock rates, QKD transmitters exhibit intensity correlations [11,10] that invalidate standard decoy-state analyses for parameter estimation. As opposed to many other source imperfections, only limited solutions were known for this security loophole so far [10,17,18]. In this work, we solve the problem by quantifying the maximum effect of intensity correlations in the security of QKD. For this purpose, we introduce two experimental-friendly security parameters that allow to characterize arbitrary correlations in the IM. Importantly, our technique builds on a result that we refer to as the Cauchy-Schwarz constraint (recently used in [20,28]), which provides tighter bounds on the indistinguishability of non-orthogonal quantum states than the well-known trace distance argument [27]. For illustration purposes, our analysis is dedicated to the standard decoy-state BB84 protocol with one signal and two decoy settings [15], although we remark that our results can be easily generalized to deal with other variants of the protocol, or even with the decoy-state measurement-device-independent (MDI) QKD scheme [21].
As a related contribution, we present a non-standard derivation of the asymptotic limit, in so revealing a necessary condition to justify the ubiquitous asymptotic formula. Crucially, this condition becomes non-trivial in the most general context of coherent attacks and arbitrary pulse correlations. Nevertheless, if, for instance, Eve's attack does not interrelate arbitrarily distant detection events -but the interaction is limited to a finite round difference-the condition holds. In this regard, it is not unreasonable to conjecture that, as long as the correlation range of the light pulses is not larger than ξ, Eve may reach an optimal cheating strategy by attacking blocks of maximum round difference ζ = ξ. Indeed, a hypothesis of this kind might pave the way for a finite key analysis of the problem, which is the natural direction to follow. In any case, whether or not this conjecture is true, the solution here presented clearly provides an insightful step towards foolproof security of high speed QKD systems, with their imperfections.
Theorem [20]. Let |u and |v be pure states of a certain quantum system. Then, for all positive operatorsÔ ≤ I, where the functions G ± are defined in Eq. (5). As pointed out in Sec. (2.2), for all k = 1, . . . , N , all n ∈ N, and any given pair of settings, a ∈ A and b ∈ A with b = a, Eq. (33) allows to constrain the maximum deviation that Eve can induce between the n-photon yields p (k) (click|n, a, Z, Z) = p (k) (click|n, a, Z) and where we have invoked the basis-independent detection efficiency assumption to remove the conditioning on Bob's basis choice. Here, we derive the specific constraint, namely, Eq. (4), which contemplates fully general coherent attacks and finite range intensity correlations.
In an entanglement based view of the protocol, the global input state describing all the protocol rounds reads where we introduce the notation a N 1 = a 1 . . . a N , and equivalently for x N 1 and r N 1 . Also, for all i, |a i , x i A i |a i ∈ A, x i ∈ M and |r i A i |r i ∈ Z 2 are orthonormal bases of Alice's i-th registers, A i and A i . Similarly, we define where C i denotes an inaccessible purifying system with orthonormal basis |t n i C i |n i ∈ N for all i (C i stores the photon number information of the i-th signal that Alice sends to Bob), B i denotes the system delivered to Bob (|n x i ,r i i B i standing for a Fock state with n i photons encoding the BB84 polarization state defined by (x i , r i )), and the photon number statistics p n i | a i are defined in Eq. (3). Lastly, |0 E in Eq. (34) stands for the initial state of Eve's ancillary system.
If we denote Eve's coherent interaction with systems B 1 , . . . , B N and E byÛ BEsuch thatÛ BE |Ψ represents the global state prior to Bob's measurements-and refer to Bob's "click" POVM element in round k asM click Note that, in the derivation of Eq. (36), we make use of the fact that projection operators are "self-squared", together with the cyclic property of the trace and straightforward commutation relations. Then, we trace out systems A k and C k explicitly, in order to obtain the necessary input of the CS constraint later on. Further defining |Ψ a,Z,n , Eq. (36) can be restated as and since p (k) (n, a, Z) = Tr P |a,Z, , it follows from Bayes rule that Recalling that p (k) (click|n, a, Z) = p (k) (click|n, a, Z, Z) =: Y (k) n,a , which is the n-photon yield of round k associated to the intensity setting a, it follows from Eq. (33) that , which we do next. For k = 2, . . . , N − 1 (the cases k = 1 and k = N will be discussed separately), we have that with a k = {a j |j = k} and x k = {x j |j = k}. Thus, it follows that for k = 2, . . . , N − 1, where we have made use of the fact that ψ x i ,r i a i (a k =b) |ψ x i ,r i a i (a k =a) B i C i is independent of x i and r i for all i -which is straightforward to show from Eq. (35)-in order to carry out the sums over x k and r N 1 : Also for this reason, in the last equality we have renamed ψ Thus, for the normalized states, we have for k = 2, . . . , N − 1, where we have introduced the obvious definition (44) Lastly, regarding the extreme rounds (which are excluded from Eq. (43)), explicit calculation shows that We remark that, so far, we have not imposed Assumption 3 on the intensity correlations yet (see Sec. 2.1). At this stage, we invoke it by considering a finite correlation range ξ, which allows to rewrite Eq. (43) as for k = 2, . . . , N − 1, and similarly for Eq. (45). Crucially, we have made use of the fact that ψ a i (a k =b) |ψ a i (a k =a) B i C i = 1 for all i > k + ξ , which is a straightforward consequence of Assumption 3. Aiming to evaluate Eq. (40), one must lower-bound the right-hand side of Eq. (46). For this purpose, we are going to exploit the structure of Eq. (3) in order to derive modelindependent bounds valid for all correlations models g a k (α k ). Let us address the bracket in Eq. (46) first. Noticing that e −x x n is strictly decreasing (increasing) for n = 0 (n = 1, 2, . . .) in the interval x ∈ (0, 1), from Eq.
which becomes a global prefactor in Eq. (46), as it does not depend on the remaining summation indexes a max{k−ξ,1} to a k−1 . If we now focus on the first row of Eq. (46), the same monotonicity argument yields  45), it is clear that the resulting bound also applies to the extreme rounds k = 1 and k = N (the bound is simply less tight in these cases).
Lastly, to conclude the proof of Eq. (4), we need to establish the same result for the n-photon error click probabilities too, i.e., we need to show that a,X,r,n = a, X| A k r| A k t n | C k |Ψ . Thus, in virtue of the CS constraint -given in Eq. (33)-we have that and Eq. (50) follows from the fact that | Ψ a,Z,n | for both r = 0 and r = 1 and for any given n, k, a and b, which is easy to show following the same steps that lead to Eq. (43).

Phase error rate and secret key length in the finite key regime
The phase error rate is defined as φ 1,Z,N := E Z,ph 1,µ,N /Z 1,µ,N , where E Z,ph 1,µ,N is the number of phase errors among all Z 1,µ,N single-photon events contributing to the sifted key. In this regard, we recall that a phase error is a bit error in a virtual entanglement-based protocol where, for the sifted key rounds, the parties measure their ancillas in the X basis instead. Let us define the set of rounds (54) The partition of M 1,µ,N into Z 1,µ,N and X 1,µ,N is common to both the actual and the virtual protocol. In the virtual protocol, a specific number of bit errors occurs in M 1,µ,N , given by the number E Z,ph 1,µ,N of errors in Z 1,µ,N plus the number E 1,µ,N of errors in X 1,µ,N . Basis-independence of the single-photon states delivered by Alice in the rounds indexed by M 1,µ,N implies that Eve cannot distinguish test single-photons (i ∈ X 1,µ,N ) from key single-photons (i ∈ Z 1,µ,N ). Moreover, since, in the virtual protocol, Alice and Bob measure their ancillas in the X basis in both types of rounds, for any given round in M 1,µ,N the probability that it yields an error is independent of its round-type in the virtual protocol. Thus, one can imagine that Eve is inducing the bit errors in M 1,µ,N first, and later on Alice and Bob randomly select a partition M 1,µ,N = Z 1,µ,N ∪ X 1,µ,N , such that and one can derive a statistical upper bound on the difference |φ 1,Z,N − E 1,µ,N /X 1,µ,N | via Serfling's inequality [31]. For our purposes, the relevant one-sided bound can be stated as [30] (56) Now, let the following inequalities be given: At this stage, one can present the secret key rate of the decoy-state BB84 protocol under consideration, which relies on a lower bound on Z 1,µ,N (presumed in Eq. (57)) and an upper bound on φ 1,Z,N (Eq. (58)). Precisely, for any δ ∈ (0, 1), it is known that privacy amplification allows to extract an sec -secret, cor -correct secret key of length [15] as long as sec ≥ 2ε + PA + δ, where f EC is the efficiency of the error correction protocol, Z µ,N provides the length of the sifted key (see Sec. 2.3 for the definition of Z µ,N ), h(·) denotes the binary entropy function, E tol is a threshold bit error rate for the error correction, PA is the error probability of the privacy amplification and ε = S + 3 i=1 i is the parameter estimation error, i.e., an upper bound on the total error probability of the parameter estimation. Of course, the secret key rate is defined as K N = l/N . That is to say, where we have introduced the notation Y = Y /N .

Technical claims on the asymptotic regime
The asymptotic secret key rate formula given in Sec. 2.5 builds on the assertion that, as long as the variance of the experimental averages tends to zero as N → ∞, the probability of any finite violation of Eq. (28) vanishes for N → ∞ too. Propositions 1 and 2 below formally demonstrate this claim. Proof. Let us consider the event E δ,N = Z L 1,µ,N − Z 1,µ,N ≥ δ . We have where in the first set bound we used the triangle inequality twice, in the second one we used the fact that Z L 1,µ,N − Z 1,µ,N ≤ 0 for all N -according to the first decoy-state bound in Eq. (26)-and in the third one we used the fact that, if |X| + |Y | ≥ δ, then either |X| ≥ δ/2 or |Y | ≥ δ/2. Now, in virtue of the union bound, we have that Therefore, the claim holds if we show that both terms in the right-hand side tend to zero as N tends to infinity for all δ > 0. Recalling that, in virtue of Chebyshev's inequality [32], mean-square convergence of a sequence of random variables guarantees convergence in probability, for the second term of Eq. (63) we have for all δ > 0. Regarding the first term, note that Z L 1,µ,N is linear in the Z a,N (see Sec. 2.4). That is to say, Z L 1,µ,N = a∈A c a Z a,N + C for certain coefficients c a and C. Thus, and since [32] for some positive constant K, such that for all δ > 0, where again we invoked the fact that mean-square convergence implies convergence in probability.
The proof of Proposition 2 follows identically as that of Proposition 1.
As a final comment, note that, when dealing with bounded sequences of random variables -{X j } is bounded if there exists some constant C such that |X j | < C for all jmean-square convergence is not stronger but exactly equivalent to convergence in probability (see for instance [32]), such that demanding the latter kind of convergence instead does not relax the preconditions of propositions 1 and 2. If, alternatively, neither kind of convergence is demanded, all we know is that Eq. (26) holds for the expectations, which does not suffice to establish the limits of propositions 1 and 2.

Data availability
No datasets were generated or analysed during the current study.

Acknowledgements
We thank Margarida Pereira for very fruitful discussions. This work was supported by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 675662 (project QCALL), by the Galician Regional Government (consolidation of Research Units: AtlantTIC), the Spanish Ministry of Economy and Competitiveness (MINECO), the Fondo Europeo de Desarrollo Regional (FEDER) through Grant No. PID2020-118178RB-C21, and the Spanish Ministry of Science and Innovation through the "Planes Complementarios de I+D+I con las Comunidades Autónomas" in Quantum Communication. V.Z. and A.N. acknowledge support from respective FPU pre-doctoral scholarships from the Spanish Ministry of Education. K.T. acknowledges support from JSPS KAKENHI Grant Numbers JP18H05237 18H05237 and JST-CREST JPMJCR 1671.
7 Author contributions M.C. and K.T. conceived the initial idea and triggered the consideration of this project. V.Z. and A.N. made the theoretical analysis and performed the numerical simulations, with inputs from all authors. All authors analysed the results and prepared the manuscript.

Competing interests
The authors declare no competing interests.

B Trace distance argument
The trace distance (TD) argument is stated as follows.
Theorem [27]. Let ρ and σ be two distinct states of a certain quantum system. Then, the trace distance between ρ and σ satisfies D(ρ, σ) = max{Tr(Ô(ρ − σ))}, where the maximization is taken over all positive operatorsÔ ≤ I. On the one hand, the dashed lines are obtained using the TD argument, showing the asymptotic secret key rate, K ∞ , as a function of the distance, L, for various values of the maximum relative deviation between intensity settings (a k ) and actual intensities (α k ), δ max ∈ {10 −6 , 10 −4 , 10 −2 }. On the other hand, the solid lines represent the corresponding secret key rates obtained with the CS inequality instead. Although these latter lines also appear in Fig. 1 of the main text, for clarity purposes the color and line style criteria are different here. In addition, we include the attainable secret key rate in the absence of intensity correlations for completeness (dotted black line). Regarding the experimental parameters, they are fixed identically as in Fig. 1 of the main text.
Keeping the notation Y (k) n,a = p (k) (click|n, a, Z, Z), and making use of the fact that D(|x x| , |y y|) = (1 − | x|y | 2 ) 1/2 [27], the bound provided by the TD argument reads [16] for all a ∈ A, b ∈ A (b = a), n ∈ N and k = 1, . . . , N . Here, we have used the lower bound presented in the main text, which depends on a presumed finite correlation range ξ. Remarkably, Eq. (72) does not rely on a characterization of the quantum channel, as opposed to the linearized CS constraints. What is more, the TD argument provides equivalent constraints for the n-photon error click probabilities too, as seen next. In the first place,  Fig. 3, we contemplate three different values of the maximum relative deviation between intensity settings (a k ) and actual intensities (α k ), δ max ∈ {10 −6 , 10 −4 , 10 −2 }. Similarly, the solid color lines represent the corresponding secret key rates in the model-independent scenario. Importantly, three correlation ranges are used, namely, ξ ∈ {1, 2, 5}. However, the effect of this parameter on the secret key rate is negligible in the deterministic model for such moderate values, and thus we only plot ξ = 1 in that case. As a reference, we provide the attainable key rate in the absence of intensity correlations too (black dotted line). Regarding the experimental parameters, they are common with those of Fig. 1 in the main text.
To finish with, Fig. 4 illustrates how the asymptotic secret key rate K ∞ is enhanced when one moves from the model-independent scenario of Fig. 1 in the main text to a deterministic model. Remarkably, as seen in the figure, this model is also more robust to the correlation range ξ than the model-independent setting.