distribution with single-photon sources.

Device-independent quantum key distribution protocols allow two honest users to establish a secret key with minimal levels of trust in the provider, as security is proven without any assumption on the inner workings of the devices used for the distribution. Unfortunately, the implementation of these protocols is challenging, as it requires the observation of a large Bell-inequality violation between the two distant users. Here, we introduce novel photonic protocols for device-independent quantum key distribution exploiting single-photon sources and heralding-type architectures. The heralding process is designed so that transmission losses become irrelevant for security. We then show how the use of single-photon sources for entanglement distribution in these architectures, instead of standard entangled-pair generation schemes, provides significant improvements on the attainable key rates and distances over previous proposals. Given the current progress in single-photon sources, our work opens up a promising avenue for device-independent quantum key distribution implementations.


Introduction
The paradigm of device-independent quantum key distribution (DIQKD) offers the strongest form of secure communication, relying only on the validity of quantum mechanics, but not on any detailed description, or trust, of the inner workings of the users' devices [1][2][3]. On the theoretical side, the security of DIQKD has been proven against increasingly powerful eavesdroppers [4,5], culminating in proofs of security against attacks of the most general form [6,7].
The main challenge facing experimental DIQKD is its stringent demands on the observable data, necessary for the security requirements to be met. First, any DIQKD implementation should be based on the observation of data that conclusively violates a Bell inequality [8,9]. In particular, the Bell experiment should close the so-called detection loophole [10]; otherwise, hacking attacks can fake a violation at the level of the detected events when losses are high enough [11]. Moreover, a detection-loophole-free Bell violation is necessary but not sufficient for secure DIQKD, as the necessary detection efficiencies are significantly higher than those required for Bell violation. For instance, while the detection efficiency for observing a violation of the Clauser-Horne-Shimony-Holt (CHSH) [12] inequality can be as low as 2/3 [13], a DIQKD protocol based on CHSH requires an efficiency of the order of 90% [3]. This is, in fact, a general feature of any noise parameter (consider, e.g., the visibility [4]) that affects not only the observed Bell violation, but also the correlations between the users aiming to construct the secret key.
The first Bell experiments closing the detection loophole used massive particles [14][15][16][17]. Leaving aside table-top [14,15] and short-distance [16] experiments, the Bell test of Hensen et al. [17] involved labs separated by a distance of 1.3 km, which also allowed the locality loophole to be closed [9]. Nevertheless, as the employed light-matter interaction processes typically deteriorate the quality of the nonlocal correlations generated between the users, the reported violations would not have been sufficient for secure DIQKD. Furthermore, the rates of key distribution they could provide are seriously limited, because the measurements involved, despite allowing for near-unit efficiency, take significant time [18,19]. While improvements are to be expected in all these issues, and massive particles may be essential for long-distance schemes involving quantum repeaters [20], photon-based schemes appear more suitable to obtain high key rates with current or near-future technology. Photonic losses, however, occurring at all of the generation, transmission, and detection stages, represent the main challenge in these schemes. Recent advances have been made in photo-detection efficiencies, which allowed for the first loophole-free photonic Bell inequality violations over short distances [21][22][23][24]. Still, not only are the reported distances far from any cryptographic use, but the observed Bell violations are again not large enough for secure DIQKD.
In this work, we show that single-photon sources [25] constitute a promising resource for experimental photonic DIQKD. Such sources have already allowed for nearly on-demand [26], highly efficient [27] extraction of single photons (also in pulse trains [28,29] as well as at telecom wavelengths [30]), while maintaining their purity and indistinguishability even above the 99% level [31,32]. We propose novel photonic DIQKD schemes that, thanks to the replacement of the photon-pair creation process (achieved, e.g., by parametric down-conversion [33]) with single-photon sources, allow the key to be distributed at significant rates over large distances. We believe that, in view of the recent advances in the fabrication of single-photon sources, our results point out a promising avenue for DIQKD implementations.
The remainder of the paper is organized as follows. In Sec. 2.2, we describe the technique of evading transmission losses in DIQKD protocols by means of heralding and, furthermore, discuss the crucial implications the heralding method has for designing photon-based architectures. Subsequently, in Sec. 3, we introduce two heralded schemes employing single-photon sources, which allow for fine-tuning of the final shared entangled state, important for achieving optimal efficiencies. We then discuss in Sec. 4 how to quantify the attainable key rates within a heralded scheme, which importantly are then guaranteed to be fully secure. Finally, in Sec. 5 we apply our analysis to the two schemes proposed, in order to study their performance, in particular, the key rates, separation distances, as well as noise levels they allow for in DIQKD. We conclude our work in Sec. 6.

Losses in DIQKD
For non-negligible key rates to be achievable over large distances in DIQKD, solutions must be proposed that pinpoint and disregard, without opening the detection loophole, inconclusive protocol rounds that arise due to photons being inevitably lost. From the perspective of maintaining security (i.e. considering only the question of a non-zero key rate), it is convenient then to divide photonic losses into two categories. Losses that occur within the local surroundings (laboratories) of the users should be differentiated from those that occur during the transmission of photons between the labs. Laboratories represent then regions of space from, and into which, the users control the information flow, i.e. provide the local privacy requisite for any secure communication [34].
As a result, one may design DIQKD protocols that target explicitly the transmission losses and allow for Bell violations over arbitrary distances between the users [35,36]. Other approaches have also been proposed that, while stemming from novel entropic uncertainty relations which account for quantum side information [37,38], require only local Bell violation within one of the labs [39]. This, however, comes at the price of security being guaranteed only up to a finite distance separating the users for a given fixed, even arbitrarily small, level of local losses (also in the absence of detector dark-count events [40]). In this work, our goal is to propose optical schemes where the security can be guaranteed independently of the distance between the users. Secondary to this, for a scheme to be practical, we furthermore want the resulting key rate to scale favourably with the separation, in order to achieve non-negligible key rates over large distances.

Local losses
We parametrise local losses by the effective local efficiency, $\eta_l$, which accounts for all photon-loss mechanisms inside the lab, including imperfect photodetection, any optical path and mode mismatch, finite photon-extraction efficiency of the sources locally employed by a user, etc. To our knowledge, all known DIQKD protocols require a high local efficiency, of the order of 90% [36,39,41-44]. While the existence of practical DIQKD protocols tolerating lower local efficiencies cannot be excluded, we do not expect any significant improvement in this direction. This is a consequence of the following simple argument.
A generic DIQKD protocol is based upon the observation of some ideal correlations described by a set of joint probability distributions $\mathbf{p} = \{P(ab|xy)\}_{abxy}$ shared by the two users, Alice and Bob, who aim to establish the secret key. The input random variable, $x$ ($y$), labels the measurement setting, i.e. the measurement that Alice (Bob) has chosen, while the output, $a$ ($b$), stands for the outcome of her (his) measurement. In the presence of local losses, parametrised by $\eta_l$, there is an additional outcome, labelled by $\phi$, corresponding to the 'no-detection' event. The resulting correlations observed, $\mathbf{p}_{\eta_l}$, where $a$ and $b$ refer only to 'conclusive' events, are
$$
\begin{aligned}
P_{\eta_l}(ab|xy) &= \eta_l^2\, P(ab|xy), \\
P_{\eta_l}(a\phi|xy) &= \eta_l (1-\eta_l)\, P_A(a|x), \\
P_{\eta_l}(\phi b|xy) &= \eta_l (1-\eta_l)\, P_B(b|y), \\
P_{\eta_l}(\phi\phi|xy) &= (1-\eta_l)^2,
\end{aligned}
\tag{1}
$$
where $P_A$ and $P_B$ denote the marginal probabilities detected by Alice and Bob in the ideal case (without loss). For simplicity, we take the local efficiencies equal for Alice and Bob and for all measurement settings, but the results can be easily generalized to non-equal local efficiencies. In any DIQKD protocol, Alice and Bob construct the key from the outputs of $n_k$ pairs of measurement settings (typically $n_k = 1$, and the key is generated from the pair $(x^*, y^*)$ [36,41-44]). In Ref. [45], a successful eavesdropping attack was constructed for a critical value of the efficiency equal to $\eta_c = 1/(n_k+1)$ if $n_k < m$, where $m$ is the total number of measurement settings, and $\eta_c = 1/m$ when $n_k = m$. To implement this attack, Eve needs to be able to control the detection efficiencies on one side, say Alice's. Eve then has perfect knowledge of the outputs of the $n_k$ measurements used by Alice for the key, while reproducing the expected correlations (1) for $\eta_l = \eta_c$.
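The loss model of Eq. (1) can be illustrated with a minimal sketch, assuming independent losses with the same efficiency on both sides. The function name `lossy_correlations`, the label `NO_CLICK` and the example distribution are hypothetical choices made for illustration only.

```python
from itertools import product

NO_CLICK = "phi"  # hypothetical label for the no-detection outcome


def lossy_correlations(p_ideal, eta):
    """Apply equal local efficiency eta on both sides for one fixed (x, y).

    p_ideal maps (a, b) -> ideal probability of the conclusive outcomes.
    Returns the distribution extended by the no-detection outcome.
    """
    outs_a = sorted({a for a, _ in p_ideal})
    outs_b = sorted({b for _, b in p_ideal})
    # Ideal marginals P_A(a|x) and P_B(b|y)
    p_a = {a: sum(p_ideal.get((a, b), 0.0) for b in outs_b) for a in outs_a}
    p_b = {b: sum(p_ideal.get((a, b), 0.0) for a in outs_a) for b in outs_b}

    lossy = {}
    for a, b in product(outs_a, outs_b):
        lossy[(a, b)] = eta**2 * p_ideal.get((a, b), 0.0)  # both detect
    for a in outs_a:
        lossy[(a, NO_CLICK)] = eta * (1 - eta) * p_a[a]  # only Alice detects
    for b in outs_b:
        lossy[(NO_CLICK, b)] = eta * (1 - eta) * p_b[b]  # only Bob detects
    lossy[(NO_CLICK, NO_CLICK)] = (1 - eta) ** 2  # neither detects
    return lossy


# Example: a perfectly correlated bit observed with 90% local efficiency
perfect_bit = {(0, 0): 0.5, (1, 1): 0.5}
p_loss = lossy_correlations(perfect_bit, eta=0.9)
```

The extended distribution stays normalised for any efficiency, since the four weights $\eta_l^2$, $2\eta_l(1-\eta_l)$ and $(1-\eta_l)^2$ sum to one.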
For detection efficiencies above this critical value, $\eta_l > \eta_c$, Eve can use a combined strategy, in which the previous attack is applied on Alice's side with probability $q_A$, while with probability $1-q_A$ Eve does nothing on the measured state and shifts Alice's detection efficiency to one. This attack produces correlations between Alice and Bob of the form (1) when $q_A \eta_c + (1-q_A) = \eta_l$, i.e. when
$$
q_A = \frac{1-\eta_l}{1-\eta_c}.
$$
At present, the asymptotic secret-key rate $R$ of any DIQKD protocol in which the key is established by one-way classical communication reconciliation techniques is determined by the best-known lower bound of Arnon-Friedman et al. [7], which is valid for the most general eavesdropping attacks and reads:
$$
R \ge H(A|E) - H(x^*|y^*). \tag{2}
$$
Here $H(x^*|y^*)$ is the classical conditional Shannon entropy between Alice's and Bob's outputs when choosing any inputs $(x^*, y^*)$ used for the key, and $H(A|E)$ is the conditional von Neumann entropy between Alice's output and the quantum state in the hands of the eavesdropper, Eve. Crucially, the Bell violation observed by Alice and Bob allows them then to estimate (lower-bound) $H(A|E)$ without making any assumptions about Eve [2][3][4][5][6][7]. However, as Eq. (2) applies to any attack, we can explicitly evaluate it for the strategy discussed above. Returning to correlations (1) that incorporate losses, we compute the conditional entropy $H(x^*|y^*)$. For the sake of simplicity, we perform this calculation for the common case of two-output measurements in which the correlations between Alice and Bob define a perfectly correlated bit in the absence of losses, so that $H(x^*|y^*) = 0$ for $\eta_l = 1$. For the above simple attack, we easily see that
$$
H(A|E) = 1 - q_A = \frac{\eta_l-\eta_c}{1-\eta_c}
$$
for $\eta_l \ge \eta_c$, as Eve then has complete knowledge with probability $q_A$ and complete uncertainty with probability $1-q_A$. In contrast, for $\eta_l \le \eta_c$ the attack of Eve works all of the time, so that $H(A|E) = 0$. Within the inset of Fig. 1 we plot explicitly both these conditional entropies for $n_k = 1$ and $m = 2$.
In Fig. 1, we depict the critical values of the local efficiency, $\eta_l^*$, at which the key rate computed through (2) becomes zero as a function of the number of bases, $n_k$, used to construct the key. Note that based on such an attack, the tolerable local efficiency is forced to be at least 85.7% for any DIQKD scheme with $n_k = 1$, e.g. the ones of Refs. [36,41-44] employing one-way communication. Moreover, the above simple attack, whose critical local efficiencies apply to any DIQKD protocol and any Bell inequality that uses the security proof of Ref. [7], demonstrates that even in the unrealistic case of users employing an infinite number of bases, $n_k \to \infty$ (see Fig. 1), the local efficiencies must necessarily exceed 82.2% for a positive key rate to be possible.

Figure 1: Lower bound on critical local efficiency, $\eta_l^*$, for DIQKD as a function of the number of measurement settings, $n_k$, that are used to generate the key. For each $n_k$ and any $\eta_l < \eta_l^*$ below the corresponding value (blue dot), there exists a simple attack based on the eavesdropping strategy introduced in Ref. [45] that prevents any protocol based on two-party correlations (1) from being secure. The most and the least optimistic critical efficiencies for $n_k \to \infty$ and $n_k = 1$, respectively, are also marked in blue. For $n_k = 1$, the conditional entropies whose difference determines the key rate (2) are explicitly shown in the inset.
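The critical efficiencies quoted above can be reproduced numerically with a short sketch. It rests on two assumptions that are not spelled out as formulas in the text: that the attack leaves Eve with $H(A|E) = (\eta_l-\eta_c)/(1-\eta_c)$ for $\eta_l \ge \eta_c$, and that a perfectly correlated bit degraded by independent local losses has $H(x^*|y^*) = h(\eta_l) + \eta_l(1-\eta_l)$, with $h$ the binary entropy. All function names are hypothetical.

```python
from math import log2


def binary_entropy(x):
    """Binary Shannon entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def attack_threshold(n_k, m):
    """Critical efficiency eta_c of the attack of Ref. [45], as quoted in the text."""
    return 1.0 / (n_k + 1) if n_k < m else 1.0 / m


def rate_under_attack(eta, eta_c):
    """Right-hand side of Eq. (2) evaluated for the combined attack (assumed forms)."""
    h_ae = max(0.0, (eta - eta_c) / (1 - eta_c))      # Eve's uncertainty
    h_ab = binary_entropy(eta) + eta * (1 - eta)      # Alice-Bob errors from losses
    return h_ae - h_ab


def critical_efficiency(eta_c, tol=1e-10):
    """Bisection for the efficiency at which the rate crosses zero.

    The rate is negative at eta = 0.5 and equals one at eta = 1
    for every eta_c <= 1/2, so the root lies in between.
    """
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate_under_attack(mid, eta_c) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


eta_star_one_basis = critical_efficiency(attack_threshold(1, 2))  # n_k = 1, m = 2
eta_star_many_bases = critical_efficiency(0.0)                    # limit n_k -> infinity
```

Under these assumptions the two values come out close to the 85.7% and 82.2% quoted in the text.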
In view of these results, we expect that high efficiencies will inevitably be needed in DIQKD protocols, and the only solution we foresee is to develop even more efficient photon sources [46][47][48], better detectors [49][50][51][52] and improve all the couplings within optical implementations to sufficiently decrease losses within the users' laboratories. Nevertheless, we expect Bell experiments with local efficiencies of the order of 90% to be within reach in the near future. In this work, we operate under this assumption, which is currently essential for any existing DIQKD implementation.

Transmission losses
The second type of losses occurs while photons propagate outside the labs and is quantified by the transmission efficiency, $\eta_t$, of the channel connecting the users. In principle, these losses constitute the main hurdle for long-distance DIQKD, as $\eta_t$ decreases rapidly with distance, e.g. exponentially when transmitting signals over optical fibres. Moreover, even if fibre technology progresses, the exponential increase of losses with distance will remain, due to unavoidable light absorption and scattering. However, contrary to local losses, transmission losses can be completely overcome by adopting a carefully constructed protocol. A viable route to do so is to record an additional outcome, denoted by ✓, indicating in a heralded way that the photons did not get lost [35,36,53]. Then, ✓ assures that the required quantum state was successfully transmitted between Alice and Bob and the Bell test can be performed. If the heralding outcome is causally disconnected from the choices of measurement settings $x$, $y$ by Alice and Bob during each round of the protocol (see Fig. 2), transmission losses become irrelevant with respect to the security of the protocol, affecting only the key rate. In fact, the heralding signal, when causally disconnected from the choice of measurements, can simply be interpreted as a probabilistic preparation of the required state which does not affect a Bell test, nor any protocols based on it.
The heralding process can in principle be implemented with the help of a quantum non-demolition (QND) measurement allowing the number of photons to be measured without disturbing the quantum state [54]. QND photon measurements are, however, challenging, requiring e.g. unrealistic optical non-linearities. The solution is to replace them with optical linear circuits that achieve the same goal in a probabilistic fashion [55,56]. The heralding signal is then provided by a particular detection pattern in the linear optics circuit indicating, as for the QND measurement, that the outputs produced by the Bell test are valid.
Within the side-heralding (SH) scenario depicted in Fig. 2(a), the circuit is operated by one of the users, who records the rounds in which the positive heralding pattern, ✓, has occurred, so that only these are later used for key extraction. In contrast, in the central-heralding (CH) scenario the heralding is performed outside of the users' labs, at a central station (resembling the entanglement swapping configuration [57]), by a third party who then publicly announces which rounds should be considered successful, as illustrated in Fig. 2(b). In either case, the heralding scheme should be causally disconnected from the measurements in the Bell test. This condition is more natural in the CH scheme, within which it is assured by the lack of information leakage from the secure user labs. On the contrary, in the SH configuration it becomes the responsibility of the user holding the heralding device within their lab, who must then, e.g., ensure that the heralding signal occurs before a random choice of measurement is made [footnote 1]. In any case, the heralding signal should work as the ideal QND measurement and assure that, up to the leading order, transmission losses have no effect on the heralded Bell violation.
Footnote 1: Ideally, each user possesses an independent source of private randomness [58].

Figure 2: Efficient heralding schemes for DIQKD. Alice and Bob are located at isolated labs (shaded regions) from which they control information leaks. They locally use sources S to distribute a quantum state between their labs and perform on it randomly sampled measurements labelled $x$ and $y$, producing outcomes $a$ and $b$. The measurement devices are treated as black boxes that yield a joint probability distribution $\mathbf{p} = \{P(ab|xy)\}_{abxy}$ compatible with the laws of quantum physics. A heralding scheme is implemented, such that, given its positive outcome ✓, the resulting $P(ab|xy\,✓)$ shared between Alice and Bob becomes effectively independent of the finite transmission efficiency. In the side-heralding (SH) scenario, (a), this is achieved by one of the users performing a (probabilistic) quantum non-demolition measurement (QND) within their isolated lab that verifies the arrival of the distributed state, without disturbing it. In the central-heralding (CH) scheme, (b), the heralding is performed by a third party that later publicly announces the successful rounds that should be used during the protocol.

The importance of this requirement is best understood by considering existing proposals for photonic DIQKD that do not satisfy it, such as the schemes using a noiseless qubit amplifier [36] or entanglement swapping relays [41][42][43]. In all these schemes, entanglement between users is distributed using spontaneous parametric down-conversion (SPDC) sources [33], a probabilistic process in which multi-photon pair creation also takes place. For the sake of argument, let us consider the state produced by the SPDC to read $|0\rangle\langle 0| + \bar{p}\,|\psi_{AB}\rangle\langle\psi_{AB}|$ after, without loss of generality, ignoring its normalisation and the higher-order terms in $\bar{p}$, i.e. the spurious contributions arising when more than one photon pair is created within the process (see App. A). For all the schemes, the state shared by the users after a successful heralding takes a general form (up to irrelevant normalisation):
$$
\rho_{AB} \propto |0\rangle\langle 0| + \lambda\,\eta_t\,\bar{p}\,|\psi_{AB}\rangle\langle\psi_{AB}|, \tag{3}
$$
in which the detrimental terms of order $\bar{p}$ that yield deviations from the target $|\psi_{AB}\rangle$ may also be omitted. The parameter $\lambda > 0$ above is determined by the particular heralding scheme [36,41-43], while $\eta_t$ is the transmission efficiency dependent on the distance between the users.
The key point is to notice that the contribution of the maximally entangled state, $|\psi_{AB}\rangle$, in Eq. (3) occurs at a higher order in $\bar{p}$ than the vacuum contribution and is influenced differently by the presence of transmission losses. Thus, for any fixed $\lambda$, the Bell violation strongly depends on $\eta_t$. That is, contrary to when performing an ideal QND measurement, transmission losses not only affect the key rate but also the protocol security. In optical fibres, $\eta_t$ vanishes exponentially with the separation distance $L$ ($\eta_t = e^{-L/L_{att}}$, with typical values of the attenuation length $L_{att} \approx 20$ km), and so the heralded state (3) approaches the vacuum exponentially with $L$, while rapidly ceasing to produce large enough Bell violations for DIQKD to be possible [44]. In particular, this implies that in all such protocols there is always a critical distance at which the protocol ceases to be secure.
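The exponential decay described above can be made concrete with a small sketch. It assumes, as an illustration only, that the unnormalised heralded state consists of a vacuum term of weight one and an entangled term of weight $\lambda \eta_t \bar{p}$, so that the normalised weight of the target state is $\lambda\eta_t\bar{p}/(1+\lambda\eta_t\bar{p})$. The function names and default values are hypothetical.

```python
from math import exp


def transmission_efficiency(length_km, l_att_km=20.0):
    """Fibre transmission eta_t = exp(-L / L_att)."""
    return exp(-length_km / l_att_km)


def entangled_weight(length_km, lam, p_bar, l_att_km=20.0):
    """Normalised weight of the target state in the heralded state.

    Assumes an unnormalised state: vacuum + lam * eta_t * p_bar * target.
    """
    x = lam * transmission_efficiency(length_km, l_att_km) * p_bar
    return x / (1.0 + x)


# Example: lam = 100 and p_bar = 0.01 at zero distance and at one attenuation length
w_zero = entangled_weight(0.0, lam=100.0, p_bar=0.01)
w_one_att = entangled_weight(20.0, lam=100.0, p_bar=0.01)
```

With these example numbers the target state carries half of the weight at zero distance and roughly 27% after one attenuation length, and the weight keeps falling exponentially beyond that.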
For the sake of clarity, we provide some simple estimates that make this point more explicit for the scheme based on the qubit amplifier of Ref. [36]. As discussed also in App. B, if one approximates for simplicity $1 - \bar{p} \approx 1$, the state after heralding can be put in the form of Eq. (3) with $\lambda = T/(1-T)$, where $T$ is the transmittance of the beam-splitter used in the qubit amplifier for the heralding process [36]. Even if quite optimistic, we can take a value of $T = 1 - 10^{-2}$, which gives $\lambda \approx 10^{2}$. This severely affects the key rate, which is a function of $1 - T$, but here we focus on the protocol security.
For any protocol based on the violation of the CHSH inequality $S_{loc} \le 2$ (e.g. the one of Ref. [36]), the heralded state (3) must lead to a CHSH value $S$ obeying the inequality (4). Following Ref. [2], the inequality (4) allows one to lower-bound the term $H(A|E) \ge 1 - h\!\left(\frac{1+\sqrt{(S/2)^2-1}}{2}\right)$, where $h$ is the binary entropy. On the other hand, for key-generation rounds, the conditional entropy $H(x^*|y^*)$ is fixed by the effective probability of sharing the target state $|\psi_{AB}\rangle$, given the state (3). Putting all these terms together, using the expression for the losses as an exponential function of the distance, and taking an optimistic value of $\bar{p} = 10^{-2}$ for SPDC [23,24], the key rate (2) vanishes already for distances of approximately one attenuation length, $L_{att} \approx 20$ km. This critical limit on user separation can be improved by taking values of $T$ even closer to one or larger values of $\bar{p}$, but the problem still remains: for any given values, the weight of the entangled part in the state obtained after successful heralding always decreases exponentially with distance.
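The CHSH-based bound on Eve's uncertainty can be evaluated with a few lines of code. The sketch assumes that the bound referred to is the standard one for the CHSH inequality, $H(A|E) \ge 1 - h\big((1+\sqrt{(S/2)^2-1})/2\big)$; the function names are hypothetical.

```python
from math import log2, sqrt


def binary_entropy(x):
    """Binary Shannon entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def chsh_entropy_bound(s):
    """Lower bound on H(A|E) from an observed CHSH value s (assumed standard form)."""
    if s <= 2.0:
        return 0.0  # no violation: nothing can be certified
    s = min(s, 2.0 * sqrt(2.0))  # clip at the maximal quantum value
    x = (1.0 + sqrt((s / 2.0) ** 2 - 1.0)) / 2.0
    return 1.0 - binary_entropy(x)


bound_at_max = chsh_entropy_bound(2.0 * sqrt(2.0))  # maximal violation
bound_at_2p5 = chsh_entropy_bound(2.5)              # intermediate violation
```

The bound grows from zero at the local limit $S = 2$ to one bit at the maximal quantum violation, so any admixture of vacuum that pulls $S$ towards 2 directly reduces the certified entropy.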
The key rates reported in Ref. [36] are much higher than those obtained in the above. This arises due to the fact that the authors of Ref. [36] make additional assumptions on the attacks available to the eavesdropper (see for example the Supp. Mat. of Ref. [36] and also the discussion in Ref. [44]). Using these assumptions, they derive a different bound on the key rate as a function of an observed CHSH Bell violation and the rates at which one or both parties observe inconclusive events. The same bound was later used in the protocols of Refs. [41][42][43]. Unfortunately, it is unclear whether these assumptions, and corresponding rates, do not imply a loss of generality. In fact, for a slightly different situation in which losses only occur for one of the observers, these assumptions and corresponding bounds can be explicitly proven not to hold: for some value of the losses they predict a strictly positive secret-key rate, while it is possible to derive an explicit eavesdropping attack that breaks the protocol. The details of this attack are shown in App. C. This analysis implies that the assumptions used in Ref. [36], and later in Refs. [41][42][43], do not hold in full generality and, therefore, it is unclear to what extent the secret-key rates reported in these works are valid.
In what follows, we propose two DIQKD architectures based on single-photon sources [25] that crucially do not suffer from the above problems. They are designed such that, up to the leading order, a pure entangled state is shared between the users upon successful heralding, independent of their separation (or transmission losses). Our protocols thus behave as the ideal QND measurement and allow high key rates to be maintained over large communication distances. One of the schemes relies solely on single-photon sources and a CH-based implementation. Since single-photon sources are still an expensive resource compared to widely used SPDC sources, we furthermore consider an SH-based scheme in which both source types are used in conjunction. In order to maintain generality and a degree of comparison with the SPDC framework [33], each single-photon source is modelled to produce a quantum state that, when ignoring normalisation (see also App. A), reads $\sigma_{SP} = \sum_{n=1}^{\infty} p^{\,n-1}\, |n\rangle\langle n|$ in the photon-number basis, containing an infinite tail of high-order contributions whose probability is parametrised by $p$.
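As a small illustration of this source model, the sketch below normalises the weights $p^{n-1}$ into a photon-number distribution. The normalisation factor $(1-p)$ follows from the geometric series; the function name and the truncation `n_max` are hypothetical choices.

```python
def photon_number_distribution(p, n_max=10):
    """Normalised probabilities of emitting n = 1..n_max photons.

    The unnormalised weights are p**(n-1); the geometric series sums to
    1/(1-p), so each weight is multiplied by (1-p).
    """
    return {n: (1 - p) * p ** (n - 1) for n in range(1, n_max + 1)}


dist = photon_number_distribution(1e-4)
single_photon_probability = dist[1]
multi_photon_probability = 1.0 - dist[1]  # equals p after normalisation
```

After normalisation the probability of emitting more than one photon is exactly $p$, which is why $p$ can be read directly as the multi-photon error of the source.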

DIQKD schemes with single-photon sources
The SH scheme requires Bob to produce two single photons with orthogonal polarizations $H$ and $V$, while Alice has access to entangled photon pairs produced by an SPDC source. It is inspired by the qubit-amplifier implementation of Pitkanen et al. [44], as shown in Fig. 3(a). Bob's photons enter a beam-splitter (BS) of transmittance $T$. Then, the reflected light component passes through a half-waveplate (HWP) before being detected in conjunction with Alice's transmitted photons via a partial Bell-state measurement (BSM) depicted by the dashed region. The outcome of the BSM, $c$, signifies whether the required heralding pattern, $c = ✓$, has occurred, corresponding to two detector clicks that represent simultaneous detection of orthogonal polarizations. Provided that the BS transmittance is kept close to one ($T \approx 1$), ✓ occurs only when exactly one photon is transmitted by the BS while the other photon is reflected, and the single photon-pair term of the state produced in the SPDC by Alice reaches the BSM. In this manner, the photons distributed to Alice and Bob are prepared with orthogonal polarizations, although the information about their concrete polarization is erased by the partial BSM. The resulting state shared by Alice and Bob conditioned on ✓ corresponds to a partially (polarization-) entangled two-qubit state with asymmetry dictated by the BSM transmittance parameter $t$, given in unnormalised form by Eq. (5), where $|\psi^t_{AB}\rangle$ denotes the target state and $\bar{p}$ parametrises the probability to produce multiple pairs in the SPDC process of Alice (see App. D).

The CH scheme shown in Fig. 3(b) requires both Alice and Bob to produce two single photons with orthogonal polarizations $H$ and $V$, inspired by the entanglement distribution scheme of Lasota et al. [59]. The photons produced on each side impinge on separate BSs of low transmittance ($T \approx 0$) and, thus, reach the central station with low probability.
The heralding is again provided by a partial BSM, performed now by a third party, after passing both incoming beams through separate HWPs. The signal ✓ is observed only when each party transmits exactly one single photon, and in such a case the reflected photons kept by Alice and Bob are again in a partially (polarization-) entangled state with asymmetry determined by the transmittance $t$ of the partial BSM performed at the central station (see App. D), as given by Eq. (6). Unlike previous proposals, see Eq. (3), in the above two schemes the vacuum terms do not emerge after heralding. Moreover, the unnormalised states (5) and (6), to first significant order, are pure and proportional to the transmission efficiency $\eta_t$. This guarantees that, after normalisation, the states are independent of $\eta_t$ (to first order). This, and the use of single-photon sources instead of SPDC, are the crucial ingredients that allow us to achieve significantly higher secret key rates at larger distances than previous proposals.
The second advantage of our schemes is that, by adjusting the transmittance $t$ of the partial BSM, the entanglement of the target state $|\psi^t_{AB}\rangle$ can be continuously tuned between the maximally entangled ($t = 0$) and product ($t = 1$) extremes [13]. This can then be used, in particular, to lower the local efficiencies, $\eta_l$, required to meet the security requirements.

Table 1: Performance of DIQKD schemes. Critical local efficiencies, $\eta_l^*$, only above which the secret key can be distributed in a fully device-independent fashion, compared with ones above which the shared correlations exhibit nonlocality. For perfect local efficiencies ($\eta_l = 1$), robustness to mixing the joint probability distribution with a maximally uncorrelated one is listed, as well as the bit fraction of the secret key generated per successfully heralded round, equal to one in the ideal case. The probability of producing a single photon or an SPDC photon pair is assumed as $p = \bar{p} = 10^{-4}$ for each source.
We note that it is possible to reduce the number of single-photon sources in our schemes by using a single source emitting a temporal stream of photons. This would require, however, the stream to be demultiplexed either by active optics (which would add extra noise) or probabilistically by passive elements (which would decrease the final heralding rate).

Computing key rates in heralded schemes
In standard DIQKD protocols, Alice and Bob measure their particles. A subset of these measurements is publicly announced so that the users can count how many times different outcomes (a, b) are obtained for the different combinations of inputs (x, y). From this information, they compute the amount of achievable secret key and, if positive, distil it by means of classical post-processing [60] from the remainder of data being shared, specifically, from particular predesigned measurement settings (x * , y * ) [2][3][4]. As mentioned, in the asymptotic limit of infinitely many rounds, the attainable key rate is given by Eq. (2), which constitutes the best known lower bound, that, crucially, is valid for the most general eavesdropping attacks [7].
Calculating $H(A|E)$ in Eq. (2) exactly, for a given observed correlation $\mathbf{p}$ or Bell inequality violation, while optimising over attacks of Eve, turns out to be extremely hard. The problem has only been solved for the CHSH inequality [2], the simplest of all Bell inequalities. Here, we use a lower bound on $H(A|E)$, which in turn provides a lower bound on the key rate, computable for any type of correlations. It is obtained by replacing the von Neumann entropy in Eq. (2) by the min-entropy [4]. This quantity is then directly connected to the guessing probability, $G_{\mathbf{p}}(x^*)$, for Eve to correctly guess Alice's output when she performs the measurement $x^*$. It can be computed for any Bell correlations exhibited by $\mathbf{p}$ by means of semi-definite programming, as explained in App. E. The resulting bound on the key rate (2), which has already appeared in previous security proofs [4,5], reads
$$
R \ge R_\downarrow = -\log_2 G_{\mathbf{p}}(x^*) - H(x^*|y^*). \tag{7}
$$
In an ideal scenario with two outcomes, there are no errors between Alice and Bob, $H(x^*|y^*) = 0$, and Eve has no information about Alice's outputs, $G_{\mathbf{p}}(x^*) = 1/2$, so that $R = R_\downarrow = 1$. Because of its ease of computation, $R_\downarrow$ is the quantity used here to estimate attainable key rates of the implementations proposed.
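Once the guessing probability and the conditional entropy are known, the min-entropy bound is a one-line computation. The sketch below assumes the bound takes the form $R_\downarrow = -\log_2 G_{\mathbf{p}}(x^*) - H(x^*|y^*)$, which is consistent with the ideal case described in the text; it does not perform the semi-definite programme that yields the guessing probability, and the function name is hypothetical.

```python
from math import log2


def key_rate_lower_bound(guessing_probability, h_cond):
    """Min-entropy lower bound on the key rate, in bits per round.

    guessing_probability: Eve's probability of guessing Alice's output
                          (assumed to be supplied by an external SDP).
    h_cond: conditional Shannon entropy H(x*|y*) between the users.
    """
    if not 0.0 < guessing_probability <= 1.0:
        raise ValueError("guessing probability must lie in (0, 1]")
    return -log2(guessing_probability) - h_cond


ideal_rate = key_rate_lower_bound(0.5, 0.0)     # Eve guesses at random
no_key_rate = key_rate_lower_bound(1.0, 0.0)    # Eve knows the output
```

A negative value of the bound simply means that no key can be certified with this method for the given data.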
When considering protocols that incorporate a heralding stage depending on an outcome $c$, with rounds occurring at a repetition rate $\nu_{rep}$, we quantify the effective key rate of secret bits certified per time-unit as:
$$
K = \nu_{rep}\, P(c = ✓)\, R_\downarrow, \tag{8}
$$
where $P(c = ✓)$ is the probability of successful heralding in each round.
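The effective rate can be sketched as a simple product, assuming it is the repetition rate times the heralding probability times the per-round bound $R_\downarrow$. The function name and the example numbers are hypothetical and are not taken from the schemes analysed here.

```python
def effective_key_rate(nu_rep_hz, p_herald, r_bits_per_round):
    """Secret bits per second: repetition rate x heralding probability x bits per round.

    A negative per-round bound certifies no key, so it is clipped to zero.
    """
    if not 0.0 <= p_herald <= 1.0:
        raise ValueError("heralding probability must lie in [0, 1]")
    return nu_rep_hz * p_herald * max(0.0, r_bits_per_round)


# Example with made-up numbers: 100 MHz source, one herald per 10^8 rounds
example_rate = effective_key_rate(1e8, 1e-8, 1.0)
```

The product form makes explicit that transmission losses enter only through the heralding probability, while security is decided by the per-round bound alone.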

Performance of the SH and CH schemes
As a result, we may quantify the optimal DIQKD performance of the CH and SH schemes depicted in Fig. 3, by conducting an unconstrained nonlinear maximisation of $K$ in Eq. (8) over all adjustable parameters. In particular, for each of the schemes, we optimise over the source parameters $p$ and $\bar{p}$, transmittance values $T$ and $t$, as well as polarization angles specifying the user measurements. Still, we ensure $p, \bar{p} \approx 0$ and $T \approx 1$ (or $T \approx 0$) in case of the SH (or CH) scheme, so that the distributed quantum states can be truncated at a finite order in $p$, $\bar{p}$ and $1 - T$ (or $T$). Nonetheless, in order to maintain security we bypass such a truncation by giving full control to the eavesdropper over the higher-order terms that are neglected. Moreover, the critical noise parameters can then also be determined by similar optimisation procedures, conducted while increasing the noise until the key rate (8) cannot be made strictly positive. Explicit details about these optimization steps are given in Apps. E, F and G.
We summarise the performance of the SH and CH schemes in Table 1. Although the SH scheme is simpler, requiring only two single-photon sources, its performance is worse than the CH scheme for all figures of merit considered. From here onwards, we thus focus on the CH scheme, which defines the ultimate experimental requirements for DIQKD to be possible within our approach. In what follows, we show that this scheme offers reasonable levels of robustness against all relevant noise parameters.
The resistance to noise is estimated using a simple noise model, in which the ideal correlations are mixed with white-noise correlations with weights $1 - v$ and $v$, respectively, and perfect local efficiency ($\eta_l = 1$) is assumed. The CH scheme yields nonlocal correlations up to a mixing level of $v = 35.7\%$. Concerning the imperfection of the single-photon sources, for a realistic multi-photon generation parameter of $p = 10^{-4}$ (cf. [32]) the CH scheme generates up to 0.95 secret bits per successfully heralded round, close to the ultimate limit of 1 secret bit that applies in a perfectly noiseless scenario [61]. The critical local efficiencies, $\eta_l^*$, required for nonlocality to be observed are very close to the ultimate bound of Eberhard [13], $\eta_l \geq 66.(6)\%$, which can be approached thanks to the ability to prepare pure partially entangled two-qubit states within both the SH and CH schemes.
Most importantly, our work predicts that, when the CH scheme is employed for DIQKD, positive key rates can be generated independently of the separation between Alice and Bob, as long as the effective local efficiency, $\eta_l$, of each user's lab is higher than 94.3%. Assuming $\eta_l$ to be the product of the efficiencies of photon extraction from each single-photon source employed ($\eta_{ls}$), transmission between the sources and detectors involved ($\eta_{lt}$), and detection ($\eta_{ld}$), fully secure DIQKD is possible as long as $\eta_l = \eta_{ls}\,\eta_{lt}\,\eta_{ld} \geq 0.943$ can be attained by each user.
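To make the efficiency budget concrete, the following sketch multiplies the three contributions and compares the product with the 94.3% threshold quoted above. The component values are hypothetical placeholders chosen for illustration, not measured figures.

```python
ETA_L_THRESHOLD = 0.943  # critical local efficiency quoted in the text (CH scheme)

def local_efficiency(eta_source, eta_transmission, eta_detection):
    """Overall local efficiency: product of source, in-lab transmission and detection."""
    return eta_source * eta_transmission * eta_detection

def meets_threshold(eta_source, eta_transmission, eta_detection):
    """True if the product reaches the quoted threshold."""
    return local_efficiency(eta_source, eta_transmission, eta_detection) >= ETA_L_THRESHOLD

# Hypothetical component efficiencies (illustration only).
ok_budget = meets_threshold(0.98, 0.99, 0.98)   # product is about 0.951
bad_budget = meets_threshold(0.97, 0.97, 0.97)  # product is about 0.913

# If all three components were equally efficient, each would need to reach:
per_component_minimum = ETA_L_THRESHOLD ** (1.0 / 3.0)
```

In the equal-split case each component would have to exceed roughly 98%, which shows how little room the budget leaves for any single element.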
Taking, for instance, $\eta_l = 95\%$ (see also Fig. 4), the secret key can be securely distributed over large distances while completely avoiding the impact of transmission losses. In particular, Fig. 4 assumes for the CH scheme a realistic transmission $\eta_t = e^{-L/L_{\mathrm{att}}}$ with $L_{\mathrm{att}} = 22$ km [36], lab beam-splitters with transmittance $T \approx 10^{-3}$, and sources that each produce photons at a 100 MHz rate with $p = 10^{-4}$ [32]; under these assumptions, a key rate of 1 bit/s can be attained over approximately 50 km. In Fig. 4 we also consider the SH scheme, for which we assume the SPDC source to produce entangled photons at a 100 MHz rate as well, with $\bar{p} \approx 10^{-4}$ (both theoretically within current technological reach [23,24,62]), and set $T \approx 1 - 10^{-3}$ to make the comparison fair.
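As a quick numerical illustration of the transmission model used above, the sketch below evaluates $\eta_t = e^{-L/L_{\mathrm{att}}}$ for $L_{\mathrm{att}} = 22$ km and converts the attenuation length into the more familiar dB/km figure. The function and variable names are illustrative.

```python
import math

L_ATT_KM = 22.0  # attenuation length quoted in the text

def transmission(distance_km, l_att_km=L_ATT_KM):
    """Channel transmission eta_t = exp(-L / L_att)."""
    return math.exp(-distance_km / l_att_km)

def loss_db_per_km(l_att_km=L_ATT_KM):
    """Equivalent fibre loss in dB/km: 10 * log10(e) / L_att."""
    return 10.0 * math.log10(math.e) / l_att_km

eta_50km = transmission(50.0)   # roughly 10% of photons survive 50 km
fibre_loss = loss_db_per_km()   # just under 0.2 dB/km
```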
However, let us note that the corresponding key-rate values are primarily dictated by the factor $K \propto \nu_{\mathrm{rep}}\, P(c=\checkmark)$ appearing in Eq. (8), with the successful-heralding probability effectively equal to $T^2$ and $\bar{p}(1 - T)$ at $L = 0$ for the CH and the SH scheme, respectively. In particular, as for the single-photon sources we take $p > 0$ to account solely for multi-photon events, its impact on the key rates is negligible. Although in our analysis we were motivated to use the smallest number of single-photon sources, under such an assumption the limiting dependence of the key rates on how well $T \approx 0$ (or $T \approx 1$) is satisfied could, in principle, be avoided by creating more photons and performing extra (local) preheralding within each lab [55,56], which, importantly, must ensure that a polarisation-entangled photon pair is distributed. When employing multiple SPDC sources such an approach may seem to be even less efficient [63][64][65]; however, the rapid development of solid-state emitters capable of producing on-demand entangled photons could provide a breakthrough [66][67][68].

Figure 4: DIQKD key rates attained with 95% (blue) and 96% (red) local efficiencies. In each case, the solid (dashed) curve represents the key rate in bits per second attained by the CH (SH) scheme. Each key rate is optimised over all adjustable physical parameters, yet in the case of the single-photon-source impurity parameter, $p$, its lowest possible value is always favoured. Here, we fix $p = 10^{-4}$ and consider the repetition rate of photon extraction for each source to be 100 MHz [32].

Conclusions and outlook
Two proposals for photonic implementations of DIQKD schemes have been given here. They make use of side-or central-heralding, and utilise two or four single-photon sources, respectively. They are capable of maintaining security despite arbitrary transmission losses, and distribute keys over large distances given sufficiently high local efficiencies.
The analysis proves the proposed photonic architectures to be almost optimal from the implementation perspective, as they make it possible to compensate nearly perfectly for the impact of finite transmission, so that the devices can operate independently of the distance separating the users. In contrast, the requirements on local efficiencies for the protocols remain stringent, which is a general demand within the context of DIQKD and a feature of current state-of-the-art security proofs [7]. Hence, an important question that remains open is whether these demanding requirements can be relaxed by developing more elaborate proofs that, in particular, allow the users to perform two-way communication during the protocol rather than only the one-way error correction that is typically assumed [2][3][4][5][6][7]. Unfortunately, recent progress in this direction has indicated that not much room for improvement may be available in this respect without jeopardising full security [69,70].
Another important future direction is to improve the key-rate analysis presented here, while accounting in more detail for the limitations of particular photonic components being employed [71]. On the one hand, an explicit study of the impact of detector dark counts would be valuable, even though our noise-robustness analysis suggests these not to play a major role (see the values presented in Table 1). On the other hand, the protocol repetition-rates have been assumed to be primarily dictated by the capabilities of the photon sources employed [32,62], while ignoring, e.g. the finite dead-time of the binary detectors.
Nonetheless, while the requirements on local efficiencies for the proposed protocols are currently challenging, based on the rapid technological improvement and anticipated capabilities of single-photon sources [46][47][48] and detectors [49][50][51][52], we hope that the demands on fully secure DIQKD implementations presented already here will be fulfilled in the future.
Note Added. After this work was made publicly available online at arXiv:1803.07089 [quant-ph], a similar study of the CH and SH schemes for DIQKD has been released [72], which, by focusing on the Bell violation of the CHSH inequality, arrives at slightly higher requirements for local efficiencies but goes beyond the asymptotic key-rate analysis; see also a very recent work [73] that develops a finite-key analysis for DIQKD. Moreover, the model of the CH scheme has been developed and explicitly verified against an experimental implementation within the scenario in which Alice and Bob each possess an SPDC source of entangled photons (rather than two single-photon sources) [74]. Although the theoretical predictions have been demonstrated to accurately reproduce the observed correlations, these currently do not exhibit Bell violations strong enough for DIQKD, as the corresponding local efficiencies do not yet reach the stringent regime of $\eta_l \gtrsim 95\%$ indicated by our work.

Acknowledgments
We thank Rotem Arnon-Friedman, Mikołaj Lasota, Stefano Pironio and Nicolas Sangouard for helpful discussions. This work was supported by the ERC CoG QITBOX and AdG CERQUTE, Spanish MINECO (Severo Ochoa SEV-2015-0522), Fundacio Cellex and Mir-Puig, the AXA Chair in Quantum Information Science, the Generalitat de Catalunya (SGR1381 and CERCA Program), the Royal Society (URF UHQT), the EU Quantum Flagship projects QRANGE and CiviQ, as well as by the Foundation for Polish Science under the "Quantum Optical Technologies" project carried out within the International Research Agendas programme co-financed by the European Union under the European Regional Development Fund.

A States produced by the SPDC and single-photon sources
The process of spontaneous parametric down-conversion [33] (SPDC) producing two-mode polarisation-entangled photons is described by the Hamiltonian $\hat{H}$, in which $a_H^\dagger$, $a_V^\dagger$, $b_H^\dagger$ and $b_V^\dagger$ are the bosonic creation operators of the two spatial modes $a$ and $b$, with $H$ and $V$ denoting their orthogonal polarisations. Rewriting $\hat{H}$ with the help of the su(1,1) algebra generators, i.e. ones that obey $[L_-, L_+] = 2L_0$ and $[L_0, L_\pm] = \pm L_\pm$, it is straightforward to verify that the state produced via the SPDC reads [75]: where $|0\rangle$ denotes the vacuum state of all modes, while $\tau = \kappa t > 0$ can be assumed to be real.
Moreover, as throughout this work we consider photonic schemes based on (binary, on/off) photodetection, the state $|\Psi_{\mathrm{SPDC}}\rangle$ should be interpreted as an incoherent mixture of different photon-number states, owing to the lack of a global phase reference. Hence, defining $q = \tanh^2\tau$ as the effective parameter of the SPDC process, one arrives at the expression: (11) where $|\Psi_n\rangle = \frac{1}{n!\sqrt{n+1}}\, L_+^n |0\rangle$ is the pure state obtained when $n$ photon-pair excitations occur during the down-conversion.
Nonetheless, for simplicity and for the purposes of our work, we redefine the state (11) in an unnormalised fashion, Eq. (12), such that $\mathrm{Tr}[\rho_{\mathrm{SPDC}}] = 1/(1 - \bar{p}/2)^2$, and so that the parameter $\bar{p} = 2q$ can now be directly associated with the contribution of the desired singlet, Eq. (15). Experimentally, the parameter $\bar{p}$ is kept small (below $10^{-2}$) and may be adjusted with squeezing techniques [23,24]. Although large values of $\bar{p}$ increase the production rate of the target maximally entangled two-photon states, $|\Psi_1\rangle$, they also increase the relative contribution of spurious higher-order terms, $|\Psi_{n>1}\rangle$, to the SPDC process.
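The trade-off between pair-production rate and spurious higher-order terms can be illustrated numerically. The sketch below assumes the pair-number weights $(1-q)^2 (n+1) q^n$, a standard form for polarisation-entangled SPDC that is adopted here as an assumption because it makes the single-pair weight of the unnormalised state equal to $2q = \bar{p}$. The function names are illustrative.

```python
import math

def spdc_parameters(tau):
    """Return (q, p_bar) with q = tanh^2(tau) and p_bar = 2 q."""
    q = math.tanh(tau) ** 2
    return q, 2.0 * q

def pair_number_distribution(q, n_max):
    """Assumed normalised weights P(n) = (1 - q)^2 (n + 1) q^n for n pairs."""
    return [(1.0 - q) ** 2 * (n + 1) * q ** n for n in range(n_max + 1)]

q, p_bar = spdc_parameters(0.1)
probs = pair_number_distribution(q, 50)

# Relative weight of double-pair to single-pair emission: 3 q / 2 = 3 p_bar / 4.
double_to_single = probs[2] / probs[1]
```

Under this assumption the double-pair contamination grows linearly with $\bar{p}$, which is why the parameter is kept small in practice.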
On the other hand, as stated in the main text, whenever single-photon (SP) sources [25] are used, we represent the states they produce in an analogous unnormalised manner, with $\mathrm{Tr}[\sigma_{\mathrm{SP}}] = 1/(1 - p)$, as: where the desired single photon is produced at zeroth order in $p$ ($\approx 10^{-4}$ in current experiments [32]), in contrast to the SPDC process (12), in which the target photon pair (15) occurs at first order in $\bar{p}$ in Eq. (13). Finally, let us emphasise that throughout this work we perform calculations for all the schemes beyond their expected ideal working order in $\bar{p}$ and $p$, i.e. by truncating (13) and (17) at higher orders. Still, it is crucial to mention that, when we compute the results (key rates and figures of merit presented in Table 1 of the main text), we nevertheless bypass such a truncation by assuming that the higher-order terms that were dropped are controlled by the eavesdropper to her own benefit. We give the details of this technique in App. F below.
B Heralded state produced by the qubit amplifier of Gisin et al. [36]
The original scheme of Ref. [36] is of the SH type (see Fig. 2(a) of the main text) and consists of an SPDC source held by Alice and two single-photon sources (emitting photons in the H and V polarisation modes) held by Bob. For the sake of the argument, let us assume that none of the sources produces multiple pairs, which is only beneficial for the scheme.
The initial composite state of Alice and Bob before communication and amplification reads [36]: (18)
Bob's photons enter a beam-splitter of transmittance $T$, so that the reflected mode can then be combined with the mode received from Alice within an implementation of the Bell-state measurement (BSM). As a consequence, the final unnormalised state that is shared by Alice and Bob, conditioned on the (heralding) success of the BSM performed by Bob, reads: where we have already ignored all irrelevant terms that do not yield any correlations apart from the vacuum, which occurs with probability proportional to $(1 - T)^2$, since both of Bob's photons are reflected and detected. The second term in Eq. (19) corresponds to the case when Alice produces the singlet (15), which is transmitted with probability $\eta_t$, and only one of Bob's photons is reflected. One can see that Eq. (19) is of the form of Eq. (3) in the main text with the effective $\lambda = T/(1 - T)$.
Such a feature will always emerge as long as the singlet (target) state is proportional to η t , while the vacuum component remains unaffected by the finite transmission efficiency. In particular, it naturally generalises to scenarios based on 'entanglementswapping' or 'teleportation' [41][42][43] and hence, as explained in the main text, constitutes the main limitation of all these schemes. The only exception is the 'quantum-relay'-based scheme proposed in Ref. [41] that, however, due to SPDC sources being employed yields a conditional state still containing undesired terms in apart from the singlet contribution in Eq. (5).

C Secret-key rate under losses
In this appendix, we present a rather natural scenario in which the bound on the key rate derived by Gisin et al. [36] can be proven not to hold. In the supplemental material of Ref. [36], the situation is studied in which Alice and Bob implement lossy measurements on an entangled state. The goal is to establish an upper bound on the information that an eavesdropper can possess about the outcomes used for generation of the secret key, given that the non-detected events have already been discarded.
A bound based only upon the statistics of the conclusive events is not possible, as it would open the detection loophole. In Ref. [36] a method is given for bounding Eve's knowledge about the conclusive correlations based upon the full (lossy) correlations. The main result, see Eq. (10) in their work, is the following bound on the mutual information, $I(A:E) = H(A) - H(A|E)$, between Alice and Eve: Here, $\mu$ is a parameter defined by the ratio of the rates of conclusive-conclusive events, $\mu_{cc}$, and conclusive-inconclusive events, $\mu_{ci}$ and $\mu_{ic}$, from Alice's and Bob's perspective, respectively, and reads The parameter $S_{cc}$, on the other hand, denotes the value of the CHSH inequality when computed only from the conclusive events. Finally, the function $\chi(S)$ has already been employed in the main text, see below Eq. (4), and follows from Ref. [2]. For what follows, the property to remember is that $\chi(S) < 1$ if $S > 2$. We also emphasise that the bound (20) depends on the lossy correlations: while the Bell parameter used in $I_E$ in Eq. (20) is estimated only from the conclusive events, $I_E$ depends also on the rates of conclusive and inconclusive events via the parameter $\mu$.
Let us apply the bound (20) to a situation in which losses appear only on Alice's side. The corresponding correlations, $p_{\eta_l}$, between Alice and Bob then read analogously to Eq. (1) of the main text: where again $a$ and $b$ refer only to conclusive 'detection' events, while $P_A$ and $P_B$ denote the marginal probabilities detected by Alice and Bob (in the ideal lossless case). We consider the standard situation in which Alice and Bob implement the optimal measurements to violate the CHSH inequality, given that a singlet is shared, while the key is generated from one of these measurements, say $x = 0$. Therefore, Eve's goal is to guess the output of this measurement on Alice's side. The correlations (23) can be conveniently arranged in a table as follows where within each of the four blocks the columns are labelled by the two possible conclusive outputs of Bob, and the rows by the three outputs of Alice, which include the non-detected outcome. The four blocks above then correspond to the four combinations of measurement settings $x, y \in \{0, 1\}$, where $s = (1 + \cos(\pi/4))/4$ and $t = (1 - \cos(\pi/4))/4$. Whenever the local efficiency is unity, $\eta_l = 1$, the correlations (24) maximally violate the CHSH inequality and are referred to as the 'Tsirelson correlations'. It can be verified that the correlations (24) are local whenever $\eta_l \leq 1/\sqrt{2}$. On the other hand, when using them to evaluate the bound (20) for any of Alice's measurements, say $x = 0$, one obtains $I_E(S_{cc}, \mu) < 1$ whenever $\eta_l > 1/\sqrt{2}$. The result is intuitively satisfactory: if the initial correlations are nonlocal, there is some uncertainty left for Eve about Alice's outcome after discarding the non-conclusive events. Unfortunately, this conclusion, and therefore the bound (20) used to derive it, is not universally valid, as proven by the following attack of Eve, whereby she has perfect knowledge of Alice's outcome for some values of $\eta_l$ larger than $1/\sqrt{2}$.
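The role of the threshold $1/\sqrt{2}$ can be checked with a short calculation. The sketch below builds the ideal two-outcome blocks from the entries $s$ and $t$ defined above, applies losses on Alice's side only, and relabels her no-click event as a fixed outcome before evaluating CHSH. Two points are assumptions made for illustration: the sign convention (only the block $x = y = 1$ is anticorrelated) and the binning of the no-click event, which is a local relabelling and therefore can only certify nonlocality when $S > 2$, not locality when $S \leq 2$.

```python
import math

# Entries of the ideal (lossless) correlation table, as defined in the text.
s = (1 + math.cos(math.pi / 4)) / 4
t = (1 - math.cos(math.pi / 4)) / 4

def ideal_block(correlated):
    """2x2 table P(a, b | x, y) for outcomes a, b in {0, 1}."""
    same, diff = (s, t) if correlated else (t, s)
    return [[same, diff], [diff, same]]

def bin_alice_no_click(table, eta):
    """Alice detects with probability eta; her no-click is relabelled as a = 0."""
    bob_marginal = [table[0][b] + table[1][b] for b in (0, 1)]
    return [[eta * table[a][b] + ((1 - eta) * bob_marginal[b] if a == 0 else 0.0)
             for b in (0, 1)] for a in (0, 1)]

def correlator(table):
    """E = sum over a, b of (-1)^(a+b) P(a, b)."""
    return sum((-1) ** (a + b) * table[a][b] for a in (0, 1) for b in (0, 1))

def chsh(eta=1.0):
    """CHSH value E00 + E01 + E10 - E11 with losses on Alice's side only."""
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            # Assumed convention: only the (x, y) = (1, 1) block is anticorrelated.
            block = ideal_block(correlated=not (x == 1 and y == 1))
            total += (-1) ** (x * y) * correlator(bin_alice_no_click(block, eta))
    return total

s_ideal = chsh()                         # Tsirelson value
s_threshold = chsh(1 / math.sqrt(2))     # value at the quoted threshold
```

Because Bob's marginals are uniform, the binned no-click events contribute nothing to the correlators, so the CHSH value simply scales as $\eta_l \cdot 2\sqrt{2}$ and crosses the local bound of 2 exactly at $\eta_l = 1/\sqrt{2}$.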
Eve prepares a mixture of the following three distributions: , which is local and, hence, can be further decomposed in terms of deterministic strategies; and

s t s s t s t s t t s t s s t
The correlations (26) constitute ideal Tsirelson correlations (Eq. (24) with $\eta_l = 1$ and re-labelled outcomes for Alice) and, therefore, are as nonlocal as quantum mechanics allows. The important fact to notice is that for all $p_1$, $p_2$ and $p_3$, if the no-click events are discarded, Eve has perfect knowledge of Alice's outcomes for $x = 0$, as can be seen by inspection of the tables.
We consider the following mixture and require the local efficiency to be outcome-independent. In particular, in order to reproduce the correlations (24), we solve for $\lambda$, so that $p_\lambda = p_{\eta_l}$. As a result, for $\lambda = 1$ we obtain an attack for which the correlations (24) are recovered with local efficiency: Since, once the non-conclusive events are discarded, Eve can then predict with certainty Alice's outcome for the setting $x = 0$, the above attack invalidates the upper bound (20), which predicts this to be impossible for any $\eta_l > 1/\sqrt{2} \approx 0.707$. Although the bound (20) thus cannot hold in complete generality for situations including losses (as already speculated in Ref. [41]), it still remains to be proven whether Eq. (20) can be considered valid for the specific correlations arising in the protocol of Ref. [36].

D Heralded states produced by the SH and CH schemes
In Fig. 5 we depict once more the SH scheme (see its implementation in Fig. 3(a)) while explicitly separating the photonic modes involved, i.e. the distinct modes originating from the labs of Alice and Bob ($A$ and $B$), distinguished also by their polarisations ($H$ and $V$), as well as the auxiliary modes (labelled with a prime, $'$) that are effectively the ones to reach the heralding station, and are measured to obtain the heralding signal $c$. A protocol round is then accepted within the SH (and also the CH, see below) scheme only if the click pattern $c = 0110 =: \checkmark$ is observed, with only the two middle detectors in Fig. 5 clicking.
Within the SH scheme Alice uses the SPDC process to produce a pair of entangled photons in modes $A_{H/V}$ and $A'_{H/V}$, described by the state (12), $\rho_{\mathrm{SPDC}}$. Bob employs on-demand sources in order to simultaneously prepare single photons (SPs) in modes $B_H$ and $B_V$, each described by the state (16), $\sigma_{\mathrm{SP}}$, see Fig. 5. Inspecting the expressions (12) and (16), the SH scheme ideally works at first order in $\bar{p}$ and zeroth order in $p$, respectively, with higher orders being negligible due to $\bar{p} \lesssim 10^{-2}$ [23,24] and $p \lesssim 10^{-4}$ [32]. For completeness, however, we perform the analysis up to second order in both $p$ and $\bar{p}$ by considering the initial (unnormalised and uncorrelated) state of Alice and Bob, i.e. the overall one present initially in the modes $A_{H/V}$, $A'_{H/V}$ and $B_{H/V}$ in Fig. 5, to read: with higher-order terms yielding negligible contributions, which nonetheless must be accounted for later (see App. F) when ensuring the security of the DIQKD protocol. Inspecting Eq. (30), it is the seventh term, occurring at order $\bar{p}$, which is the desired one, containing an entangled pair $|\Psi_1\rangle$ in modes $A_{H/V}$ and $A'_{H/V}$ and single photons in both modes $B_{H/V}$. All other terms are spurious: the first six are associated with the vacuum-production rounds of the SPDC source held by Alice; the eighth and ninth correspond to cases in which the SPDC process succeeds but one of the SPs emits two photons instead; while the last term appears due to double-pair production of the SPDC. In order to compute the state $\rho^{(\mathrm{SH})}_{AB|c}$ marked in Fig. 5, we propagate the initial state (30) "term by term" through the relevant parts of the circuit and account for the photon-detection measurement in modes $A'_{H/V}$ and $B'_{H/V}$, while assuming the detectors to be binary (on/off), i.e. clicking with efficiency $\eta_h$ without distinguishing the exact photon number.
Although we omit here the explicit expression for $\rho^{(\mathrm{SH})}_{AB|c}$ that we obtain for $c = \checkmark$ (i.e. when only two out of the four relevant detectors in Fig. 5 click), we note that the leading order of $\rho^{(\mathrm{SH})}_{AB|\checkmark}$ is proportional to $\bar{p}(1 - T)$, because we ensure that $T \approx 1$ within the SH scheme. As a result, the SPs produced by Bob hardly enter the modes $B'_{H/V}$ in Fig. 5 or, in other words, hardly leave the lab of Bob in Fig. 3(a).
Importantly, we use the full expression for $\rho^{(\mathrm{SH})}_{AB|\checkmark}$, incorporating all the contributions of Eq. (30), to compute the resulting correlations shared by Alice and Bob after they measure photons in modes $A_{H/V}$ and $B_{H/V}$ in Fig. 5, respectively, i.e.: which, similarly to the initial state (30), is valid up to $O(p^i \bar{p}^j)$ with $i + j = 3$. The form of the joint probability distribution (31) depends strongly on the measurement settings controlled by the angles $(\varphi, \theta)$ of the polarisation dual-rail qubits [76] detected by Alice and Bob, the efficiency $\eta_d$ of the binary detectors they employ, as well as the $t$-parameter controlling the asymmetry of the heralding BSM (see Fig. 5) and, hence, the partial entanglement of the target state $|\psi_t\rangle_{AB}$ in Eq. (5). Nonetheless, we also list in the second row in Eq. (31) all the other parameters that the shared correlations formally depend on due to the higher-order terms taken into account within the initial state (30).
However, in practice (as also verified by our numerical analysis), the dependence on the transmission-loss parameter, $\eta_t$, as well as on the efficiency of the heralding detectors, $\eta_h$, can be completely disregarded, as they enter Eq. (31) at higher order in $p$ and $1 - T$, respectively. Nonetheless, let us emphasise that to compute both the critical local efficiencies, $\eta_l^*$, stated in Table 1 and the DIQKD key rates presented in Fig. 4, we use the full expression for the joint probability (31). In particular, we set the efficiency of the heralding detectors to be equal to that of the detectors of Alice and Bob, i.e. $\eta_h = \eta_d =: \eta_l$, which in practice affects only the key rate through $P(c=\checkmark) \propto \eta_h^2$ in Eq. (8). Moreover, in order to determine the highest key rates $K$ in Eq. (8) that yield the lowest critical local efficiency, $\eta_l^* = \eta_d^*$, we also fine-tune the source parameters $\{\bar{p}, p, T\}$, whose orders of magnitude we importantly constrain to $\bar{p} \approx 10^{-2}$, $p \approx 10^{-4}$ and $(1 - T) \approx 10^{-3}$ for the lowest-order expansion analysis to always be valid.

Figure 5: The SH scheme of Fig. 3(a) with all the photonic modes separated. Alice employs an SPDC source to prepare entangled photon pairs in modes $A_{H/V}$ and $A'_{H/V}$ described by the state $\rho_{\mathrm{SPDC}}$ in Eq. (12). Bob uses two SP sources instead to simultaneously prepare single photons in modes $B_{H/V}$, each described by the state $\sigma_{\mathrm{SP}}$ in Eq. (16). The whole SH scheme corresponds to a linear-optics circuit involving beam-splitters (BSs), half- and quarter-wave plates ($\lambda/2$ and $\lambda/4$), polarising beam-splitters (PBSs), and binary detectors yielding "0" (for no photons) or "1" (when one or more photons are detected). The finite efficiency of the heralding detectors, as well as of the ones held by Alice and Bob, is accounted for by loss parameters $\eta_h$ and $\eta_d$, respectively, which, similarly to the transmission loss, $\eta_t$, correspond to a BS transformation with the vacuum state impinging on the empty input port. In our analysis we consider the overall initial state $\rho^{(\mathrm{SH})}_{AB}$ to be adequately described by its lowest-order expansion (30). We then propagate it through the circuit in order to compute the resulting state shared by Alice and Bob conditioned on a successful heralding outcome, i.e. $\rho^{(\mathrm{SH})}_{AB|c}$ with $c = 0110 =: \checkmark$, when only the middle two of the heralding detectors click. $\rho_{AB|\checkmark}$ is then the state spanning modes $A_{H/V}$ and $B_{H/V}$ on which Alice and Bob perform dual-rail polarisation-qubit measurements, whose settings are completely parametrised by the angles $x = \{\varphi_A, \theta_A\}$ and $y = \{\varphi_B, \theta_B\}$ [76]. Finally, the outcomes $a$ and $b$ correspond to the four possible click patterns observed by Alice and Bob, respectively.
For the CH scheme depicted in Fig. 3(b), we follow exactly the same analysis as stated above for the SH scheme. The CH scheme can be presented as a similar linear-optics circuit, where now the $A$-modes constitute just a copy (mirror image) of the $B$-modes drawn in Fig. 5. Within the CH scheme both Alice and Bob possess two on-demand SP sources. The only difference, due to the heralding station in Fig. 3(b) being held outside of the labs, are the transmission losses, $\eta_t$, that must now be accounted for not only in the $A'_{H/V}$ (see Fig. 5) but also in the $B'_{H/V}$ modes.
However, for the CH scheme the initial state prepared by Alice and Bob (this time using only the four modes $A_{H/V}$ and $B_{H/V}$ in Fig. 5, with the others containing vacuum) no longer contains spurious vacuum contributions, due to the SPDC process being absent, i.e.: where, in contrast to Eq. (30), the ideal contribution occurs at zeroth order in $p$, that is, when each of the four SP sources produces a single photon. Still, similarly to the SH scheme, we include higher-order contributions (now, at first order in $p$) in our analysis.
In particular, we compute the corresponding state $\rho^{(\mathrm{CH})}_{AB|\checkmark}$ conditioned on successful heralding, whose main contribution comes from the zeroth order in Eq. (32) and is stated in Eq. (6) of the main text. As in Eq. (31), while keeping all the contributions of the initial state (32), we compute the shared correlations of Alice and Bob after they perform their measurements, i.e.: where, in contrast to Eq. (31), the efficiency of the heralding detectors $\eta_h$ (appearing in Eq. (33) again only due to higher-order terms in Eq. (32)) can be interpreted as just another source of effective transmission loss, $\eta_t$. As a result, in order for the comparison of the key rates in Fig. 4 between the SH and CH schemes to be fair, we rescale $\eta_t \to \eta_t \eta_l$ in the case of the latter to account for the heralding detectors having the same efficiency as the ones held by Alice and Bob (i.e. $\eta_h = \eta_d = \eta_l$ in Fig. 5). Otherwise, we perform exactly the same analysis for the joint distribution (33) as for Eq. (31), in order to determine the maximal key rate, $K$ in Eq. (8), and the critical local efficiencies, $\eta_l^*$ in Table 1, where we now ensure that $T \approx 10^{-3}$ and $p \lesssim 10^{-3}$ throughout the numerical optimisation, so that the perturbative approach (in photon number) assumed by Eq. (32) always holds.
Finally, let us note that within both the SH and CH schemes we may naturally account for the finite efficiency of the on-demand sources employed [25], which produce the SPs in the state (16), i.e. σ SP marked in Fig. 5 for the SH scheme. Inspecting Fig. 5 and, in particular, modes B H/V -and similarly for the A H/V modes in case of the CH scheme, in which they are equivalent-it becomes clear that one may propagate beam-splitters responsible for the finite detection, η d , all the way through the circuit onto the initial state without altering the scheme on the whole. Hence, given that each SP-source works with η s -efficiency, all our analysis applies with now simply the overall local efficiency reading η l = η s η d , so that it accounts for the finite efficiency of both the sources and detectors contained within the lab of Alice or Bob, or both (as summarised in the main text while including also finite transmission between these components, η lt ).

E Guessing probability
The min-entropy term $-\log_2 G_p(x^*)$ in Eq. (7) of the main text is expressed with the help of the device-independent guessing probability, i.e. the average probability that the eavesdropper Eve correctly guesses the output of Alice using an optimal strategy [77]:
$$G_p(x^*) = \max_{\{p_e\}} \sum_e P(e)\, P(a = e\,|\,x^*, e)$$
$$\text{subject to} \quad \sum_e p_e = p, \qquad p_e \in Q \ \ \forall e. \qquad (34)$$
Here, $P(e)$ denotes the probability that Eve observes the outcome $e$, while $P(a = e|x^*, e)$ effectively represents the probability that Alice obtains an outcome $a$ coinciding with $e$, given that $e$ is the outcome observed by Eve. Any strategy of Eve in Eq. (34) can be seen as a measurement that she performs on her system, which then produces a decomposition (a collection) of unnormalised behaviours $\{p_e\}$ distributed between Alice and Bob. The guessing probability (34) is then obtained by maximising the success of Eve's strategy over all such possible decompositions that, however, must reproduce on average the behaviour $p$ observed by Alice and Bob and be compatible with quantum mechanics (see the second line of Eq. (34)). Formally, each of them must belong to the set of unnormalised behaviours $Q$ which stem from Born's rule when valid quantum measurements act on an unnormalised, yet unspecified, quantum state. Thus, to enforce the quantumness of Eve's strategy, the second constraint in Eq. (34) demands that all $p_e$ belong to $Q$.
Imposing membership in Q is difficult since a precise characterization of Q is unknown. However, semi-definite programming (SDP) relaxations similar to the ones presented by Navascués et al. [78] can be introduced to bound G p (x * ) from above [77]. One defines a convergent hierarchy of convex sets that have a precise characterization and obey Q 1 ⊇ Q 2 ⊇ ... ⊇ Q. This hierarchy approximates the quantum set Q from outside, so that any optimisation over the quantum set can be relaxed (to some order k) by replacing Q in Eq. (34) with Q k . Hence, the program presented in Eq. (34) becomes an SDP when relaxations of the set Q are employed-in our work we mostly consider relaxations to the order 1 + AB, i.e, an intermediary order between first and second orders.
Finally, let us note that from the dual formulation [79] of the SDP program employed, we are also always able to retrieve the Bell inequality that is optimal for bounding the degree of predictability that a quantum eavesdropper may have about the string of Alice's outcomes [77].

F Dealing with higher-order multiphoton contributions
In order to deal with quantum states produced by SPDC and single-photon sources (presented in App. D), one typically truncates the global state produced by all sources in the setup up to a certain order $n$ [36,41,42]. Since any setup we consider is powered by SPDC sources parametrised by $\bar{p}$ and single-photon sources parametrised by $p$, a truncation of the global state (which is the tensor product of the states of each source) to the order, e.g., $n = 2$ would keep all terms up to order $O(p^2)$, $O(p\bar{p})$ and $O(\bar{p}^2)$.
Nevertheless, this perturbative approximation may yield misleading conclusions about the nonlocal character of the observed correlations and compromise DIQKD security for a given setup. In fact, one has to guarantee that contributions not considered in the truncation will not contradict the conclusions about the nonlocal character of the behaviour in question.
To avoid this problem, we develop here a method based on SDP techniques where all high-order contributions ($> n$) that are not taken into account are fully controlled by Eve, to her benefit. This may seem too conservative, but the method turns out to be efficient and not overly pessimistic, since the contribution of high-order terms becomes irrelevant for sufficiently low values of $p$ and $\bar{p}$.
The key idea is to conceive higher-order contributions as producing an unknown and uncharacterised quantum behaviour $p_Q$ prepared by Eve for Alice and Bob. If $p^{\mathrm{est}}_n$ denotes the estimate of the behaviour of Alice and Bob constructed to order $n$ (e.g. one derived from the states (30) or (32) for the SH and CH schemes, respectively), then the first step of the method is to write the observed behaviour $p$ as a convex decomposition: $p = (1 - \epsilon_n)\, p^{\mathrm{est}}_n + \epsilon_n\, p_Q$. At the quantum level, the total state being shared, given a collection of sources producing a perturbative state such as (12), may be written as a convex mixture $p(n)\rho_n + p(\bar{n})\rho_{\bar{n}}$, where $\rho_n$ is the state truncated according to the estimate made at some order $n$, $\rho_{\bar{n}}$ is the remaining "tail" of high-order contributions, and $p(\bar{n}) = 1 - p(n)$. Moving to the level of probability distributions, the linearity of Born's rule with respect to $\rho$ implies that the elements of the observed behaviour $p$ conditioned on the outcome $c$ employed in the heralding stage (see App. D) may be decomposed in a similar fashion, i.e.:
$$P(a, b|c) = p(n|c)\, P(a, b|c, n) + p(\bar{n}|c)\, P(a, b|c, \bar{n}). \qquad (35)$$
The probabilities $P(a, b|c, n)$ above are then nothing but the elements of the estimated behaviour $p^{\mathrm{est}}_n$ computed up to the $n$th order.
We rewrite p(n|c) employing the Bayes rule:

G Noise robustness for nonlocality
We analyse the robustness to white noise of the estimated behaviours $p^{\mathrm{est}}$ that our two schemes produce. We determine the maximal weight $w^*$ of white noise $\mathbb{1}_p$ (a distribution in which all the outcomes are equally likely, independently of the measurement choices) which can be convexly added such that the behaviour $(1 - w)\,p^{\mathrm{est}} + w\,\mathbb{1}_p$ remains nonlocal.
Membership of a probability distribution to the set of local behaviours [9] is an instance of a linear program [79]. Geometrically speaking, the set of local behaviours is a polytope in the space of probability distributions, whose extremal points correspond to particular deterministic strategies {D µ } µ that are sufficient to decompose any local behaviour. In fact, there is a finite number of such deterministic strategies, and the white noise tolerance of p est is given by the solution of the following linear program: The white-noise tolerance threshold, w * , should be interpreted as deviations from the desired correlations at the level of probability distributions. This is the worst-case approach in which the experimental imperfections not accounted for in p est provide Alice and Bob with completely uncorrelated results. The fact that our schemes tolerate high amounts of white noise (see Table 1 of the main text) ensures that our results will not be strongly affected when introducing other sources of noise, not accounted for in the analysis.
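For intuition, the white-noise tolerance can be worked out by hand in the simplest Bell scenario with two settings and two outcomes per party. The sketch below is not the linear program used in this work: it relies on the known fact that, in this minimal scenario, the local polytope is fully characterised by the eight CHSH-type inequalities, so the tolerance follows from the largest CHSH value as $w^* = 1 - 2/S_{\max}$. The local bound of 2 is verified by enumerating the 16 deterministic strategies. For the four-outcome behaviours of the SH and CH schemes a genuine linear program over all deterministic strategies would be needed instead. All function names are illustrative.

```python
import itertools
import math

def chsh_variants():
    """The 8 sign patterns (s00, s01, s10, s11) with an odd number of minus signs."""
    return [signs for signs in itertools.product((1, -1), repeat=4)
            if signs.count(-1) % 2 == 1]

def local_bound(signs):
    """Maximum of sum_xy s_xy * a_x * b_y over the 16 deterministic strategies."""
    best = -math.inf
    for a0, a1, b0, b1 in itertools.product((1, -1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        value = sum(signs[2 * x + y] * a[x] * b[y] for x in (0, 1) for y in (0, 1))
        best = max(best, value)
    return best

def white_noise_tolerance(correlators):
    """correlators[2*x + y] = E_xy. White noise has E_xy = 0, so mixing rescales
    every CHSH value by (1 - w); the behaviour turns local once (1 - w) S_max = 2."""
    s_max = max(sum(sg * e for sg, e in zip(signs, correlators))
                for signs in chsh_variants())
    if s_max <= 2.0:
        return 0.0  # already local: no white noise can be tolerated
    return 1.0 - 2.0 / s_max

c = 1 / math.sqrt(2)
w_tsirelson = white_noise_tolerance([c, c, c, -c])  # ideal Tsirelson correlators
```

For the ideal Tsirelson correlators this gives a tolerance of $1 - 1/\sqrt{2}$, i.e. just under 30% of white noise.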