Extending the fair sampling assumption using causal diagrams

Discarding undesirable measurement results in Bell experiments opens the detection loophole that prevents a conclusive demonstration of nonlocality. As closing the detection loophole represents a major technical challenge for many practical Bell experiments, it is customary to assume the so-called fair sampling assumption (FSA) that, in its original form, states that the collectively postselected statistics are a fair sample of the ideal statistics. Here, we analyze the FSA from the viewpoint of causal inference: We derive a causal structure that must be present in any causal model that faithfully encapsulates the FSA. This provides an easy, intuitive, and unifying approach that includes different accepted forms of the FSA and underlines what is really assumed when using the FSA. We then show that the FSA can not only be applied in scenarios with non-ideal detectors or transmission losses, but also in ideal experiments where only parts of the correlations are postselected, e.g., when the particles' destinations are in a superposition state. Finally, we demonstrate that the FSA is also applicable in multipartite scenarios that test for (genuine) multipartite nonlocality.


Introduction
Bell nonlocality [1,2] represents one of the central pillars of modern research in quantum foundations and the development of quantum technologies [3]. A widely used technique in Bell experiments is to discard events with an incomplete detection, e.g., events in which parts of the system are not detected due to particle losses. Because of selection bias [4], this postselection opens the detection loophole, i.e., the possibility that a local hidden variable (LHV) model describes the observed correlations even if the postselected correlations violate a Bell inequality [5,6]. The detection loophole is not just conspiratorial: It was used in experiments to create fake violations of Bell inequalities [7,8,9,10].
Ideally, one can close the detection loophole by including the non-detection events [6,11,12,13], or by sharpening the Bell inequalities [14,15,38]. These methods require high detection efficiencies and have recently been implemented in sophisticated Bell experiments that close the detection loophole [16,17,18] (also while simultaneously closing the locality loophole [19,20,21]). However, the required detection efficiencies represent a severe technical challenge for the practicality of Bell experiments. Thus, a widely used way out is to rely on the fair sampling assumption (FSA) [22,6,23,24], commonly stated as the assumption that the postselected statistics are a fair representation of (i.e., are identical to) the statistics that would have been observed using perfect detectors and no losses. An alternative form of the FSA is the assumption that the detector settings have no influence on the detection probability of the particles. In the latter case, the postselected statistics need not be identical to the statistics that would be observed with ideal detectors but, nonetheless, the postselection cannot create any fake nonlocal correlations. We emphasize that the FSA does not close the detection loophole, but rather represents an assumption that restricts the possible LHV descriptions for the measurement data. Due to the high technical requirements of closing the detection loophole (e.g., highly efficient detectors and minimal transmission losses), the FSA is still widely used in Bell experiments [25,26,27,28,29,30,31].
In this work, we analyze the FSA from the viewpoint of causal inference and causal diagrams [4]. In particular, we ask what structure any causal diagram of a Bell experiment must possess to allow for a valid demonstration of nonlocality if the data is collectively postselected. This structure should guarantee that the postselection cannot create fake nonlocal correlations due to the selection bias. Importantly, we ensure that the causal diagram provides a meaningful description of the experiment by disallowing any kind of fine-tuning of causal influences [4,32,33], in contrast to previous studies of the FSA. This results in an easy and intuitive way to understand different forms of the FSA found in the literature. Our analysis highlights what is really assumed when using the FSA, and allows us to identify Bell scenarios where no faithful causal explanation of the FSA exists. Furthermore, we show that the causal-diagram FSA obtained in this way can be applied to different experiments where the correlations must be postselected even in the ideal noiseless setup, and not just in the standard setup with non-ideal detectors and particle losses. Finally, we prove that the FSA is also useful in experiments that demonstrate multipartite nonlocality and genuine multipartite nonlocality.

Fair sampling in the bipartite scenario
In the following, we derive a necessary causal structure for any faithful causal model of a bipartite Bell scenario that includes a collective postselection, without potentially creating fake correlations that violate the Bell inequality. The central premise is to have a causal description that does not employ any fine-tuning of its causal influences. By a fine-tuning of the parameters of a causal model, two variables can be made statistically independent even though they seem to affect each other in the causal diagram. Instead, without fine-tuning, any statistical independence between two variables must be evident from the diagram's structure. If one allows for fine-tuning, the description in terms of causal diagrams becomes irrelevant [4,32,33] because any statistical (in)dependence can just be directly imposed by hand. Causal models that are not fine-tuned are usually called faithful models.
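To make the notion of fine-tuning concrete, the following minimal sketch uses a linear Gaussian toy model that is not part of the analysis in this work: X influences Y both directly (strength a) and via an intermediate variable Z (strengths c and b). The function name, the variable names, and the parameter values are assumptions chosen for illustration only. For generic parameters X and Y are correlated, but for the special choice a = -b*c the two causal routes cancel and X and Y become uncorrelated (and, being jointly Gaussian, independent), although the diagram contains arrows from X to Y.

```python
import random

def sample_cov_xy(a, b, c, n=50_000, seed=0):
    """Sample covariance of X and Y in the linear toy model
    Z = c*X + N_Z,  Y = a*X + b*Z + N_Y,  with independent unit Gaussians."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = c * x + rng.gauss(0.0, 1.0)
        y = a * x + b * z + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Generic parameters: the exact covariance is a + b*c = 2.
cov_generic = sample_cov_xy(a=1.0, b=1.0, c=1.0)
# Fine-tuned parameters: the exact covariance is a + b*c = 0,
# even though the arrows X -> Y and X -> Z -> Y are both present.
cov_tuned = sample_cov_xy(a=-1.0, b=1.0, c=1.0)
```

The independence in the second case holds only on a measure-zero set of parameters and cannot be read off from the diagram, which is the kind of explanation that a faithful model excludes.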
The central tool to infer independencies from a causal diagram is the set of d-separation rules [4] that dictate whether a given path connecting two variables of the causal diagram is open (i.e., the variables are generally dependent) or blocked (the variables must be independent), also when conditioning on other variables of the diagram. The rules state that (i) a path is blocked if along it there is a collider (a variable where two causal arrows collide) such that neither the collider nor any of its descendants is conditioned on, (ii) a path is blocked if along it there is a non-collider that is conditioned on, and (iii) a collider no longer blocks a path if we condition on the collider or one of its descendants. The last rule manifests itself in the selection bias and the Berkson paradox [4]. We want to note that in the context of Bell experiments, causal diagrams and the d-separation rules have been used to show that quantum violations of Bell inequalities require fine-tuning in classical-causal explanations (e.g., superdeterminism, superluminal influences, or retrocausal influences) [33,34,35], and to show that certain collective postselection strategies are safe for the demonstration of multipartite nonlocality [36] and genuine multipartite nonlocality [37,38]. In the latter works, it was shown that if the postselection can be decided by excluding some of the parties, the detection loophole can be closed, so one does not have to rely on the FSA. In contrast, here we consider a postselection that must be decided by all parties together, such that the results of Refs. [36,37,38] do not apply.
In the standard bipartite Bell scenario, two parties, Alice and Bob, share the two parts of a quantum system and each perform local measurements on their subsystem. Alice (Bob) chooses the measurement setting $x$ ($y$) and records the measurement outcome $a$ ($b$).
The measured correlations are called local if they can be described by an LHV model of the form [1,2]

$$p_{ab|xy} = \sum_{\lambda} p_{\lambda}\, p_{a|x\lambda}\, p_{b|y\lambda}, \tag{1}$$

where $p_{\lambda}$, $p_{a|x\lambda}$, and $p_{b|y\lambda}$ are (conditional) probabilities, each summing to 1, e.g., $\sum_{\lambda} p_{\lambda} = 1$. The corresponding causal diagram is shown in Fig. 1(a). Using Eq. (1), one can derive Bell inequalities, a violation of which proves that the correlations are nonlocal. In Fig. 1(b), we include the variable $K$ representing the decision of the collective postselection (e.g., $K = 1$ for postselecting the results and $K = 0$ for discarding the results). If the postselected statistics $p_{ab|xy1}$ (1 is the value of the variable $K$) can be described by an LHV model similar to Eq. (1), they must also fulfill the Bell inequality and the postselection is valid. By the definition of conditional probability, we can write $p_{ab|xyk} = \sum_{\lambda} p_{\lambda|xyk}\, p_{ab|xy\lambda k}$, such that we can identify the two conditions

CI: $p_{\lambda|xyk} = p_{\lambda|k}$,
CII: $p_{ab|xy\lambda k} = p_{a|x\lambda k}\, p_{b|y\lambda k}$,

that ensure an LHV description of $p_{ab|xyk}$ and thus a valid postselection (footnote 1). We note that CI and CII correspond to the measurement-independence and locality assumptions of Bell's theorem, respectively. Now consider a postselection that is collective: Both parties must consult each other to decide the postselection. Note that if each party can decide the postselection locally, there is no need for an FSA because the postselection is known to be safe [13,36]. Therefore, in the causal diagram, $K$ is influenced by both measurement outcomes $A$ and $B$ (footnote 2), see Fig. 1(b). However, a postselection described by the causal diagram in Fig. 1(b) is in conflict with the conditions CI and CII: Because we condition on $K$, a descendant of the colliders $A$ and $B$, paths such as $X \to A \leftarrow \Lambda$ are open. The causal diagram of Fig. 1(b) can thus not give a causal account of the fair sampling without employing a fine-tuning of causal influences. In the following, we show that any causal description of the FSA requires a certain type of structure in the causal model if the model is not fine-tuned.

Footnote 1: It would actually suffice to show that $p_{ab|xyk}$ admits an LHV model only for $k = 1$, and not necessarily for all $k$. The causal-inference techniques that we employ below yield the conditions for all $k$, implying the case $k = 1$.

Footnote 2: Strictly speaking, a collective postselection could also include direct influences from the settings $X$ and $Y$ to $K$, e.g., a postselection influenced only by $X$ and $B$, or only by $X$ and $Y$. A postselection influenced by $X$ and $B$ leads to a conflict with condition CI, similar to a postselection influenced by $A$ and $B$ as in the main text. On the other hand, a postselection that is decided only by $X$ and $Y$ is a safe postselection, i.e., the conditions CI and CII still hold. This can be seen by noting that, if $K$ is influenced only by $X$ and $Y$, one simply has $p_{ab|xyk} = p_{ab|xy}$.

We consider a general bipartite Bell scenario where Alice's (Bob's) measurement settings are given by a number of setting variables $x = (x_1, \dots, x_{n_A})$ [$y = (y_1, \dots, y_{n_B})$], and Alice (Bob) observes a number of measurement outcomes. Their measurement outcomes are correlated via the LHV $\Lambda$ (note that we group all LHVs into the single LHV $\Lambda$ without loss of generality). At this point, the different outcome variables of each party can be arbitrarily connected by causal influences, e.g., one could have a causal influence $A_1 \to A_2$. After performing their measurements, the parties collectively postselect their data, represented as the binary postselection variable $K$ as above. The corresponding causal structure is identical to the one of Fig. 1(b), except that all setting and outcome variables are replaced by multivariable versions. Thus, similar to above, this general causal model is in conflict with the conditions CI and CII for a valid postselection.
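As a reading aid, the role of the two conditions can be written out in one line. The sketch below assumes the conditions in the form in which they are used later in the text, namely CI: $p_{\lambda|xyk} = p_{\lambda|k}$ and CII: $p_{ab|xy\lambda k} = p_{a|x\lambda k}\, p_{b|y\lambda k}$. The shorthand $q$ for the postselected ensemble is introduced here only for illustration.

```latex
\begin{align}
p_{ab|xyk}
  &= \sum_{\lambda} p_{\lambda|xyk}\, p_{ab|xy\lambda k} \\
  &\overset{\text{CI, CII}}{=}
     \sum_{\lambda} p_{\lambda|k}\, p_{a|x\lambda k}\, p_{b|y\lambda k}
   \equiv \sum_{\lambda} q_{\lambda}\, q_{a|x\lambda}\, q_{b|y\lambda}.
\end{align}
```

For each fixed $k$, the right-hand side has the same form as the LHV model of Eq. (1), with the possibly different ensemble $q_{\lambda} = p_{\lambda|k}$, so any Bell inequality derived from Eq. (1) also constrains the postselected statistics.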
To derive the causal structure required to show conditions CI and CII without fine-tuning, we first divide Alice's outcomes $A$ into the outcomes $A^{(K)}$ that are used to decide the postselection $K$, and the outcomes $A^{(X)}$ that are not, see Fig. 2(a). Similarly, we divide $B$ into $B^{(K)}$ and $B^{(Y)}$. The bidirected arrows depict arbitrary causal influences, i.e., direct influences between the outcome variables, or a hidden variable $\Gamma$ such that $A_i \leftarrow \Gamma \to A_j$ (a hidden common cause), and combinations thereof. Note that a hidden common cause can be included in the LHV $\Lambda$. Now, if some $A_j \in A^{(K)}$ is directly influenced by $X$, there is an open path $X \to A_j \leftarrow \Lambda$ because we condition on $K$, a descendant of the collider $A_j$ (we assume that the LHV $\Lambda$ influences all measurement outcomes). Thus, to preserve condition CI, $X$ cannot have a direct influence on the group $A^{(K)}$, and similarly for $Y$ and $B^{(K)}$. Next, if there was an influence $A_i \to A_j$ from some $A_i \in A^{(X)}$ to some $A_j \in A^{(K)}$, the path $X \to A_i \to A_j \leftarrow \Lambda$ would be open for the same reason, again in conflict with CI. Here we assume that $X$ influences $A_i$ because, otherwise, $A_i$ would neither be useful for violating Bell inequalities (footnote 3), nor would it be useful to decide the postselection (because of $A_i \not\to K$), so one can just discard the outcome $A_i$ from the analysis. We thus obtain Fig. 2(b) which ensures that the condition CI is fulfilled. For instance, for proving that $p_{\lambda|xk} = p_{\lambda|k}$, note that the only path connecting $X$ and $\Lambda$ passes through $A^{(X)}$ that, being a collider, blocks the path.
Since there is no influence from $X$ to any $A_j \in A^{(K)}$, the variables in $A^{(K)}$ (and $B^{(K)}$) are not useful to violate a Bell inequality (footnote 3). We thus only consider the variables $A^{(X)}$ and $B^{(Y)}$ as inputs to the Bell inequality. To show that the postselected statistics (describing correlations between $A^{(X)}$ and $B^{(Y)}$) can be described by an LHV model, it remains to show condition CII, i.e., $p_{a_i b_r|xy\lambda k} = p_{a_i|x\lambda k}\, p_{b_r|y\lambda k}$ for all $A_i \in A^{(X)}$ and $B_r \in B^{(Y)}$. If there were influences of the form $A^{(K)} \to A^{(X)}$ and $B^{(K)} \to B^{(Y)}$, a path connecting $A^{(X)}$ and $B^{(Y)}$ through the conditioned collider $K$ would be open, in conflict with CII. Excluding influences of the form $A^{(K)} \to A^{(X)}$ and $B^{(K)} \to B^{(Y)}$ ensures CII because the only paths that connect $A^{(X)}$ and $X$ to $B^{(Y)}$ and $Y$ pass through $\Lambda$ and are blocked because $\Lambda$ is a non-collider that is conditioned on. Thus, we conclude with Fig. 2(c) which ensures both CI and CII (if restricted to variables in $A^{(X)}$ and $B^{(Y)}$).

Footnote 3: To violate Bell inequalities, each party's setting must influence its measurement outcome. For instance, in the bipartite scenario with one setting and one measurement variable per party, assume that Alice's setting does not influence her outcome. Due to no-signalling, her setting cannot influence Bob's outcome, so one has $p_{ab|xy} = p_{ab|y} = p_{a|y}\, p_{b|ay} = p_a\, p_{b|ay}$, where we have used the no-signalling principle $p_{a|y} = p_a$. This yields an LHV model for $p_{ab|xy}$: Defining $\Lambda$ to take the same values as $A$ with identical probabilities, $p_{\lambda} = p_a$ for $\lambda = a$, one has $p_{ab|xy} = \sum_{\lambda} p_{\lambda}\, \delta_{a\lambda}\, p_{b|\lambda y}$, where $\delta$ is the Kronecker symbol.
In summary, we have started from a general bipartite Bell scenario including a collective postselection and derived a necessary causal structure to faithfully describe the FSA, see Fig. 2(c). This structure requires that each party must have at least one measurement variable ($A^{(K)}$ and $B^{(K)}$) that is used to decide the postselection and that is independent of the measurement settings, and at least one measurement variable ($A^{(X)}$ and $B^{(Y)}$) that is used as an input in the Bell inequality and that does not influence the postselection. The smallest realization of this structure is a Bell scenario where each party has a binary measurement setting ($X$ and $Y$), a binary measurement variable ($A_2$ and $B_2$) that dictates the postselection and a binary measurement variable ($A_1$ and $B_1$) that is used in the Bell inequality. In the standard use of the FSA to deal with the loss of particles, $A_2$ and $B_2$ correspond to the number of detected particles in the respective measurement stations, while $A_1$ and $B_1$ correspond to the outcomes of, e.g., a polarization measurement of incoming photons (footnote 4). We note however that, in general, the variables $A^{(K)}$ and $B^{(K)}$ do not necessarily represent the number of detected particles, and that the results hold for any collectively decided postselection.
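The d-separation arguments above can be checked mechanically. The following sketch is an illustration and not part of the original derivation: it implements the standard moral-graph test for d-separation in plain Python and applies it to the diagram of Fig. 1(b) and to the smallest realization of Fig. 2(c). The edge lists, the node label "L" for the LHV Λ, and the function name are assumptions that encode the diagrams as they are described in the text.

```python
from itertools import combinations

def d_separated(edges, xs, ys, zs):
    """Return True if xs and ys are d-separated given zs in the DAG `edges`.

    Uses the moral-graph criterion: restrict to ancestors of xs, ys, zs,
    connect co-parents, drop directions, delete zs, and test reachability.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    # 1. Ancestral set of all variables involved.
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: link children to parents and "marry" co-parents.
    nbrs = {node: set() for node in keep}
    for child in keep:
        ps = parents.get(child, set())
        for p in ps:
            nbrs[p].add(child)
            nbrs[child].add(p)
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # 3. Remove the conditioning set and search from xs.
    seen, stack = set(), list(xs - zs)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(nbrs[node] - zs)
    return not (seen & ys)

# Fig. 1(b): the outcomes A and B decide the postselection K directly.
FIG_1B = [("X", "A"), ("L", "A"), ("L", "B"), ("Y", "B"), ("A", "K"), ("B", "K")]
# Smallest realization of Fig. 2(c): A2, B2 decide K; A1, B1 enter the inequality.
FIG_2C = [("X", "A1"), ("L", "A1"), ("L", "A2"), ("L", "B2"),
          ("L", "B1"), ("Y", "B1"), ("A2", "K"), ("B2", "K")]

ci_1b = d_separated(FIG_1B, {"L"}, {"X", "Y"}, {"K"})            # CI in Fig. 1(b)
cii_1b = d_separated(FIG_1B, {"A", "X"}, {"B", "Y"}, {"L", "K"})  # CII in Fig. 1(b)
ci_2c = d_separated(FIG_2C, {"L"}, {"X", "Y"}, {"K"})            # CI in Fig. 2(c)
cii_2c = d_separated(FIG_2C, {"A1", "X"}, {"B1", "Y"}, {"L", "K"})
# Adding the forbidden influence X -> A2 reopens a path between X and L.
ci_bad = d_separated(FIG_2C + [("X", "A2")], {"L"}, {"X", "Y"}, {"K"})
```

Under these encodings, both checks fail for Fig. 1(b) and both hold for Fig. 2(c), while the additional arrow from X to A2 breaks CI again. A d-separation statement only certifies independencies implied by the structure, which is the relevant notion for models without fine-tuning.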
We want to emphasize that the causal-diagram FSA is applicable not only to the standard scenario where one particle is sent to each party but there are detection and transmission losses, but also to certain postselection methods if the particles are generated in a superposition of their destinations [13,39,40]. In particular, demonstrations that a coincidence postselection is safe if the number of particles is conserved [36,37,38] become unnecessary if one assumes the FSA. In other words, the FSA covers both a postselection due to inefficient detectors and transmission losses, and a postselection in ideal experiments due to a varying distribution of particles.

Footnote 4: In this example, one could wonder about the meaning of $A_1$ if Alice does not detect a particle ($A_2 = 0$). Since this event will be discarded in the postselection, the value attributed to $A_1$ is not important. To be consistent with the assumption of Fig. 2(c) ($A_2 \not\to A_1$) one could, e.g., flip a coin to set the value of $A_1$ in this case.

Figure 3: A possible causal diagram for the FSA if, for the postselection of the results, each party must receive a single particle. Here, $A_2$ ($B_2$) corresponds to the number of particles detected in Alice's (Bob's) measurement device.

Above, we have derived a necessary structure of any bipartite causal diagram that faithfully describes the FSA for a general collective postselection. However, typical applications of the FSA are situations in which each measurement party needs to detect a single particle. Here, the variables $A_2$ and $B_2$ correspond to the number of detected particles in Alice's and Bob's experiment, respectively. This is a special case of a collective postselection, in which postselecting an event (denoted as $K = 1$ above) is equivalent to a fixed combination of values for $A_2$ and $B_2$ ($A_2 = B_2 = 1$). Thus, one can simply use a causal diagram with a conditioning on the variables $A_2$ and $B_2$ without introducing the postselection variable $K$. While influences such as $X \to A_2$ and $A_1 \to A_2$ are still in conflict with condition CI, an influence of the form $A_2 \to A_1$ can now be allowed. The corresponding causal diagram is shown in Fig. 3. The causal diagram of Fig. 2(c) (and its smallest realization) is more general though: An example of a collective postselection that cannot be modeled with Fig. 3 is when the parties postselect events for which $A_2 = B_2$. Here, a postselected event does not imply fixed values of $A_2$ and $B_2$.
To conclude this section, we want to mention that, while one cannot experimentally certify that the LHV is causally modelled by Fig. 2(c) (in the same way that one cannot experimentally certify the FSA), our results highlight in which cases it is not possible to have a faithful causal account of the FSA. As discussed above, to account for the FSA, each party must have at least two separate measurement results, one that influences the postselection decision, and one that is used in the Bell inequality. This excludes setups for which the outcome used in the Bell inequality is also used to decide the postselection. An example is a setup proposed by Franson [41] to create nonlocality from energy-time entanglement, which has been shown to admit LHV models that reproduce the observed statistics and the apparent Bell inequality violation even in the noiseless case [42,43,44,45]. Here, each party has two different measurement outcomes, an early detection time and a late detection time. Even without particle losses, the statistics must be collectively postselected in order to violate a Bell inequality: Only those events are postselected for which both particles arrive either at the early or at the late detection time.
Since the time of arrival is used both in the postselection and in the Bell inequality, there is no way to introduce two separate variables per party, e.g., $A_1$ and $A_2$, with the roles as above. Thus, an FSA for the noiseless version of the original Franson setup must rely on a fine-tuning in the causal diagram.

Comparison to standard FSAs
We now want to briefly compare the causal-diagram FSA to its different forms found in the literature. We emphasize that any of the following forms of the FSA corresponds to a fine-tuning condition on the original causal diagram of the Bell experiment (Fig. 1). We first comment on the common (mis-)understanding that the FSA means that there is no observable influence of the measurement setting on the probability of detecting a particle, $p_{d|x} = p_d$, where $d$ represents Alice's detection of a particle. As shown in Ref. [23] with a counterexample, this assumption does not ensure a safe postselection. The original way of stating the FSA is that the postselected statistics should be a fair sample of (i.e., be identical to) the statistics that would have been obtained using perfect detectors [6]. This assumption is satisfied if $p_{d|x\lambda} = p_d$, i.e., if the probability of detecting a particle depends neither on the setting $X$ nor on the LHV $\Lambda$. This condition ensures a safe postselection but, as we have seen above, can be weakened: Assuming that $p_{d|x\lambda} = p_{d|\lambda}$ already provides a safe postselection [23]. Here, the postselected ensemble may differ from the original one, $p_{\lambda|k} \neq p_{\lambda}$, but it still has the form of an LHV model, Eq. (1). Assuming a causal-diagram representation without fine-tuning, the assumption that $X$ cannot influence the detection variable corresponds to the above causal diagrams of Fig. 2(c) or Fig. 3.
Finally, we note that this FSA can be further weakened to the assumption that $p_{d|x\lambda} = \eta_x \eta_\lambda$ [23,24], i.e., the assumption that the detection efficiency depends on both the setting $X$ and the LHV $\Lambda$ but it factorizes. This factorization condition cannot be depicted in a causal diagram and, as the dependence implies that $A_2$ is influenced by both $X$ and $\Lambda$, it represents a fine-tuning of the causal influences.
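The difference between these forms of the FSA can be illustrated numerically. The sketch below is a toy LHV model constructed for this illustration, not a model taken from Refs. [23,24]: the hidden states, the outcome functions, and all detection probabilities are assumptions. It computes the CHSH value of the statistics postselected on both parties detecting a particle, for (a) a detection probability that depends on the setting and the hidden state jointly, (b) a detection probability that depends only on the hidden state, and (c) a detection probability that factorizes into a setting part and a hidden-state part.

```python
from itertools import product

# Hidden states lam = (xs, ys, a): a "preferred" setting pair (xs, ys) and a
# bit a, all eight states equally likely (illustrative assumption).
STATES = [(xs, ys, a) for xs, ys in product((0, 1), repeat=2) for a in (-1, 1)]

def alice_outcome(x, lam):
    return lam[2]                                # local and deterministic

def bob_outcome(y, lam):
    return lam[2] * (-1) ** (lam[0] * lam[1])    # local and deterministic

def postselected_chsh(det_a, det_b):
    """CHSH value E00 + E01 + E10 - E11 of p(ab|xy, both parties detect)."""
    s = 0.0
    for x, y in product((0, 1), repeat=2):
        num = norm = 0.0
        for lam in STATES:
            w = det_a(x, lam) * det_b(y, lam) / len(STATES)
            num += w * alice_outcome(x, lam) * bob_outcome(y, lam)
            norm += w
        s += (-1 if (x, y) == (1, 1) else 1) * num / norm
    return s

# (a) Detection depends jointly on setting and hidden state: unfair sampling.
def unfair_a(x, lam): return 1.0 if x == lam[0] else 0.0
def unfair_b(y, lam): return 1.0 if y == lam[1] else 0.0

# (b) Detection depends on the hidden state only: p(d|x,lam) = p(d|lam).
def fair_a(x, lam): return 0.9 if lam[0] == 0 else 0.3
def fair_b(y, lam): return 0.8 if lam[1] == 0 else 0.2

# (c) Detection factorizes into a setting part and a hidden-state part.
def fact_a(x, lam): return (0.5, 1.0)[x] * fair_a(x, lam)
def fact_b(y, lam): return (1.0, 0.4)[y] * fair_b(y, lam)

s_unfair = postselected_chsh(unfair_a, unfair_b)
s_fair = postselected_chsh(fair_a, fair_b)
s_factorized = postselected_chsh(fact_a, fact_b)
```

In case (a) the postselected statistics of this local model reach the algebraic maximum of the CHSH expression, a fake violation. In cases (b) and (c) the setting-dependent factors cancel in the normalization, so the postselected statistics remain an LHV model with a reweighted ensemble and respect the local bound of 2.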

Fair sampling for genuine multipartite nonlocality
The causal-diagram FSA of Fig. 2(c) (or Fig. 3) is also applicable in multipartite Bell experiments. For more than two measurement parties, there are different notions of nonlocality that can be demonstrated by a violation of the corresponding inequalities [46,47,48,49,50]. For simplicity, we focus on the three-partite case, but the discussion holds for any number of parties. We thus include a third party Charlie who chooses a measurement setting $z$ and observes the measurement outcomes $(c_1, c_2)$, where for simplicity we only consider the smallest realization of the FSA causal structure including two measurement variables per party. First, one can assume an LHV model in the multipartite case similar to Eq. (1), corresponding to a causal structure as shown in Fig. 4(a), where we included the FSA derived above. Using the LHV model, one can demonstrate inequalities that test multipartite nonlocality [47]. The validity of the postselection, namely the conditions $p_{\lambda|xyzk} = p_{\lambda|k}$ and $p_{a_1 b_1 c_1|xyz\lambda k} = p_{a_1|x\lambda k}\, p_{b_1|y\lambda k}\, p_{c_1|z\lambda k}$, can be shown in exact analogy to the bipartite case above.
A second and stronger form of three-partite nonlocality is genuine three-partite nonlocality. Here, instead of assuming an LHV model, one allows for two of the three parties to share nonlocal quantum correlations, in a model that is called a hybrid local-nonlocal hidden variable model [46] (footnote 5). These quantum correlations, fulfilling the no-signalling principle [49], cannot be depicted in a classical causal diagram without using fine-tuning conditions [33,34]. In Fig. 4, we indicate these correlations as light blue lines between the outcome variables, as a reminder that these influences are subject to the no-signalling principle. The hybrid model then dictates that, given a specific hidden variable $\lambda$ (i.e., when conditioning on $\Lambda$), there can only be nonlocal correlations between two of the parties, see Fig. 4(c). In contrast, when not conditioning on $\Lambda$, there can possibly exist nonlocal correlations between any pair of parties, see Fig. 4(b), where we use different colors to emphasize that only one pair of the parties can share nonlocal correlations at a time.

Footnote 5: We note that, recently, several new definitions of (genuine) multipartite nonlocality have been proposed that we do not address in this work. These include network nonlocality [51], broadcasting correlations [52], and nonlocality that is based on the resource theory of local operations and shared randomness (LOSR) [53,54]. See also Ref. [55] for a discussion of different classes of multipartite nonlocality.

Figure 4: Examples of the FSA causal structure in the three-partite Bell scenario, for (a) a local hidden variable (LHV) model that corresponds to tests of multipartite nonlocality and (b,c) a hybrid local-nonlocal hidden variable model that corresponds to tests of genuine multipartite nonlocality. (c) In the hybrid model, when conditioning on a specific value of the LHV $\Lambda$, two of the three parties can share nonlocal quantum correlations (light blue lines) that are subject to the no-signalling fine-tuning condition.
Hybrid local-nonlocal hidden variable models fulfill certain inequalities that test for genuine multipartite nonlocality [46], and, similar to above, there are conditions on the postselected statistics that, if fulfilled, prove that a collective postselection is safe [37]. One can directly show that, using the causal-diagram FSA as shown in Fig. 4, the conditions of a safe postselection are fulfilled. For instance, for the first condition, $p_{\lambda|xyzk} = p_{\lambda|k}$, we note that a path such as $X \to A_1 \to B_2 \to K \leftarrow C_2 \leftarrow \Lambda$, that appears to be an open path since the collider $K$ is conditioned on, is blocked due to the no-signalling condition: Alice's measurement setting $X$ cannot influence Bob's measurement outcome $B_2$. The validity of the second condition of a factorization given a specific value of $\Lambda$, e.g., $p_{a_1 b_1 c_1|xyz\lambda k} = p_{a_1 b_1|xy\lambda k}\, p_{c_1|z\lambda k}$, see Fig. 4(c), can again be seen by noting that any path that connects $C_1$ and $Z$ to the other parties passes through $\Lambda$, and is thus blocked because $\Lambda$ is a non-collider that is conditioned on. Thus we see that the FSA depicted in Fig. 4 also suffices to validate a collective postselection for the demonstration of genuine multipartite nonlocality.
Finally, as in the bipartite case, if each party needs to measure a single particle, the FSA can also be explained by a three-partite causal diagram similar to Fig. 3. Here, $A_2$ represents the number of detected particles in Alice's measurement, and one can allow for an influence of the form $A_2 \to A_1$, and similarly for Bob and Charlie. This version of a fair-sampling causal diagram also suffices to prove the conditions of a safe postselection for both multipartite nonlocality and genuine multipartite nonlocality.

Conclusions
We have discussed a causal explanation of the fair sampling assumption (FSA) that ensures that a collective postselection cannot create fake nonlocal correlations in Bell experiments. For this purpose, we have derived a causal structure that any causal model of a bipartite Bell scenario must possess to guarantee that the postselected statistics take the form of a local hidden variable (LHV) model, without requiring a fine-tuning of the causal influences. We have employed the framework of causal inference and d-separation rules [4] as a mediator between a causal structure and the implied relations of conditional independence between the variables. Our results clarify what one really assumes if one uses the FSA, and while the corresponding causal structure is not experimentally certifiable (similar to the FSA itself), our results demonstrate that, in certain Bell scenarios, there is no faithful causal account of the FSA. The derived FSA causal structure yields an easy and intuitive explanation of the FSA and can be used to understand different forms of the FSA in the literature. Furthermore, besides standard Bell scenarios with non-ideal detection and particle losses, we have demonstrated that the causal-diagram FSA can also be applied in noise-free scenarios where the statistics must be postselected because the particles are randomly distributed between the parties [39,40,37,38]. Finally, we have shown that the FSA is also applicable for a collective postselection in demonstrations of multipartite nonlocality and genuine multipartite nonlocality.