Quantum violations in the Instrumental scenario and their relations to the Bell scenario

The causal structure of any experiment implies restrictions on the observable correlations between measurement outcomes, which are different for experiments exploiting classical, quantum, or post-quantum resources. In the study of Bell nonlocality, these differences have been explored in great detail for more and more involved causal structures. Here, we go in the opposite direction and identify the simplest causal structure which exhibits a separation between classical, quantum, and post-quantum correlations. It arises in the so-called Instrumental scenario, known from classical causal models. We derive inequalities for this scenario and show that they are closely related to well-known Bell inequalities, such as the Clauser-Horne-Shimony-Holt inequality, which enables us to easily identify their classical, quantum, and post-quantum bounds as well as strategies violating the first two. The relations that we uncover imply that the quantum or post-quantum advantages witnessed by the violation of our Instrumental inequalities are not fundamentally different from those witnessed by the violations of standard inequalities in the usual Bell scenario. However, non-classical tests in the Instrumental scenario are more efficient than their Bell scenario counterparts as regards the number of random input choices involved, which may have potential implications for device-independent protocols.


I. INTRODUCTION
Classical and quantum physics provide fundamentally different predictions about the correlations that can be observed in experiments with multiple parties. Understanding the exact nature of this difference is a central problem in the foundations of quantum physics and is also important for applications in information processing.
In any experiment, the causal structure of the setup imposes restrictions on the observable correlations. Depending on whether the experiment is modeled using classical random variables, quantum states and measurements, or postquantum resources, these limitations may be different, leading to observable differences between classical models, quantum mechanics, and general probabilistic theories. This was first pointed out by Bell [1], who found that models which attempt to describe an experiment in terms of causal relations between classical random variables, and where the actions of one party cannot influence the local observations of separate parties, imply restrictions on the observable correlations, known as Bell inequalities. Measurements on entangled quantum states shared between the observers, on the other hand, can lead to violation of these inequalities.
This discovery sparked the study of Bell nonlocality, which by now is an active field of research and a cornerstone of quantum theory [2]. Bell's original setting involves two non-communicating parties each selecting a measurement to perform, and each obtaining a measurement outcome. Later studies have considered many variations of this causal structure, for example the use of sequential measurements [3], multiple sources [4,5], multiple parties [6], and communication between the parties [7,8]. In general, the causal structures of all these variants are more complicated than Bell's original setting. Here, we go in the opposite direction and identify the simplest causal structure that exhibits a separation between classical, quantum, and postquantum correlations.

FIG. 1. DAGs for the standard Bell scenario and the Instrumental scenario. Circles and squares denote observable and unobservable variables, respectively. Arrows denote causal influence. (a) Bipartite Bell scenario. Each party has an input (X and Y) and an output (A and B), which are observable. An unobservable shared source Λ may influence the outputs. (b) Instrumental scenario. The second party has no input, but the output of the first party is communicated to the second.
To be a little more specific about what we mean by 'simple', let us first note how a causal structure can be represented. In a given experiment there is a number of observable variables. For instance, on a measurement apparatus, one variable (the 'input') may correspond to the setting of a knob determining the measurement to be made, and another variable (the 'output') to the measurement outcome. In addition, there may be hidden or latent variables, which are not observed, but which mediate correlations between the observable variables. For instance, setting the knob on an apparatus in one way may determine the reading of a distant apparatus through an unobserved electromagnetic field. We can represent the causal relationships between all these variables on a directed acyclic graph (DAG), where the nodes correspond to variables, and the edges between them signify causal influence. For classical models, all the variables are classical random variables. For quantum or postquantum models, the hidden variables may be replaced by quantum states or more generally by the resources of generalized probabilistic theories (GPT), such as Popescu-Rohrlich boxes [9]. Fig. 1(a) shows the DAG for the standard, bipartite Bell scenario. We consider a causal structure to be simpler the simpler the corresponding DAG is, i.e. the fewer nodes and edges it has.
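As a small illustration of this notion of simplicity, the two DAGs of Fig. 1 can be written down as edge lists and compared by counting nodes and edges. This is only a sketch: the string labels, the name "L" for the hidden node Λ, and the helper functions are illustrative choices, with the edges read off from the caption of Fig. 1.

```python
# Each DAG is a set of directed edges (parent, child).
# "L" stands for the unobserved source Lambda (an illustrative label).
BELL_DAG = {("X", "A"), ("Y", "B"), ("L", "A"), ("L", "B")}
INSTRUMENTAL_DAG = {("X", "A"), ("A", "B"), ("L", "A"), ("L", "B")}

def nodes(dag):
    """All variables that appear as an endpoint of some edge."""
    return {v for edge in dag for v in edge}

def parents(dag, node):
    """Direct causes of a node."""
    return {p for (p, c) in dag if c == node}

def size(dag):
    """(number of nodes, number of edges) of the DAG."""
    return (len(nodes(dag)), len(dag))

print(size(BELL_DAG), size(INSTRUMENTAL_DAG))
```

In this encoding the Instrumental DAG has one node fewer than the Bell DAG, and Bob's output B has the parents A and Λ instead of Y and Λ.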
The causal structure represented by a DAG constrains the possible correlations between the observed variables, depending on whether the hidden variables are classical, quantum or GPT. For instance, in the Bell DAG of Fig. 1(a) the correlations between the observed random variables A, B, X, Y are characterized by the conditional probabilities $p = \{p_{AB|XY}(ab|xy)\}$, which in the classical case take the form
$$p_{AB|XY}(ab|xy) = \sum_{\lambda} p_{\Lambda}(\lambda)\, p_{A|X\Lambda}(a|x\lambda)\, p_{B|Y\Lambda}(b|y\lambda). \tag{1}$$
This is simply the usual Bell locality condition, and it leads to linear constraints on $p(ab|xy)$, which are Bell inequalities.
In the quantum case,
$$p_{AB|XY}(ab|xy) = \mathrm{Tr}\big[\rho\,(E_{a|x} \otimes F_{b|y})\big], \tag{2}$$
where $\rho$ denotes a quantum state produced by Λ and distributed to the quantum devices in A and B; for each x, $\{E_{a|x}\}_a$ is a POVM defining a valid measurement with outcomes a; and for each y, $\{F_{b|y}\}_b$ is a POVM defining a valid measurement with outcomes b.
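To make the quantum case concrete, the following sketch evaluates probabilities of the form Tr[ρ (E ⊗ F)] for two qubits and computes the resulting CHSH value. The state (a maximally entangled pair), the measurement angles, and all function names are illustrative assumptions, chosen as the standard strategy known to reach the Tsirelson bound 2√2; they are not taken from the text.

```python
import math

def projector(theta, outcome):
    """2x2 projector (I + (-1)^outcome (cos(theta) Z + sin(theta) X)) / 2."""
    s = (-1) ** outcome
    c, n = math.cos(theta), math.sin(theta)
    return [[(1 + s * c) / 2, s * n / 2],
            [s * n / 2, (1 - s * c) / 2]]

def kron(e, f):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[e[i // 2][j // 2] * f[i % 2][j % 2] for j in range(4)]
            for i in range(4)]

def born(rho, e, f):
    """Tr[rho (e tensor f)] for a real 4x4 density matrix rho."""
    m = kron(e, f)
    return sum(rho[i][j] * m[j][i] for i in range(4) for j in range(4))

# rho = |Phi+><Phi+| with |Phi+> = (|00> + |11>) / sqrt(2)  (assumed state)
PHI_PLUS = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
ALICE_ANGLES = [0.0, math.pi / 2]          # assumed settings for x = 0, 1
BOB_ANGLES = [math.pi / 4, -math.pi / 4]   # assumed settings for y = 0, 1

def p_quantum(a, b, x, y):
    """p(ab|xy) for the assumed state and measurements."""
    return born(PHI_PLUS, projector(ALICE_ANGLES[x], a), projector(BOB_ANGLES[y], b))

def chsh(p):
    """<A0B0> + <A0B1> + <A1B0> - <A1B1>, with <AB> = p(a=b) - p(a!=b)."""
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            corr = sum((-1) ** (a + b) * p(a, b, x, y)
                       for a in (0, 1) for b in (0, 1))
            total += -corr if (x, y) == (1, 1) else corr
    return total

print(chsh(p_quantum))  # numerically close to 2 * sqrt(2)
```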
For the GPT case,
$$p_{AB|XY}(ab|xy) = (e_{a|x}|\,(e_{b|y}|\;|\Psi),$$
where, using the notation of [10], $|\Psi)$ denotes a GPT generalization of the quantum state $\rho$ in (2), and $(e_{a|x}|$, $(e_{b|y}|$ denote GPT generalizations of the quantum measurement operators $E_{a|x}$, $F_{b|y}$. An example of a GPT beyond quantum theory is the one known as boxworld [11], and the set of such GPT correlations for the Bell scenario coincides with the set of no-signalling correlations.

These definitions can be generalized to arbitrary DAGs beyond the Bell scenario. The classical case has been studied extensively in the classical causality literature [12]. Definitions of quantum and GPT correlations for arbitrary DAGs were introduced by Henson, Lal, and Pusey (HLP) [13]. We will not present the HLP formalism in detail, as we will not need it, and refer the interested reader to their paper. It suffices to say that when thinking of a set of correlations, be it classical, quantum, or any other GPT, we can think of it as arising from 'measurements' being performed on a 'state', where the measurements and state are dubbed classical, quantum or GPT. Since classical states are a subset of quantum states, which are in turn a subset of GPT states, it follows that the sets of correlations associated with the various generalizations of a causal structure form a hierarchy,
$$\mathcal{C} \subseteq \mathcal{Q} \subseteq \mathcal{G}.$$

While the classical, quantum, and GPT sets are strictly distinct in the Bell scenario, this is not always the case for an arbitrary DAG. HLP have introduced a necessary condition for these three sets to be distinct. Given a DAG, one can thus evaluate the HLP condition. If this condition is not satisfied, then the sets of classical, quantum, and GPT correlations are equal, i.e., the causal structure represented by the DAG is uninteresting as it does not lead to observable differences between these theories. If the HLP condition is satisfied, then one cannot conclude anything yet: classical, quantum, and GPT models might lead to observable differences, or might not; some further analysis is required.
In their paper, HLP have applied their criterion to all possible DAGs with 7 nodes or less [13], identifying all DAGs that possibly admit a separation between classical, quantum, and postquantum correlations. They have found a single DAG that is simpler than the Bell DAG, where 'simple' means that it involves fewer nodes and edges. This DAG is represented in Fig. 1(b). It has been studied previously in the classical causality literature, where it is known as the 'Instrumental DAG' [14,15], a nomenclature we will follow. We show here that the Instrumental scenario does indeed provide a separation between the sets of classical, quantum, and GPT correlations. We derive an inequality which must hold for classical correlations and relate it to the well known Clauser-Horne-Shimony-Holt (CHSH) Bell inequality [16] for the scenario of Fig. 1(a). In so doing, we identify its maximal quantum and GPT violations. We start by describing in more detail the instrumental scenario and relating it to the Bell scenario.

II. THE INSTRUMENTAL SCENARIO AND ITS RELATION TO THE BELL SCENARIO
Imagine possessing some quantum implementation of the Bell scenario: this consists of three devices. The first device (Alice's apparatus) accepts as input one classical system (Alice's choice of setting) and one quantum system (Alice's share of $\rho$), and outputs a classical system (Alice's measurement outcome). The second device (Bob's apparatus) again accepts one classical input and one quantum input, and returns one classical output. The third device is Λ itself, which has no inputs, and outputs a bipartite quantum system. Now, consider taking the quantum Bell scenario implementation, and modifying it as follows: instead of letting Bob choose his setting y freely, copy Alice's classical output a and wire the copy into Bob's classical input. This creates a new scenario, characterized by the conditional probabilities $p(ab|x)$, since y is no longer freely chosen. Of course, $p(ab|x) = p(ab|x, y{=}a)$, so we could describe the probabilities $p(ab|x)$ that would characterize the new scenario without ever needing to actually perform the hypothetical modification, so long as we have a priori knowledge of $p(ab|xy)$.
But this hypothetical scenario, which we have imagined constructing by modifying the quantum Bell scenario, is identically the quantum Instrumental scenario per Fig. 1(b). Furthermore, the same is true for the classical and GPT variants of the scenarios, as we can simply take the third device Λ to be a source of either classical or GPT states. This leads us to the following fundamental statement:
$$p_{AB|X} \in \mathcal{T}_{\mathrm{Instr}} \iff \exists\, p'_{AB|XY} \in \mathcal{T}_{\mathrm{Bell}} \ \text{such that}\ p_{AB|X}(ab|x) = p'_{AB|XY}(ab|x, y{=}a), \tag{3}$$
where $\mathcal{T}$ is a placeholder for a correlation set, such as classical $\mathcal{C}$, quantum $\mathcal{Q}$, or GPT $\mathcal{G}$.
In this sense, correlations in the Instrumental scenario are essentially postselections of Bell scenario correlations: an experimenter might perform many runs of the Bell scenario experiment, but then postselect to examine only those experimental runs when y=a. This postselected data will exhibit Instrumental scenario statistics, and moreover, every $\mathcal{T}_{\mathrm{Instr}}$ correlation can arise via this sort of postselection on Bell scenario statistics.
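The map from Bell statistics to Instrumental statistics, p(ab|x) = p(ab|x, y=a), is simple enough to spell out in code. The sketch below assumes a Bell behaviour is supplied as a function of (a, b, x, y); the function names and the example behaviour, in which Bob simply outputs his input, are hypothetical and serve only to illustrate the wiring.

```python
from fractions import Fraction

def project_to_instrumental(p_bell, n_x, n_a, n_b):
    """Return {(a, b, x): p(ab|x, y=a)}.

    Bob's input alphabet must coincide with Alice's output alphabet,
    since Alice's output a is used as Bob's input y.
    """
    return {(a, b, x): p_bell(a, b, x, a)
            for a in range(n_a) for b in range(n_b) for x in range(n_x)}

def copy_box(a, b, x, y):
    """Illustrative local behaviour: a is a fair coin, b equals Bob's input y."""
    return Fraction(1, 2) if b == y else Fraction(0)

p_instr = project_to_instrumental(copy_box, n_x=2, n_a=2, n_b=2)
print(p_instr[(0, 0, 0)], p_instr[(0, 1, 0)])
```

After the wiring, Bob's output is perfectly correlated with Alice's, as expected when he copies an input that is itself a copy of a. Normalization of the projected behaviour relies on Alice's marginal being independent of y, i.e. on no-signalling from Bob to Alice.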
To make this relationship between DAGs more concrete, note that compatibility with the Instrumental scenario is defined nearly identically to compatibility with the Bell scenario, except all references to the variable Y get overwritten with references to the variable A:
$$\begin{aligned}
\mathcal{C}_{\mathrm{Instr}}:&\quad p_{AB|X}(ab|x) = \sum_{\lambda} p_{\Lambda}(\lambda)\, p_{A|X\Lambda}(a|x\lambda)\, p_{B|A\Lambda}(b|a\lambda),\\
\mathcal{Q}_{\mathrm{Instr}}:&\quad p_{AB|X}(ab|x) = \mathrm{Tr}\big[\rho\,(E_{a|x} \otimes F_{b|a})\big],\\
\mathcal{G}_{\mathrm{Instr}}:&\quad p_{AB|X}(ab|x) = (e_{a|x}|\,(e_{b|a}|\;|\Psi).
\end{aligned}$$

Even though the quantum and GPT Bell-scenario sets are in general distinct from the classical Bell-scenario set, it may be that their postselections (defining the corresponding Instrumental-scenario correlation sets) all coincide. Indeed, it is well known that postselection of genuinely quantum, nonlocal data may admit a purely classical explanation. This effect is at the basis, for instance, of the infamous detection loophole in Bell experiments [2]. It is thus not obvious a priori that the Instrumental scenario should admit a separation between classical, quantum and GPT correlations; it might be another example of an uninteresting DAG, which simply happens not to be identified by the HLP criterion.
The Instrumental scenario can also be understood as a Bell scenario with relaxed causality constraints. Such relaxations of the Bell scenario have been considered previously. For instance, the no-communication assumption between the variables A and B in the Bell scenario has been relaxed in Refs. [8,18], leading to the Signalling-Between-Outputs scenario represented in Fig. 2. Other works have considered modified Bell scenarios wherein one no longer assumes that the measurement inputs X and Y could have been set-up freely [19,20]. The Instrumental scenario represents simultaneous relaxation of both the measurement-freedom and no-communication assumptions: not only may the outcome of the measurement performed at B depend directly on the distant outcome A, but furthermore the measurement setting Y is not chosen freely but is instead fixed entirely by the value A output by the distant measurement device.
We remark that testing for membership in $\mathcal{T}_{\mathrm{Instr}}$, by admitting an extension to $\mathcal{T}_{\mathrm{Bell}}$ per Eq. (3), generalizes to any well-defined correlation set in the Bell scenario. For example, one might consider relaxations of the quantum set corresponding to levels of the Navascués-Pironio-Acín (NPA) hierarchy [21,22], including the set known as Almost Quantum Correlations [23], or tests for compatibility with a restricted Hilbert space dimension [24-26]. All those correlation membership tests for the Bell scenario can be applied to the Instrumental scenario by simply introducing existential quantifiers: does an extension of $p(ab|x) \equiv p(ab|x, y{=}a)$ to the unobserved probabilities $p(ab|x, y{\neq}a)$ exist such that the relevant condition for membership in $\mathcal{T}_{\mathrm{Bell}}$ is satisfied? This modification is especially easy for testing NPA-level compatibility, as those semidefinite tests already natively support unspecified probabilities.

III. GEOMETRY OF THE INSTRUMENTAL-SCENARIO CORRELATIONS
Before attempting to find a gap between classical and quantum correlations in the Instrumental scenario, let us take a general geometric perspective to enhance our understanding. Every correlation in the Bell scenario can be thought of as a high-dimensional vector, d = |A||B||X||Y|, where the coordinates are given by the many different probabilities $p(ab|xy)$. Every correlation in the Instrumental scenario can be thought of as a somewhat lower-dimensional vector, d = |A||B||X|, where the coordinates are given by the probabilities $p(ab|x)$. The set $\mathcal{T}_{\mathrm{Instr}}$ is formed by axial projection of $\mathcal{T}_{\mathrm{Bell}}$ onto the coordinates $p(ab|x, y{=}a)$.
In both the Bell and Instrumental scenarios, all sets of correlations are convex. The sets $\mathcal{G}_{\mathrm{Bell}}$ and $\mathcal{C}_{\mathrm{Bell}}$ are the no-signalling polytope and the local polytope respectively, whereas the set $\mathcal{Q}_{\mathrm{Bell}}$ is a convex set but not a polytope [2,27-29]. The projections of polytopes are also polytopes, so we know that $\mathcal{G}_{\mathrm{Instr}}$ and $\mathcal{C}_{\mathrm{Instr}}$ will also form polytopes. To obtain the Instrumental scenario polytopes from the Bell scenario polytopes, we can use Fourier-Motzkin elimination or any other polytope projection technique [30-34]. Alternatively, we can directly compute $\mathcal{C}_{\mathrm{Instr}}$ by taking the convex hull of all deterministic strategies in the Instrumental scenario. We have performed these operations for small cardinalities of the observed variables X, A, B using the polytope software PORTA.
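The deterministic strategies whose convex hull gives the classical Instrumental polytope can be enumerated directly: Alice's output is a function a = f(x) and Bob's output a function b = g(a). The sketch below only generates these points; it does not call PORTA, and the function names and the dictionary encoding of a behaviour are illustrative assumptions.

```python
from itertools import product

def deterministic_instrumental_points(n_x, n_a, n_b):
    """Yield every deterministic behaviour as a dict {(a, b, x): 0 or 1}."""
    for f in product(range(n_a), repeat=n_x):        # a = f[x]
        for g in product(range(n_b), repeat=n_a):    # b = g[a]
            yield {(a, b, x): int(a == f[x] and b == g[a])
                   for a in range(n_a) for b in range(n_b) for x in range(n_x)}

def distinct_points(n_x, n_a, n_b):
    """Set of distinct behaviours; different (f, g) pairs can coincide
    when g differs only on values of a that f never produces."""
    return {tuple(sorted(pt.items()))
            for pt in deterministic_instrumental_points(n_x, n_a, n_b)}

print(len(list(deterministic_instrumental_points(2, 2, 2))), len(distinct_points(2, 2, 2)))
```

For binary X, A, B there are 16 pairs (f, g) but only 12 distinct points, because a constant f leaves one value of g unused. A list of such points is the kind of vertex description a convex-hull tool would take as input.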
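In the simplest case where X, A, B ∈ {0, 1} are all binary, we find that $\mathcal{C}_{\mathrm{Instr}} = \mathcal{G}_{\mathrm{Instr}}$ and that these sets are fully characterized by the trivial normalization and positivity constraints together with the inequalities
$$p(a0|0) + p(a1|1) \le 1, \qquad p(a0|1) + p(a1|0) \le 1, \qquad a \in \{0,1\}, \tag{4}$$
which can be expressed compactly as
$$\max_{a} \sum_{b} \max_{x}\, p(ab|x) \le 1. \tag{5}$$
As $\mathcal{C}_{\mathrm{Instr}} \subseteq \mathcal{Q}_{\mathrm{Instr}} \subseteq \mathcal{G}_{\mathrm{Instr}}$, the above constraints also fully characterize the quantum set $\mathcal{Q}_{\mathrm{Instr}}$.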
Since the normalization condition is the only generic equality constraint satisfied by correlations in the Instrumental scenario, the sets $\mathcal{C}_{\mathrm{Instr}}$, $\mathcal{Q}_{\mathrm{Instr}}$, and $\mathcal{G}_{\mathrm{Instr}}$ are full-dimensional in the space of normalized probability distributions. This should be contrasted with the Bell scenario, where $\mathcal{C}_{\mathrm{Bell}}$, $\mathcal{Q}_{\mathrm{Bell}}$, and $\mathcal{G}_{\mathrm{Bell}}$ are not full-dimensional in the space of normalized probability distributions, since they also satisfy the no-signalling equality constraints, expressing that the marginal distribution of b cannot depend on x and the marginal distribution of a cannot depend on y. This full-dimensional property of the Instrumental scenario is not limited to the |X|=|A|=|B|=2 case, but is valid for any cardinalities of the inputs and outputs. Indeed, a method to determine the complete set of equality constraints satisfied by classical correlations compatible with an arbitrary DAG has been given in [35]. Applying it to the Instrumental DAG yields no other equalities than the normalization conditions in the classical case, and hence in the quantum and GPT cases as well, since they contain classical correlations as a subset.
Even though the Instrumental scenario does not contain no-signalling constraints (indeed, the output b can depend on x through a), some remnant of the no-signalling conditions is preserved when projecting the Bell scenario to the Instrumental scenario, as expressed by the inequalities (4), which can be interpreted as limiting the magnitude by which b can depend on x when a is kept constant.
We can understand that such inequalities are GPT-inviolable from the definition of $\mathcal{G}_{\mathrm{Instr}}$ as a projection of the no-signalling polytope. As an example, let us derive one of the inequalities (4) from the following two positivity inequalities for the Bell scenario: $p_{AB|XY}(11|10) \ge 0$ and $p_{AB|XY}(10|00) \ge 0$. Summing those two inequalities and then using no-signalling to express the probabilities as linear combinations of those where Alice's output matches Bob's input, i.e., $p_{AB|XY}(11|10) \to p_{B|Y}(1|0) - p_{AB|XY}(01|10)$ and $p_{AB|XY}(10|00) \to p_{B|Y}(0|0) - p_{AB|XY}(00|00)$, we obtain
$$p_{B|Y}(1|0) + p_{B|Y}(0|0) - p_{AB|XY}(01|10) - p_{AB|XY}(00|00) \ge 0,$$
or, equivalently, since $p_{B|Y}(0|0) + p_{B|Y}(1|0) = 1$,
$$p_{AB|XY}(01|10) + p_{AB|XY}(00|00) \le 1.$$
Having eliminated the non-Instrumental probabilities $p(ab|x, y{\neq}a)$, the final inequality (as translated for the Instrumental scenario) reads
$$p_{AB|X}(01|1) + p_{AB|X}(00|0) \le 1. \tag{6}$$
This proves that Eq. (6), an instance of (4), is an Instrumental scenario inequality which is GPT-inviolable.

Expressed in the general form (5), these inequalities are valid for $\mathcal{C}_{\mathrm{Instr}}$, $\mathcal{Q}_{\mathrm{Instr}}$, and $\mathcal{G}_{\mathrm{Instr}}$ for arbitrary numbers of inputs and outputs |X|, |A|, |B|. They were originally derived by Pearl [14] for the classical Instrumental scenario and have come to be known as the instrumental inequalities. Henson, Lal, and Pusey then showed that Pearl's instrumental inequalities (5) are satisfied by all GPTs for arbitrary inputs and outputs [13].
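Pearl's instrumental inequality is commonly stated as max_a Σ_b max_x p(ab|x) ≤ 1, and in that form it is straightforward to evaluate numerically. The sketch below assumes this form and a behaviour stored as a dictionary keyed by (a, b, x); the two example behaviours are hypothetical and chosen to show one compatible and one incompatible case.

```python
def pearl_lhs(p, n_x, n_a, n_b):
    """Left-hand side max_a sum_b max_x p(ab|x) for p given as {(a, b, x): prob}."""
    return max(
        sum(max(p[(a, b, x)] for x in range(n_x)) for b in range(n_b))
        for a in range(n_a)
    )

def behaviour(rule):
    """Deterministic binary behaviour from a rule x -> (a, b)."""
    return {(a, b, x): int((a, b) == rule(x))
            for a in (0, 1) for b in (0, 1) for x in (0, 1)}

# b depends on x only through a: compatible with the Instrumental DAG.
through_a = behaviour(lambda x: (x, x))
# a is constant while b still tracks x: b depends on x at fixed a.
bypassing_a = behaviour(lambda x: (0, x))

print(pearl_lhs(through_a, 2, 2, 2), pearl_lhs(bypassing_a, 2, 2, 2))
```

The second behaviour gives the value 2 and therefore violates the inequality, in line with its reading as a bound on how much b may depend on x when a is held fixed.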
To summarize, we have found that in the case |X|=|A|=|B|=2 the instrumental inequalities (5) are the unique facets, besides the trivial positivity facets, of the GPT polytope. We have verified that this is also the case for |X|=2 and |A|=|B| ≤ 4. We also know that the instrumental inequalities are satisfied by the GPT polytope for arbitrary numbers of inputs and outputs, but we leave it as an open question whether they are the unique non-trivial facets in this general case.
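In the simplest possible Bell scenario, where X, Y, A, B ∈ {0, 1}, well-known bounds on the violation of the CHSH inequality imply that $\mathcal{C}_{\mathrm{Bell}} \subsetneq \mathcal{Q}_{\mathrm{Bell}} \subsetneq \mathcal{G}_{\mathrm{Bell}}$. We have found, however, that $\mathcal{C}_{\mathrm{Instr}} = \mathcal{Q}_{\mathrm{Instr}} = \mathcal{G}_{\mathrm{Instr}}$ for the corresponding Instrumental sets, i.e. all non-classical features of Bell correlations are washed out when postselecting them to obtain the Instrumental correlations of the X, A, B ∈ {0, 1} set-up. Though we have established this fact by fully characterizing the Instrumental polytopes using the software PORTA, it is also instructive to see more explicitly how all non-local correlations of the X, Y, A, B ∈ {0, 1} Bell scenario admit a classical explanation when projected to the Instrumental scenario. Consider for instance the Popescu-Rohrlich (PR) correlations, $p_{\mathrm{PR}}(ab|xy) = 1/2$ if $a \oplus b = xy$ and 0 otherwise, which reach the algebraic value 4 for the CHSH expression. Their postselection, $p(ab|x) = 1/2$ if $a \oplus b = xa$ and 0 otherwise, is reproduced by a classical model in which Λ is a uniformly random bit λ, $a = \lambda \oplus x$, and $b = a\lambda$.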
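That the postselected PR correlations are classical in the Instrumental scenario can be checked by brute force. The sketch below compares them with one explicit classical model: a uniform hidden bit λ, with a = λ XOR x and b = a AND λ. This particular model is an illustrative choice that reproduces the statistics, not necessarily the construction intended in the text; all names are hypothetical.

```python
from fractions import Fraction

def pr_box(a, b, x, y):
    """PR correlations: p(ab|xy) = 1/2 if a XOR b == x AND y, else 0."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

def postselected_pr(a, b, x):
    """Instrumental statistics obtained by wiring y = a."""
    return pr_box(a, b, x, a)

def classical_model(a, b, x):
    """p(ab|x) from a uniform hidden bit lam, a = lam XOR x, b = a AND lam."""
    total = Fraction(0)
    for lam in (0, 1):
        a_out = lam ^ x          # Alice's response to x and lam
        b_out = a_out & lam      # Bob's response to a and lam (no access to x)
        if (a, b) == (a_out, b_out):
            total += Fraction(1, 2)
    return total

agree = all(postselected_pr(a, b, x) == classical_model(a, b, x)
            for a in (0, 1) for b in (0, 1) for x in (0, 1))
print(agree)
```

Bob's response function uses only a and λ, as the Instrumental DAG requires, yet the model matches the postselected PR statistics for both values of x.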
Since any GPT correlations in the Bell scenario can be written as a mixture of classical correlations and PR correlations, any GPT correlations can be written as a mixture of classical correlations and post-selected PR correlations. But since we have just seen that the latter are classical, this establishes that $\mathcal{G}_{\mathrm{Instr}} = \mathcal{C}_{\mathrm{Instr}}$. More generally, it was shown in [8,18] by a similar argument that classical models can reproduce any GPT correlations in the Signalling-Between-Outputs scenario whenever |X|=|Y|=|A|=2 and |B| is arbitrary. These results translate to our case since the Instrumental DAG is a subgraph of the Signalling-Between-Outputs DAG in which the node Y is dropped. They imply that there cannot be any separation between classical, quantum and GPT correlations in the Instrumental scenario whenever |X|=|A|=2 and |B| is arbitrary.
Using PORTA, we have also determined the facets of the classical polytope in the case |X|=2, |A|≤4, |B|≤4 by taking the convex hull of all deterministic strategies. We find again that Pearl's instrumental inequalities are the only non-trivial facets, implying that $\mathcal{C}_{\mathrm{Instr}} = \mathcal{G}_{\mathrm{Instr}}$ in this case as well.

IV. A CLASSICAL INSTRUMENTAL SCENARIO INEQUALITY WHICH ADMITS QUANTUM VIOLATION
The case X ∈ {0, 1, 2}, A, B ∈ {0, 1} is more interesting, as we found that the facets of the classical polytope $\mathcal{C}_{\mathrm{Instr}}$ comprise, in addition to Pearl's instrumental inequalities, a new family of inequalities, one representative of which is
$$I_{\mathrm{Bonet}} := p(a{=}b|0) + p(b{=}0|1) + p(a{=}0, b{=}1|2) \le 2. \tag{7}$$
Inequality (7) was found previously by Bonet [15] (also using PORTA), who was looking for stronger classical constraints to complement Pearl's instrumental inequalities (5).
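The classical bound of such an inequality can be checked by enumerating the deterministic strategies a = f(x), b = g(a). The sketch below assumes that the representative inequality has the form I = p(a=b|0) + p(b=0|1) + p(a=0,b=1|2) ≤ 2; this form is inferred from the surrounding derivation and should be read as an assumption, as should the function names.

```python
from itertools import product

def bonet(p):
    """Assumed form: I = p(a=b|0) + p(b=0|1) + p(a=0,b=1|2), p keyed by (a, b, x)."""
    return (p[(0, 0, 0)] + p[(1, 1, 0)]) + (p[(0, 0, 1)] + p[(1, 0, 1)]) + p[(0, 1, 2)]

def classical_max():
    """Maximum of the expression over all deterministic Instrumental strategies."""
    best = 0
    for f in product((0, 1), repeat=3):      # a = f[x], x in {0, 1, 2}
        for g in product((0, 1), repeat=2):  # b = g[a]
            p = {(a, b, x): int(a == f[x] and b == g[a])
                 for a in (0, 1) for b in (0, 1) for x in (0, 1, 2)}
            best = max(best, bonet(p))
    return best

print(classical_max())
```

Since the expression is linear, its maximum over the classical polytope is attained at a deterministic strategy, so the enumeration gives the classical bound for the assumed form.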

V. RELATION TO THE CHSH INEQUALITY AND DUMMY INPUTS
The fact that post-selections of the |X|=|Y|=|A|=|B|=2 Bell scenario, where non-locality is entirely detected by the violation of the CHSH inequality, do not lead to non-classical Instrumental correlations might suggest, naively, that a violation of Bonet's inequality (7) uncovers a stronger form of non-locality, requiring something beyond a violation of the CHSH inequality. We show that this is not the case by relating Bonet's inequality to the CHSH inequality. That such a link must exist also follows directly from the fact that all (non-trivial) facets of the |X|=3, |Y|=|A|=|B|=2 classical Bell polytope are liftings of the CHSH inequality [36].
Although we found inequality (7) by taking the convex hull of deterministic strategies and without regard to the relationship between the Bell and Instrumental scenarios, it is enlightening to retrodictively explain $I_{\mathrm{Bonet}}$ as a projection of the classical Bell scenario polytope.
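Let us rewrite the expression $I_{\mathrm{Bonet}}$ per (7) in terms of $p(ab|xy)$; that is, let us interpret the facet of the classical Instrumental polytope as a valid inequality for the Bell polytope. This operation is a trivial lifting of the inequality via the mapping $p(ab|x) \to p(ab|x, y{=}a)$. We find that
$$\mathrm{Lifting}[I_{\mathrm{Bonet}}] = p(00|00) + p(11|01) + p(00|10) + p(10|11) + p(01|20).$$
Using the normalization and no-signalling constraints satisfied by the Bell scenario probabilities $p_{AB|XY}$, we can rewrite this last expression as
$$\mathrm{Lifting}[I_{\mathrm{Bonet}}] = \frac{3}{2} + \frac{1}{4}\big(\langle A_0 B_0\rangle + \langle A_0 B_1\rangle + \langle A_1 B_0\rangle - \langle A_1 B_1\rangle\big) - p(11|20),$$
with $\langle A_x B_y\rangle = p(a{=}b|xy) - p(a{\neq}b|xy)$ the expectation value of AB given that X and Y take values x and y. From this retrodiction it becomes immediately clear that the classical, quantum, and GPT bounds of $I_{\mathrm{Bonet}}$ are
$$I_{\mathrm{Bonet}} \overset{\mathcal{C}}{\le} 2, \qquad I_{\mathrm{Bonet}} \overset{\mathcal{Q}}{\le} \frac{3+\sqrt{2}}{2}, \qquad I_{\mathrm{Bonet}} \overset{\mathcal{G}}{\le} \frac{5}{2},$$
which follow from the CHSH bounds 2, $2\sqrt{2}$, and 4, respectively, as well as the fact that $-p(11|20) \le 0$ in all physical theories.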
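The rewriting of the lifted expression in terms of correlators can be checked with exact arithmetic. The sketch below assumes the lifted expression p(00|00) + p(11|01) + p(00|10) + p(10|11) + p(01|20) and the standard parametrization of binary no-signalling behaviours by marginals ⟨A_x⟩, ⟨B_y⟩ and correlators ⟨A_xB_y⟩; both are reconstructions consistent with the surrounding text, and the function names are hypothetical. Positivity of the probabilities is not needed for the identity, which is purely algebraic.

```python
from fractions import Fraction

def box(A, B, C):
    """p(ab|xy) = [1 + (-1)^a A[x] + (-1)^b B[y] + (-1)^(a+b) C[x][y]] / 4."""
    def p(a, b, x, y):
        return (1 + (-1) ** a * A[x] + (-1) ** b * B[y]
                + (-1) ** (a + b) * C[x][y]) / 4
    return p

def lifted_bonet(p):
    """Assumed lifting of Bonet's expression via p(ab|x) -> p(ab|x, y=a)."""
    return p(0, 0, 0, 0) + p(1, 1, 0, 1) + p(0, 0, 1, 0) + p(1, 0, 1, 1) + p(0, 1, 2, 0)

def chsh(C):
    return C[0][0] + C[0][1] + C[1][0] - C[1][1]

def identity_gap(A, B, C):
    """lifted_bonet - (3/2 + CHSH/4 - p(11|20)); zero if the rewriting holds."""
    p = box(A, B, C)
    return lifted_bonet(p) - (Fraction(3, 2) + chsh(C) / 4 - p(1, 1, 2, 0))

# Arbitrary rational parameters for x in {0, 1, 2} and y in {0, 1}.
A = [Fraction(1, 3), Fraction(-1, 5), Fraction(2, 7)]
B = [Fraction(1, 2), Fraction(-3, 4)]
C = [[Fraction(1, 9), Fraction(2, 5)],
     [Fraction(-1, 6), Fraction(3, 8)],
     [Fraction(1, 4), Fraction(-2, 3)]]
print(identity_gap(A, B, C))
```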
A perhaps surprising consequence of the retrodictive interpretation of $I_{\mathrm{Bonet}}$ is that any nonclassical correlation in the CHSH Bell scenario can be used as a resource to generate nonclassical correlations in the Instrumental scenario, despite the fact that the Instrumental scenario has coinciding GPT and classical polytopes for |X|=|A|=2. The trick which allows us to map arbitrary non-classical no-signalling correlations in the Bell scenario where |X|=|Y|=|A|=|B|=2 into non-classical correlations in the Instrumental scenario is as follows: starting from a $p_{AB|XY}$ in the standard CHSH scenario where x ∈ {0, 1}, trivially map it to $p'_{AB|XY}$ in an extended scenario where x ∈ {0, 1, 2} by setting $p'(ab|xy) = p(ab|xy)$ when x = 0, 1 and $p'(ab|x{=}2, y) = \delta_{a,0}\, p(b|y)$ when x = 2. That is, in the case x = 2, the output a is deterministically equal to 0. Then we have $p'(11|20) = 0$ and thus we may substitute to recast $\mathrm{Lifting}[I_{\mathrm{Bonet}}]_{p'}$ for |X| = 3 as an explicit function of p for |X| = 2, with the trivial intermediate map $p \to p'$ taken for granted:
$$\mathrm{Lifting}[I_{\mathrm{Bonet}}]_{p'} = \frac{3}{2} + \frac{1}{4}\,\mathrm{CHSH}_{p}. \tag{8}$$
In particular, this trivial map allows us to express the extent of the violation of $I_{\mathrm{Bonet}} \le 2$ in the Instrumental scenario entirely as a function of the extent of the violation of CHSH ≤ 2 in the Bell scenario. A direct consequence is that any non-classical correlations in the |X| = |A| = |Y| = |B| = 2 Bell scenario, which necessarily violate the CHSH inequality, give rise to non-classical correlations violating Bonet's inequality in the Instrumental scenario.
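The dummy-input map described above is easy to reproduce. The sketch below extends a CHSH-scenario behaviour with a third input x = 2 on which Alice always outputs 0, wires y = a, and evaluates Bonet's expression, assumed here to be p(a=b|0) + p(b=0|1) + p(a=0,b=1|2). The function names and the two example behaviours (a PR box and a deterministic local box) are illustrative.

```python
from fractions import Fraction

def pr_box(a, b, x, y):
    """PR correlations in the CHSH scenario."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

def local_box(a, b, x, y):
    """Deterministic local behaviour: both parties always output 0."""
    return Fraction(1) if (a, b) == (0, 0) else Fraction(0)

def extend_with_dummy_input(p):
    """p'(ab|xy) = p(ab|xy) for x in {0, 1}; p'(ab|2, y) = delta_{a,0} p_B(b|y)."""
    def p_ext(a, b, x, y):
        if x in (0, 1):
            return p(a, b, x, y)
        # Bob's marginal; by no-signalling it does not depend on Alice's input.
        marginal_b = p(0, b, 0, y) + p(1, b, 0, y)
        return marginal_b if a == 0 else Fraction(0)
    return p_ext

def bonet_from_bell(p):
    """Bonet's expression (assumed form) for the wired, extended behaviour."""
    p_ext = extend_with_dummy_input(p)
    inst = lambda a, b, x: p_ext(a, b, x, a)   # Instrumental statistics, y = a
    return inst(0, 0, 0) + inst(1, 1, 0) + inst(0, 0, 1) + inst(1, 0, 1) + inst(0, 1, 2)

print(bonet_from_bell(pr_box), bonet_from_bell(local_box))
```

The PR box reaches 5/2 and the local box reaches 2, in both cases equal to 3/2 plus a quarter of the CHSH value (4 and 2, respectively).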
Another way to express this connection is as follows. Writing $p'$ in terms of p, we can rewrite the relation (8) explicitly as the identity
$$\frac{3}{2} + \frac{1}{4}\,\mathrm{CHSH}_{p} = p(00|00) + p(11|01) + p(00|10) + p(10|11) + p_{B|Y}(1|0). \tag{9}$$
Thus, instead of testing CHSH in the regular way, which involves estimating the correlations for 4 choices of input pairs (x, y) ∈ {(0, 0), (0, 1), (1, 0), (1, 1)}, one can alternatively test it using 3 choices of an input z. (i) If z = 0, 1, one uses x = z on Alice's side and uses Alice's output as the input for Bob. This allows one to evaluate the first four terms on the right-hand side of (9). (ii) If z = 2, one uses y = 0 as the input for Bob and registers his output (without testing Alice). This allows one to evaluate the last term of (9).
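The three-input test can be illustrated with the quantum strategy that maximally violates CHSH. The sketch below assumes the closed-form probabilities p(ab|xy) = [1 + (-1)^(a+b) cos(θ_x - φ_y)]/4, valid for a maximally entangled qubit pair measured in a common plane, together with a standard choice of angles; these are assumptions made for illustration, as is the five-term form of the test, which is reconstructed from the description of steps (i) and (ii).

```python
import math

ALICE = [0.0, math.pi / 2]          # assumed angles for x = 0, 1
BOB = [math.pi / 4, -math.pi / 4]   # assumed angles for y = 0, 1

def p(a, b, x, y):
    """Assumed quantum statistics: uniform marginals, correlator cos(theta_x - phi_y)."""
    return (1 + (-1) ** (a + b) * math.cos(ALICE[x] - BOB[y])) / 4

def three_input_estimate(p):
    # z = 0, 1: Alice uses x = z and her output a is wired in as Bob's input y.
    wired = p(0, 0, 0, 0) + p(1, 1, 0, 1) + p(0, 0, 1, 0) + p(1, 0, 1, 1)
    # z = 2: Bob alone measures y = 0; only his marginal p_B(1|0) is needed.
    bob_only = p(0, 1, 0, 0) + p(1, 1, 0, 0)
    return wired + bob_only

def chsh(p):
    corr = lambda x, y: sum((-1) ** (a + b) * p(a, b, x, y)
                            for a in (0, 1) for b in (0, 1))
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)

print(three_input_estimate(p), 1.5 + chsh(p) / 4)
```

Both printed numbers are numerically equal to (3 + √2)/2, so the three-input estimate carries the same information as the CHSH value obtained from four input pairs.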

VI. TILTED INSTRUMENTAL INEQUALITIES
The results of the last section show that, at least in the specific input-output configuration we considered, the Instrumental scenario is essentially equivalent to the Bell scenario for the mere purpose of detecting nonclassicality, i.e., correlations in the CHSH Bell scenario are non-classical if and only if they give rise to non-classical correlations in the Instrumental scenario.
However, it is well known that many interesting properties of non-classical correlations do not merely reduce to testing their non-classicality. For example, in the CHSH Bell scenario, whether given correlations p are nonclassical can be decided entirely by testing the CHSH inequality, and therefore other types of inequalities, such as the tilted CHSH inequalities introduced in [37], are irrelevant from this perspective. However, such tilted CHSH inequalities are useful for other purposes. For instance, they can be used to certify in a device-independent setting more randomness than would be possible using the CHSH inequality [37,38].
Similarly, the Instrumental scenario may provide new interesting features not directly present in the Bell scenario. As an example, notice that the smallest Bell scenario relies on the violation of the CHSH inequality and hence needs two random bits at the inputs (|X|=|Y|=2). Interestingly, we have seen that the same non-classical resources (using the same quantum state and measurements) can be tested in the Instrumental scenario using only a single random trit (|X|=3). Indeed, we have seen that we can simply interpret Bonet's Instrumental inequality as a new way to test the violation of the CHSH inequality.
This could potentially have implications for device-independent (DI) randomness certification [39,40], as it could lead to schemes generating the same output randomness as a standard CHSH protocol but consuming less input randomness. This requires further examination, however, as many parameters, such as the amount by which the random inputs can be biased (resulting in a consumption of randomness per round below 2 bits even using the CHSH inequality) or the sensitivity to statistical fluctuations, determine the randomness expansion rate, and such parameters might be more favorable in the standard Bell scenario.
In the Bell scenario, the tilted CHSH inequalities have useful properties from the perspective of randomness certification. Interestingly, we now show that such inequalities admit an Instrumental version through a link analogous to the one relating the CHSH and Bonet's inequalities.
Let us start with the Instrumental version of the tilted CHSH inequalities. For α ≥ 1, define Whenever α = 1, we recover Bonet's expression (7). $I_{\alpha}$ defines valid (though not necessarily facet-defining) inequalities for the Instrumental scenario, and the techniques presented above can be used to relate its lifting to tilted CHSH inequalities and derive its classical, quantum, and GPT bounds. In particular, one can show that where The tilted CHSH inequalities were introduced in Ref. [37], and have the following properties Hence, the corresponding bounds for the tilted instrumental inequalities read and they can be achieved by using the same trivial mapping from the Bell scenario to the Instrumental scenario. We see then that these tilted inequalities admit not just quantum violations, but also a GPT violation beyond the quantum bound, for all α ≥ 1. It was shown in Ref. [37] that correlations maximally violating the tilted CHSH inequality of Eq. (10) can be used to certify 2 − ln(2)/α bits of randomness, which can be made arbitrarily close to 2 by increasing α. By exploiting the link with $I_{\alpha}$, the same amount of randomness can be certified by violating a tilted Instrumental inequality instead. As here at most a single trit of randomness is used at the input, while 2 − ε bits can be certified at the output (for ε arbitrarily small), we can generate two random bits for the price of (at most) one trit.

VII. DISCUSSION
The original motivation of Bell was to provide a testable criterion for whether Nature is compatible with a classical local causal description. In such an experiment, ideally one does not wish to make any assumptions about the nonexistence of spurious communication channels between the parties, which could be mediated via as-yet-undiscovered physics. To rule out communication, which could explain the observed correlations, one can instead arrange to have space-like separation of the different parties' measurement events. Any communication would then need to be superluminal, in violation of special relativity. The minimal causal structure in which such an experiment can be implemented is that of Fig. 1(a), and the minimal scenario is that of CHSH (binary inputs and outputs for each party). Several conclusive tests imposing space-like separation have recently been realised [41-43].
However, one of the consequences of Bell nonlocality is to enable device-independent (DI) information processing. Conditioned on the violation of a Bell inequality, it becomes possible to certify the security or correct functioning of an information processing protocol, without any detailed knowledge of its implementation [39,40,44-50]. Prominent examples are quantum key distribution and random number expansion and amplification. In DI settings, it is typically assumed that devices are shielded, i.e. that the experimenters control the inputs which enter into the devices, and that the devices do not leak information on spurious side channels. For DI information processing, therefore, the minimal non-trivial setting is the Instrumental scenario of Fig. 1(b) considered here.
We have shown here that the Bell and Instrumental scenarios are closely related. Though correlations in the simplest Bell scenario, the CHSH scenario, always admit a classical model if they are directly projected on the Instrumental scenario, we have shown that their non-classical nature is entirely preserved in the Instrumental scenario provided some purely classical local processing is first applied on Alice's side. This finding has important implications: given some non-classical resource $p(ab|xy)$ in an arbitrary Bell scenario, the question of whether this resource gives rise to non-classical behavior in the Instrumental scenario cannot simply be answered by considering the Instrumental probabilities $p(ab|x) = p(ab|x, y{=}a)$ (and determining if they are in the classical Instrumental polytope, e.g., using linear programming). Instead, one should also take into account all possible local classical transformations that can be applied to the given non-classical correlation p. By failing to consider such trivial, free transformations of a correlation one obtains false negatives from the standard causal inference tools: correlations appear to be compatible with the classical Instrumental DAG, but actually are not. This observation applies to other DAGs derived from Bell-type scenarios, such as the Signalling-Between-Outputs scenario of Fig. 2.
Another outcome of our results is that the Instrumental versions of the CHSH and tilted CHSH inequalities require fewer input choices than their standard Bell versions. We have briefly discussed the potential of such Instrumental inequalities for DI randomness certification, but this is an issue that deserves further investigation.
From a fundamental point of view, we have identified a fully device-independent scenario (in particular, one that does not rely on several independent hidden sources [51]) that requires only three random input choices, whereas the CHSH scenario requires four (2 × 2) random input choices in total. We leave it as an open question whether it is possible to find a DI scenario where a random choice between only two values is sufficient to observe non-classical correlations.

NOTE ADDED
The results presented here partly overlap with those obtained independently in [52], where Bonet's inequality and the violating quantum and GPT correlations of Section IV were also introduced, but where the one-to-one relation between Bonet's inequality and the CHSH inequality presented in Section V was not noticed. All our results up to Section V have been orally presented by S.P. at the Quantum Networks 2017 Workshop, Oxford (UK) in August 2017.