Quantifying Bell: the Resource Theory of Nonclassicality of Common-Cause Boxes

We take a resource-theoretic approach to the problem of quantifying nonclassicality in Bell scenarios. We conceptualize the resources as probabilistic processes from the setting variables to the outcome variables having a particular causal structure, namely, one wherein the wings are only connected by a common cause. We term them "common-cause boxes". We define the distinction between classical and nonclassical resources in terms of whether or not a classical causal model can explain the correlations. We then quantify the relative nonclassicality of resources by considering their interconvertibility relative to the set of operations that can be implemented using a classical common cause (which correspond to local operations and shared randomness). We prove that the set of free operations forms a polytope, which in turn allows us to derive an efficient algorithm for deciding whether one resource can be converted to another. We moreover define two distinct monotones with simple closed-form expressions in the two-party binary-setting binary-outcome scenario, and use these to reveal various properties of the pre-order of resources, including a lower bound on the cardinality of any complete set of monotones. In particular, we show that the information contained in the degrees of violation of facet-defining Bell inequalities is not sufficient for quantifying nonclassicality, even though it is sufficient for witnessing nonclassicality. Finally, we show that the continuous set of convexly extremal quantumly realizable correlations are all at the top of the pre-order of quantumly realizable correlations. In addition to providing new insights on Bell nonclassicality, our work also sets the stage for quantifying nonclassicality in more general causal networks.

Introduction
Bell's theorem [1,2] highlights a precise sense in which quantum theory requires a departure from a classical worldview. Furthermore, violations of Bell inequalities provide a means for certifying the nonclassicality of nature, independently of the correctness of quantum theory. This is because Bell inequalities can be tested directly on experimental data. Experimental tests under very weak assumptions have confirmed this nonclassicality [3][4][5]. Correlations that violate Bell inequalities have also found applications in information theory. Specifically, they constitute an information-theoretic resource insofar as they can be used to perform various cryptographic tasks in a device-independent way [6][7][8][9][10][11][12][13][14]. Consequently, much previous effort has been made to quantify the resourcefulness of correlations within Bell scenarios [15][16][17][18][19][20][21][22].
In this paper, we take a resource-theoretic approach to quantifying the nonclassicality of a given correlation in a Bell scenario, grounded in a new perspective on Bell's theorem. This is the perspective of causal modelling, which differs from the traditional operational approaches both conceptually and in practice. Nevertheless, the natural choice of the set of free operations for the Bell scenario in our framework coincides with the one proposed in some previous works [16,17], namely, local operations and shared randomness (LOSR) 1.
Our causal perspective on quantifying Bell nonclassicality also generalizes naturally to a framework for quantifying the nonclassicality of correlations in more general causal scenarios. We discuss this generalization in Section A.2, but leave its development to future work.

Summary of main results
We now summarize the content and main results of our article.
In Section 2, we articulate the view on Bell's theorem that motivates our approach (the causal modelling paradigm) and contrast it with two other views on Bell's theorem, namely, the strictly operational and superluminal causation paradigms. In particular, we explain how the differences between these views impact how one conceptualizes Bell inequality violations as a resource, and we highlight some of the advantages of our approach relative to the alternatives. We also introduce the notion of partitioned process theories [23] as the mathematical framework for resource theories that we adopt in this article.
1 There is widespread agreement that the free operations should somehow consist of local operations supplemented with shared randomness; however, different authors have been led to formalize this idea differently, that is, they have been led to distinct proposals for the set of free operations. Indeed, the formalization provided in Refs. [18,20,21] is inconsistent with the one given in Refs. [16,17] and therefore also with the one presented here. A detailed discussion of this issue can be found in Appendix A.2.
In Section 3, we provide a formal definition of the resource theory to be studied. For bipartite Bell scenarios, we argue that the set of processes which naturally constitute the resources in our approach is the set of all bipartite processes with classical inputs and outputs that can arise within a causal model with a (possibly nonclassical) common cause between the wings. We also argue that the natural set of free operations on such processes are those that are achieved by embedding the process in a circuit for which the only connection between the wings is a classical common cause, and we demonstrate that this is equivalent to the set of local operations and shared randomness, as the latter is formalized in Refs. [16,17].
In Section 4, we introduce some of the central concepts of any resource theory, including the notion of a pre-order and its features, the notion of monotones and complete sets thereof, and the notions of cost and yield monotones, which underlie the explicit monotone constructions that follow.
In Section 5, we show how one can use two instances of a linear program to determine the ordering relation which holds between any pair of resources (see Proposition 10 and the discussion that follows it).
In Section 6, we define two monotones of particular interest. The first (defined in Eq. (33)) is based on a yield construction relative to all resources in the Clauser-Horne-Shimony-Holt (CHSH) scenario [24] (a bipartite Bell scenario where the settings and outcomes all have cardinality two) and where the yield is measured by the value of the canonical CHSH functional. The second (defined in Eq. (36)) is based on a cost construction relative to a one-parameter family of resources in the CHSH scenario and where the cost is measured again by the value of the canonical CHSH functional. Although both of these monotones are originally defined in terms of an optimization problem, we derive closed-form expressions for each of them for resources within the CHSH scenario (see Propositions 12 and 14 respectively). We show that within the CHSH scenario [24], a variety of monotones which have been previously studied are all equivalent (up to a monotonic function) to the first of these monotones (see Corollary 13). Because our two monotones are provably not equivalent, this result implies that the second of our monotones provides information beyond that given by previously studied monotones.
In Section 7 we leverage our two monotones to derive various global properties of the pre-order induced by single-copy deterministic conversions. Specifically, we prove that the pre-order:
• is not complete (i.e., there exist incomparable resources),
• is not weak (the incomparability relation is not transitive),
• has both infinite width and infinite height,
• is locally infinite.
We also prove that the two monotones just mentioned do not completely characterize the pre-order of resources, by showing that they fail to do so even for the special case of the CHSH scenario. We further show (in Theorem 21) that no fewer than eight continuous monotones can do the job. We also show (in Proposition 18) that the equivalence classes among nonfree resources in the CHSH scenario (though not in general) are given exactly by the orbits of the symmetry group of deterministic free operations.
Finally, in Section 8, we show that all of the global features of the pre-order hold even for the strict subset of resources which can be realized in quantum theory. We also prove (in Lemma 22) that every extremal quantumly realizable resource is at the top of the pre-order of quantumly realizable resources, and (in Proposition 23) that there is a continuous set of incomparable resources at the top of this pre-order.

How to read this article
We will demonstrate in Section 3 that in spite of the difference in our attitude towards Bell's theorem, the definition of the set of resources and the set of free operations that is natural for the Bell scenario within the causal modelling paradigm coincides with a definition that has been proposed within the strictly operational paradigm, namely, the one proposed in Refs. [16,17]. Because Bell scenarios are the focus of our article, any reader who would rather take the strictly operationalist attitude towards Bell's theorem can reinterpret all of our results through that lens. In particular, readers who are already sympathetic to the notion that LOSR, as defined in Refs. [16,17], is the right choice of free operations may wish to skip Sections 2 and 3.
To understand our conviction that LOSR constitutes the right choice of free operations for Bell scenarios, however, readers are advised to read Sections 2 and 3.2. In particular, to understand how our approach differs (advantageously) from other approaches, readers are encouraged to examine Sections 2.4 and 2.5 as well as Appendix A.
Because Section 4 reviews basic definitions and terminology for concepts related to resource theories, any reader who has expertise on resource theories may wish to skip this section. We note, however, that some of the material presented therein is not found in standard treatments, such as our discussion of global properties of a pre-order and our discussion of a scheme for constructing useful cost and yield monotones.
The presentation of our novel technical results begins in Section 5.

Motivating our approach and contrasting it with alternatives

Three views on Bell's theorem
The traditional commentary on Bell's theorem [25,26] takes a particular view on how to articulate the assumptions that are necessary to derive Bell inequalities. Among these assumptions, two are typically highlighted as deserving of the most scrutiny, namely, the assumptions that are usually termed realism and locality 2 . Abandoning one or the other of these two assumptions is the starting point of most commentaries on what to do in the face of violations of Bell inequalities. 3 Furthermore, a schism seems to have developed between the camps that advocate for each of these two views [27].
Among the researchers who take Bell's theorem to demonstrate the need to abandon realism, there is a contingent which adopts a purely operational attitude towards quantum theory, that is, an attitude wherein the scientist's job is merely to predict the statistical distribution of outcomes of measurements performed on specific preparations in a specified experimental scenario. We shall refer to the members of this camp as operationalists [28]. For such researchers, a violation of a Bell inequality is simply a litmus test for the inadequacy of a classical realist account of the experiment. One particular type of operationalist attitude, which we shall term the strictly operational paradigm, advocates that physical concepts ought to be defined in terms of operational concepts, and consequently that any properties of a Bell-type experiment, such as whether it is signalling or not and what sorts of causal connections might hold between the wings, must be expressed in the language of the classical input-output functionality of that experiment. In other words, they advocate that the only concepts that are meaningful for such an experiment are those that supervene 4 upon its input-output functionality. 5 Most prior work on quantifying the resource in Bell experiments has been done within this paradigm, and the characteristic of experimental correlations that is usually taken to quantify the resource is simply some notion of distance from the set of correlations that satisfy all the Bell inequalities.
Consider, on the other hand, the researchers who take realism as sacrosanct, and in particular those who take Bell's theorem to demonstrate the failure of locality, that is, the existence of superluminal causal influences [30,31]. 6 Researchers in this camp, whom we shall refer to as advocates of the superluminal causation paradigm, would presumably find it natural to quantify the resource of Bell inequality violations in terms of the strength of the superluminal causal influences required to account for them (within the framework of a classical causal model). An approach along these lines is described in Refs. [32,33]. Earlier work on the communication cost of simulating Bell-inequality violations [34,35] is also naturally understood in this way. 7
In recent years, a third attitude toward Bell's theorem, inspired by the framework of causal inference [39], has been gaining in popularity. In this approach, the assumptions that go into the derivation of Bell inequalities are [40]: Reichenbach's principle (that correlations need to be explained causally), the framework of classical causal modelling, and the principle of no fine-tuning (that statistical independences should not be explained by fine-tuning of the values of parameters in the causal model). Here, a violation of a Bell inequality does not lead to the traditional dilemma between realism and locality, but rather attests to the impossibility of providing a non-fine-tuned explanation of the experiment within the framework of classical causal models. This attitude implies the possibility of a new option for what assumption to give up in the face of such a violation. Specifically, the new possibility being contemplated is that one can hold fast to Reichenbach's principle and the principle of no fine-tuning (and hence to the possibility of achieving satisfactory causal explanations of correlations) by replacing the framework of classical causal models with an intrinsically nonclassical generalization thereof.
As is shown in Ref. [40], because the correlations in a Bell experiment do not provide a means of sending superluminal signals between the wings, the only causal structure that is a candidate for explaining these correlations without fine-tuning is one wherein there is a purely common-cause relation between the wings, that is, one which admits no causal influences between the wings. Therefore, the new approach to achieving a causal explanation of Bell inequality violations is one that posits a common cause mechanism but replaces the usual formalism for causal models with one which allows for more general possibilities on how to represent its components [41] 8. We refer to this attitude as the causal modelling paradigm.
4 A-properties are said to supervene on B-properties if every A-difference implies a B-difference.
5 Some might describe what we have here called the strictly operational paradigm as the "device-independent" paradigm [29]; however, we avoid using the latter term here because its usage is not restricted to describing a particular type of empiricist philosophy of science: it also has a more technical meaning in the context of quantum information theory, wherein it indicates whether or not a given information-theoretic protocol depends on a prior characterization of the devices used therein. Indeed, Bell-inequality-violating correlations have been shown to be a key resource in cryptography because they allow for device-independent implementations of cryptographic tasks [6][7][8][9][10][11][12][13][14].
6 Although such influences do not imply the possibility of superluminal signalling, they do imply a certain tension with relativity theory if one believes that the latter does not merely concern anthropocentric concepts such as signalling, but also physical concepts such as causation.
7 A less common view on how to maintain realism in the face of Bell inequality violations is to hold fast to locality but give up on a different assumption that goes into the derivation of Bell inequalities, namely, that the hidden variables are statistically independent of the setting variables. This is known as the "superdeterministic" response to Bell's theorem [36]. Advocates of this approach would presumably find it natural to quantify the resource of Bell inequality violations in terms of the deviation from such statistical independence that is required to explain a given violation. In particular, the results of Refs. [37] and [38] seeking to quantify the nonindependence needed to explain a given Bell inequality violation might be framed within a resource-theoretic framework. However, given that the setting variables can no longer be considered as freely specifiable within such an approach, it would be inappropriate to conceptualize a Bell experiment as a box-type process as we have done here.
The causal modelling paradigm implies not only a novel attitude towards Bell's theorem, but also a change in how one conceives of the resource that powers the information-theoretic applications of Bell-inequality violations. The resource is not taken to be some abstract notion of distance from the set of Bell-inequality-satisfying correlations within the space of all nonsignalling correlations, as advocates of the strictly operational paradigm seem to favour, nor to consist of the strength of superluminal causal influences, as advocates of the superluminal causation paradigm would presumably have it. Rather, we take the resource to be the nonclassicality required by any generalized causal model which can explain the Bell inequality violations without fine-tuning.
We shall show that in the resource theory that emerges by adopting this attitude, the nonclassicality of common-cause processes in Bell experiments cannot be captured solely by the degree of violation of facet-defining Bell inequalities. That is, there are distinctions among such common-cause processes (different ways for these to be nonclassical) which do not correspond to distinctions in the degree of violation of any facet-defining Bell inequality.

Generalized causal models
We will work with the notion of a generalized (i.e., not necessarily classical) causal model that has been developed in Refs. [42,43] using the framework of generalized probabilistic theories (GPTs) [44][45][46], and refer to it as a GPT causal model. Since we are interested in the distinction between classical and nonclassical, without specifically distinguishing quantum and supra-quantum types of nonclassicality, we will not be making use of any of the recent work [41,47,48] on devising an intrinsically quantum notion of a causal model. 9
8 Specifically, in the proposal of Ref. [41], reversible deterministic causal dependences are represented by unitaries rather than bijective functions, and lack of knowledge is represented by density operators rather than by classical probability distributions.
9 However, we will consider the question of when certain correlations that arise in a GPT causal model can be quantumly realized.
One can then approach the study of nonclassicality in arbitrary causal structures from within the scope of these GPT causal models, and pursue the development of a resource theory of such nonclassical features. One must simply specify the nature of the scenario being considered: the number of wings of the experiment (commonly conceptualized as the laboratories of different parties when discussing information-theoretic tasks), and the causal structure presumed to hold among these wings. 10 The set of all resources one might contemplate is then the set of processes that can be described with a GPT causal model having the appropriate causal structure. In this article, we focus on the causal structure wherein there is a common cause that acts on all of the wings, but no causal influences between any of them, which we term a Bell scenario. However, we do include some discussion regarding other causal structures in Appendix A.3.
We conceptualize any experimental configuration as a process from its inputs to its outputs. In the GPT framework for causal models, one has the capacity to consider processes that have GPT systems as inputs and outputs at the various wings. However, we will restrict our attention to processes that have only classical inputs and outputs. Such processes can be conceptualized as black-box processes, to which one inputs classical variables and from which classical variables are output. They are therefore precisely the sorts of processes considered in the device-independent paradigm. We further restrict our attention to processes with a classical input and classical output at each wing, where the input temporally precedes the output. 11 In the device-independent paradigm, the term "box" is generally used as jargon for such processes (for instance, as it is used in the term "PR box" [51]). We therefore refer to such processes as box-type processes or simply boxes. A box-type process is completely characterized by specifying the conditional probability distribution over its outcome variables given its input variables.
We use the term common-cause box to refer to box-type processes which can be realized using a causal structure consisting of a common cause acting on all of the wings. In GPT causal models, all common-cause boxes can be decomposed into the preparation of a GPT state on a multipartite system, followed by the distribution of the component subsystems among the wings, followed by each subsystem being subjected to a GPT measurement, chosen from a fixed set according to the classical input variable at that wing (the local setting variable), and the result of which is the classical output variable at that wing (the local outcome variable). In short, such processes can be decomposed in the same manner in which a multipartite Bell experiment is decomposed into a preparation of a correlated resource and local measurements. The distinction between classical and nonclassical common-cause boxes is simply the distinction between whether there is a classical causal model underlying the process, or whether one must resort to a causal model which invokes a nonclassical GPT.
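As a concrete illustration of a box-type process and of the classical case of the decomposition just described, the following minimal sketch represents a bipartite box with binary settings and outcomes as a table P(a,b|x,y), and builds a classically realizable box from a shared random variable and local response functions. The function names, the dictionary layout, and the restriction to deterministic responses are assumptions of this sketch, not constructions taken from the article; the restriction is harmless because any local randomness can be absorbed into the common cause.

```python
from fractions import Fraction
from itertools import product

BITS = range(2)  # binary settings and outcomes (an assumption of this sketch)

def classical_box(weights, resp_a, resp_b):
    """Return P(a,b|x,y) = sum_l weights[l] * [a == resp_a[l][x]] * [b == resp_b[l][y]].

    weights[l]   : probability that the classical common cause takes value l
    resp_a[l][x] : Alice's outcome for common-cause value l and setting x
    resp_b[l][y] : Bob's outcome for common-cause value l and setting y
    """
    box = {key: Fraction(0) for key in product(BITS, repeat=4)}  # keys are (a, b, x, y)
    for w, fa, fb in zip(weights, resp_a, resp_b):
        for x, y in product(BITS, repeat=2):
            box[(fa[x], fb[y], x, y)] += w
    return box

def is_normalized(box):
    """Check that the outcome probabilities sum to 1 for every pair of settings."""
    return all(
        sum(box[(a, b, x, y)] for a, b in product(BITS, repeat=2)) == 1
        for x, y in product(BITS, repeat=2)
    )

def is_nonsignalling(box):
    """Check that each wing's marginal is independent of the other wing's setting."""
    for a, x in product(BITS, repeat=2):
        if len({sum(box[(a, b, x, y)] for b in BITS) for y in BITS}) != 1:
            return False
    for b, y in product(BITS, repeat=2):
        if len({sum(box[(a, b, x, y)] for a in BITS) for x in BITS}) != 1:
            return False
    return True

# Example: the common cause is a fair shared bit that both wings simply output.
half = Fraction(1, 2)
shared_bit = classical_box([half, half], [(0, 0), (1, 1)], [(0, 0), (1, 1)])
```

Any box built this way is normalized and nonsignalling by construction; the two checks are included only to make those properties explicit.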

Resourcefulness in the causal modelling paradigm
In order to quantify the nonclassicality of common-cause boxes, we will use the approach to resource theories described in Ref. [23]. In this approach, resource theories are defined via partitioned process theories. An enveloping theory of processes must be specified, together with a subtheory of processes that can be implemented at no cost, called the free subtheory of processes. This partitions the set of all processes in the enveloping theory into free and costly (i.e., nonfree) processes. One can then ask of any pair of processes in the enveloping theory whether the first can be converted to the second by embedding it in a circuit composed of processes that are drawn entirely from the free subtheory. The set of higher-order processes which are realized in this way (i.e., by embedding in a circuit composed of processes drawn from the free subtheory) is termed the set of free operations. Pairwise convertibility relations under the set of free operations define a pre-order on the set of all resources, and a partial order over the equivalence classes of such resources. One can then quantify the relative worth of different resources by their relative positions in this partial order. Functions over resources that preserve ordering relations, termed monotones, provide a particularly simple means of quantifying the worth of resources.
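In symbols, the convertibility pre-order and the defining property of a monotone can be sketched as follows. The notation is assumed here for illustration rather than taken from the text: $R_1, R_2$ denote resources, $\mathcal{F}$ the set of free operations, $\succeq$ the induced pre-order, and $M$ a real-valued monotone.

```latex
R_1 \succeq R_2 \;\iff\; \exists\, \tau \in \mathcal{F} \ \text{such that}\ \tau(R_1) = R_2,
\qquad
R_1 \succeq R_2 \;\implies\; M(R_1) \ge M(R_2).
```

The relation $\succeq$ is reflexive and transitive because the free operations include the identity and are closed under composition. It need not be antisymmetric, which is why a partial order is obtained only on equivalence classes.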
The resource theory considered in this article is defined as follows. We take the enveloping theory of processes to consist of the common-cause boxes that can be realized in a GPT causal model, which we term GPT-realizable. We take the free subtheory of processes to consist of the common-cause boxes that can be realized in a classical causal model, which we term classically realizable.
It follows that the free common-cause boxes are precisely those that satisfy all the Bell inequalities, while the costly common-cause boxes are those that violate some Bell inequality. To determine the ordering relations that hold among these common-cause boxes, one must determine the convertibility relations among them. Given the definition of our resource theory, whether one common-cause box can be converted to another is determined by whether this can be achieved by composing it with classical common-cause boxes. This subsumes correlated local processings of the inputs and outputs of the box, as we describe in Section 3.2.
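The following minimal sketch illustrates these statements in the binary-setting, binary-outcome case. It evaluates a CHSH functional on boxes, confirms that deterministic local boxes reach at most 2 while the PR box reaches 4, and applies a simple local processing and a shared-randomness mixture to the PR box. The sign convention chosen for the CHSH functional (minus sign on the x = y = 1 term), the function names, and the data layout are assumptions of this sketch rather than definitions taken from the article.

```python
from fractions import Fraction
from itertools import product

BITS = range(2)
KEYS = list(product(BITS, repeat=4))  # each key is (a, b, x, y)

def chsh(box):
    """Sum over x, y of (-1)^(x*y) * E[(-1)^(a+b) | x, y] (assumed sign convention)."""
    return sum((-1) ** (a ^ b ^ (x & y)) * box[(a, b, x, y)] for a, b, x, y in KEYS)

def local_processing(box, pre_a, post_a, pre_b, post_b):
    """Apply deterministic local maps: new settings x, y are sent to pre_a(x), pre_b(y),
    and outcomes a, b are relabelled to post_a(a, x), post_b(b, y)."""
    new = {key: Fraction(0) for key in KEYS}
    for a, b, x, y in KEYS:
        new[(post_a(a, x), post_b(b, y), x, y)] += box[(a, b, pre_a(x), pre_b(y))]
    return new

def mix(p, box1, box2):
    """Convex mixture, as obtained when a shared coin selects which operation is applied."""
    return {key: p * box1[key] + (1 - p) * box2[key] for key in KEYS}

# PR box: outcomes satisfy a XOR b = x AND y, with uniform marginals.
pr_box = {(a, b, x, y): Fraction(int((a ^ b) == (x & y)), 2) for a, b, x, y in KEYS}

# All 16 deterministic local boxes a = fa[x], b = fb[y].
deterministic = [
    {(a, b, x, y): Fraction(int(a == fa[x] and b == fb[y])) for a, b, x, y in KEYS}
    for fa in product(BITS, repeat=2)
    for fb in product(BITS, repeat=2)
]

identity = lambda s: s
keep = lambda out, s: out
flip = lambda out, s: 1 - out

# Alice flips her outcome; then a shared fair coin decides whether the flip is applied.
flipped = local_processing(pr_box, identity, flip, identity, keep)
noisy = mix(Fraction(1, 2), pr_box, flipped)
```

Under this convention, the deterministic boxes attain a maximum CHSH value of 2, the PR box attains 4, the flipped PR box attains -4, and the equal mixture of the two is the uniformly random box with CHSH value 0, which is classically realizable. The last step shows a costly box being converted to a free one using only local processing and shared randomness.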

A note about nomenclature
In this article, we avoid describing the resource behind Bell inequality violations as "nonlocality". This is because we believe that it is only for those who take the lesson of Bell's theorem to be the existence of superluminal causal influences that it is appropriate to describe violations of Bell inequalities by this term. Researchers in the operationalist camp have not, generally speaking, avoided using the term "nonlocality", but seem instead to use it as a synonym for "violation of a Bell inequality" rather than to imply a commitment to superluminal causal influences. However, we believe that such a usage invites confusion and so we opt instead to avoid the term altogether. Nevertheless, our project is very much in line with earlier projects that describe themselves as developing a "resource theory of nonlocality", such as Refs. [15][16][17][18].

Contrast to the strictly operational paradigm
As noted in the introduction and as will be demonstrated in Section 3.2, in the special case of Bell scenarios (the focus of this article), the natural set of free operations within our causal modelling paradigm is equivalent to one of the proposals for the set of free operations made in earlier works within the strictly operational paradigm, namely, local operations and shared randomness (LOSR), as the latter is defined in Refs. [16,17]. Additionally, the natural enveloping theory adopted in the strictly operational approach, namely, the set of no-signalling boxes, also coincides with that of our enveloping theory for the case of Bell scenarios, namely, the set of GPT-realizable common-cause boxes (where the equivalence of these two sets can be inferred from the results of Ref. [45]). Therefore, in spite of the difference in the attitude we take towards Bell's theorem, the resource theory that we define for Bell scenarios is the same as the one studied in Refs. [16,17].
Nonetheless, the difference in our attitude towards Bell's theorem is not inconsequential. We presently outline its significance for the project of this article as well as for potential future generalizations of this project.
Most importantly, the causal modelling approach diverges sharply from any strictly operational approach once one considers causal structures beyond Bell scenarios. As discussed in Appendix A.3, in a resource theory of nonclassicality for more general causal structures, both the free subtheory and the enveloping theory proposed by the causal modelling approach are radically different from those suggested by the strictly operational approach. In particular, the free subtheory need not be LOSR in a general causal structure and the enveloping theory need not be the set of all nonsignalling operations. Our approach allows us to define a resource theory that is specific to a scenario in which only strict subsets of the wings are connected by common causes [43,52] (such as the triangle-with-settings scenario described in Appendix A.3) and this provides a concrete example of a case where the free subtheory is not LOSR and the enveloping theory is not all nonsignalling operations. In these cases, the free operations are "local operations and causally admissible shared randomness", wherein only those subsets of wings that are connected by a common cause have shared randomness. This is distinct from the LOSR operations, which assume that randomness is shared between all the wings. It seems unlikely that the resource theory we propose in these cases can be motivated (or even fully characterized) in the strictly operational paradigm.
Even for Bell scenarios, however, the causal modelling approach offers advantages over its competitors. In particular, it singles out a unique set of free operations, while the strictly operational approach does not. From our perspective, the resource underlying Bell inequality violations is the nonclassicality of the causal model required to explain them with a common cause, so clearly the free operations should involve only classical common causes acting between the wings. In the strictly operational paradigm, by contrast, any operation which preserves no-signalling and takes local boxes to local boxes might constitute a legitimate candidate for a "free" operation. This ambiguity is reflected in the existence of distinct proposals for the set of free operations in strictly operational resource theories. Aside from LOSR, there is also a proposal called wirings and prior-to-input classical communication (WPICC) [18] which allows for classical causal influences among the wings prior to when the parties receive their inputs (see Appendix A.1). If one believes that there is a singular concept which underlies the violation of Bell inequalities, then at most one of these proposals (LOSR or WPICC) can be taken as the relevant set of free operations. 12 Although WPICC operations meet all desired operational criteria, they are immediately ruled out as candidates for the free operations within the causal modelling paradigm, on the grounds that they involve nontrivial cause-effect influences between the wings.
Another advantage of our approach for the Bell scenario is that it highlights the fact that LOSR is by construction a convex set, a fact which is critical for the algorithmic method that we derive for determining the ordering relation between any two resources. In highlighting this fact, our approach led us to notice an oversight in some previous attempts to formalize LOSR, as discussed in Appendix A.2.
Finally, we note that prior work of Geller and Piani [17] departs from the strictly operational paradigm through their use of the unified operator formalism [53,54], which is analogous to the quantum formalism, but where nonpositive Hermitian operators are allowed to represent states. They do not characterize boxes primarily by their input-output functionality, but rather as a composition of a bipartite source with local measurements. Indeed in their Fig. 4, they explicitly depict the internal structure of the box. It is in this sense that their approach does not quite fit the mould of a strictly operational approach but is rather somewhat more in the flavour of the causal modelling approach we have described here. Nonetheless, the unified operator formalism differs significantly from the GPT formalism of Refs. [42,43] with respect to the independence of the nonclassical common cause from the measurements employed in realizing nonclassical boxes. In the unified operator formalism, the Hermitian operator describing the shared state cannot be chosen freely for a given set of quantum measurements, because some choices would yield negative numbers rather than valid probabilities. By contrast, in the GPT formalism that we adopt here, the set of GPT states is contained within the dual of the set of GPT product measurements, and hence any measurement scheme can be paired with any shared state while yielding valid probabilities. The causal modelling paradigm must reject any dependence of the shared state on the choice of measurements, while such dependence is unavoidable within the unified operator formalism. As defined in Ref. [39], a causal model is a directed acyclic graph, or equivalently, a circuit of causal processes, wherein the distinct processes in the circuit are required to be autonomous (i.e., independently variable). We therefore classify Ref. [17] as neither within the causal modelling paradigm nor within the strictly operational paradigm, while still exhibiting some features of each of these approaches.
12 Competing sets of free operations may be interesting for studying phenomena other than the resource powering violations of Bell inequalities, but this is not the issue at stake in this article.

Contrast to the superluminal causation paradigm
To our knowledge, advocates of the superluminal causation paradigm have not attempted to develop a resource theory for Bell inequality violations (although Refs. [32,33] are related in spirit). If it were attempted (within the framework of Ref. [23]), then the commitments of the approach suggest that it would be done differently from the way we have done so here. Those who endorse the superluminal causation paradigm do not shy away from the notion of causation, and hence a resource theory developed within their paradigm could be presented using the same framework that we use here, namely that of causal models. However, such an approach would likely be framed entirely in terms of classical causal models, rather than introducing the notion of GPT causal models.
Advocates of the superluminal causation paradigm would naturally define the free boxes to be those that involve only subluminal causes. Hence, in scenarios wherein the inputs and the outputs at one wing are space-like separated from those at the other wings, so that subluminal causal influences cannot act between the wings, a box is free if and only if it can be realized by a classical common cause. Thus, the natural choice of the free subtheory in the superluminal causation paradigm coincides with the free subtheory in the causal modelling paradigm. On the other hand, the natural choice of the enveloping theory in the superluminal causation paradigm consists of the set of boxes that are classically realizable given superluminal causal influences between the wings. This differs from the enveloping theory in the causal modelling paradigm because it includes boxes that are signalling. In the superluminal causation paradigm, therefore, it is natural to try and quantify the resource in terms of the strength of the superluminal causal influence between the wings that is required to explain it in a classical causal model.$^{13}$ Because the enveloping theory within this paradigm includes not only non-signalling boxes that violate Bell inequalities but signalling boxes as well, the resource theory is rich enough to describe communication between the wings. Therefore, defining the resource theory in this way would not distinguish classical and nonclassical common-cause resources (as we propose to do here), but would instead draw a line between classical common-cause resources and everything else, including classical signalling resources.$^{14}$ If one were to go this route, then all of classical Shannon theory would be subsumed in the resource theory.
A potential response to this expansion in the scope of the project might be to try to eliminate such signalling resources by hand, by demanding that the enveloping theory was constrained to those boxes that are non-signalling among the wings. Such a response, however, seems to compromise the ideals of the superluminal causation paradigm, because no-signalling is an operational notion rather than a realist one.$^{15}$
13 It should be noted that no finite speed of superluminal causal influences can satisfactorily account for the predictions of quantum theory, per Ref. [55], so such influences would need to be assumed to be of infinite speed.
14 Note, therefore, that if one seeks to partition resources of a given type into classical and nonclassical varieties, then defining the enveloping theory correctly is just as important as defining the free subtheory correctly.

3 The free and enveloping theories that define our resource theory

Classical and nonclassical common-cause boxes
We begin by formalizing the relevant definitions from the previous section. For ease of presentation, we focus throughout on the bipartite Bell scenario, but the multipartite Bell scenario can be formalized analogously. Fig. 1(a) depicts the structure of a generic GPT-realizable common-cause box. The classical variables that range over the (fixed) choices of local measurements are termed the setting variables, denoted $S$ (left wing) and $T$ (right wing), while the classical variables that range over the possible results of these measurements are termed the outcome variables, denoted $X$ (left wing) and $Y$ (right wing).
If we further particularize to the CHSH scenario, where the cardinalities of both setting and outcome variables are 2, then the type is $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$. Let us label the system distributed to the left wing by $A$ and the one to the right wing by $B$. In the GPT framework, states and effects on $A$ ($B$) are represented by vectors in a real vector space of dimension $d_A$ ($d_B$). States and effects on the composite $AB$ are represented by vectors in the tensor product of these vector spaces, $\mathbb{R}^{d_A} \otimes \mathbb{R}^{d_B}$. If the GPT representation of the $X = x$ outcome of the $S = s$ measurement on system $A$ is $r^A_{x|s} \in \mathbb{R}^{d_A}$ and that of the $Y = y$ outcome of the $T = t$ measurement on system $B$ is $r^B_{y|t} \in \mathbb{R}^{d_B}$, and if $s^{AB} \in \mathbb{R}^{d_A} \otimes \mathbb{R}^{d_B}$ denotes the GPT state of the composite $AB$, then the conditional probability distribution associated to this GPT-realizable common-cause box is
$$P_{XY|ST}(xy|st) = \left( r^A_{x|s} \otimes r^B_{y|t} \right) \cdot s^{AB}, \tag{1}$$
where $\cdot$ denotes the Euclidean inner product.
15 John Bell famously argued against the idea that no-signalling could embody an assumption of locality in a fundamental physical theory on the grounds that it was too anthropocentric [56]: ...the "no signaling" notion rests on concepts which are desperately vague, or vaguely applicable. The assertion that "we cannot signal faster than light" immediately provokes the question: Who do we think we are? We who can make "measurements", we who can manipulate "external fields", we who can "signal" at all, even if not faster than light? Do we include chemists, or only physicists, plants, or only animals, pocket calculators, or only mainframe computers?
Figure 1: The distinction between (a) a generic GPT-realizable common-cause box and (b) a classical common-cause box. Here, single-line edges denote classical systems, and single-line boxes denote processes that have only classical inputs and outputs (depicted in light blue). Double-line edges denote nonclassical systems and double-line boxes denote processes that have one or more nonclassical inputs or outputs (depicted in pink). Any common-cause box consistent with an internal structure of the type indicated in (b) is termed classical, while a common-cause box that is not consistent with the structure of (b) but instead is only consistent with an internal structure of the type indicated in (a) is termed nonclassical.
By virtue of their internal causal structure, all GPT-realizable common-cause boxes satisfy the no-signalling conditions $P_{Y|ST} = P_{Y|T}$ and $P_{X|ST} = P_{X|S}$. It is straightforward to verify that this follows from Eq. (1) using the fact that $\sum_x r^A_{x|s} = u^A$, where $u^A$ is the unique deterministic effect on $A$, which is independent of the value $s$ of the setting variable, and using the analogous fact for $B$.
The common-cause boxes that are considered to be free in our resource theory are those that can be realized when the GPT governing the internal workings of the box is classical probability theory, as depicted in Fig. 1(b). In such cases, the scope of possibilities for the overall functionality of the common-cause box can be characterized as follows. The systems $A$ and $B$ are described by classical variables, $\Lambda_A$ and $\Lambda_B$ (here assumed to be discrete). Classically, the composite system $AB$ is prepared in a joint distribution over these, $P_{\Lambda_A \Lambda_B}$. Without loss of generality, we can take systems $A$ and $B$ to be perfectly correlated (by incorporating any noise into the measurements), corresponding to the case where $P_{\Lambda_A \Lambda_B}(\lambda_A, \lambda_B) = \sum_\lambda \delta_{\lambda_A,\lambda}\, \delta_{\lambda_B,\lambda}\, P_\Lambda(\lambda)$ for some distribution $P_\Lambda(\lambda)$, and where $\delta$ denotes the Kronecker delta function. This distribution over $\Lambda_A$ and $\Lambda_B$ can be conceptualized as follows: sample a variable $\Lambda$ from some distribution, then let $\Lambda_A$ and $\Lambda_B$ be copies of it.
Classically, the $X = x$ outcome of the $S = s$ measurement on system $A$ is modelled by a conditional probability distribution $P_{X|S\Lambda_A}$. The GPT effect associated to this measurement on $A$ is $r^A_{x|s}$ with components $[r^A_{x|s}]_{\lambda_A} = P_{X|S\Lambda_A}(x|s\lambda_A)$. Similarly, the GPT effect associated to the measurement on $B$ is $r^B_{y|t}$ and has components $[r^B_{y|t}]_{\lambda_B} = P_{Y|T\Lambda_B}(y|t\lambda_B)$. Substituting these expressions into Eq. (1), we conclude that a classical common-cause box satisfies
$$P_{XY|ST}(xy|st) = \sum_\lambda P_{X|S\Lambda_A}(x|s\lambda)\, P_{Y|T\Lambda_B}(y|t\lambda)\, P_\Lambda(\lambda). \tag{2}$$
This is recognized to be the expression for a conditional probability distribution $P_{XY|ST}$ that satisfies the Bell inequalities.
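The classical construction just described can be checked numerically. The following is a minimal sketch in plain Python, assuming binary settings and outcomes and a single shared variable $\lambda$ copied to both wings; the function names (`classical_box`, `is_nonsignalling`, `chsh`) and the toy response functions are illustrative choices, not notation from this article.

```python
from itertools import product

BITS = (0, 1)

def classical_box(p_lam, p_x, p_y):
    """Build P(x,y|s,t) = sum_lam p_lam[lam] * p_x[(x,s,lam)] * p_y[(y,t,lam)]."""
    return {
        (x, y, s, t): sum(
            p_lam[lam] * p_x[(x, s, lam)] * p_y[(y, t, lam)]
            for lam in range(len(p_lam))
        )
        for x, y, s, t in product(BITS, repeat=4)
    }

def is_nonsignalling(box, tol=1e-12):
    """Check that each wing's marginal is independent of the other wing's setting."""
    for y, t in product(BITS, repeat=2):
        m = [sum(box[(x, y, s, t)] for x in BITS) for s in BITS]
        if abs(m[0] - m[1]) > tol:
            return False
    for x, s in product(BITS, repeat=2):
        m = [sum(box[(x, y, s, t)] for y in BITS) for t in BITS]
        if abs(m[0] - m[1]) > tol:
            return False
    return True

def chsh(box):
    """CHSH expression E00 + E01 + E10 - E11 built from the correlators."""
    def corr(s, t):
        return sum((-1) ** (x + y) * box[(x, y, s, t)]
                   for x, y in product(BITS, repeat=2))
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)

# Toy model (an assumption for illustration): lam is a uniform bit and each
# wing simply outputs lam, whatever its setting.
p_lam = [0.5, 0.5]
p_x = {(x, s, lam): 1.0 if x == lam else 0.0 for x, s, lam in product(BITS, repeat=3)}
p_y = {(y, t, lam): 1.0 if y == lam else 0.0 for y, t, lam in product(BITS, repeat=3)}
box = classical_box(p_lam, p_x, p_y)
print(is_nonsignalling(box), chsh(box))
```

For this particular toy model the CHSH expression sits exactly at the classical bound of 2, which is as far as any box of this classical form can go.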

The free operations
The set of free operations defining the resource theory in our approach are those that can be achieved by embedding the resource in a circuit composed of box-type processes that are classical and that respect the causal structure of the scenario. The stipulation that the process respects the causal structure is required for it to remain within the enveloping set of processes in the resource theory. Because Bell scenarios are the ones of interest to us in this article, the set of free operations are those that can be achieved by embedding the resource in a circuit composed of box-type processes that are classical and that have the causal structure of a Bell scenario, namely, a common cause acting on all of the wings.$^{16}$ The most general free operation taking a bipartite common-cause box with settings $S$, $T$ and outcomes $X$, $Y$ to a bipartite common-cause box with settings $S'$, $T'$ and outcomes $X'$, $Y'$ is depicted in blue in Fig. 2. It is the most general processing which makes use of a classical common cause that can act on the local pre-processings and the local post-processings at each of the wings. It subsumes as special cases processings wherein classical common causes act on any of the subsets of these four local processings.
Note that the most general free operation allows arbitrary feed-forward of classical information on each wing, since this does not require any causal influences between the wings.$^{17}$ But any such operation can always also be put into the canonical form depicted in blue in Fig. 3. It suffices to note that the system that mediates the action of the common cause on the post-processings on a given wing can always be passed down the classical side-channel. Henceforth, we use this canonical form when describing the most general free operation. Formally, such an operation transforms the conditional probability distribution $P_{XY|ST}$ to $P_{X'Y'|S'T'}$ as
$$P_{X'Y'|S'T'}(x'y'|s't') = \sum_{x,y,s,t} P_{X'Y'ST|XYS'T'}(x'y'st|xys't')\, P_{XY|ST}(xy|st), \tag{3}$$
where the conditional probability distribution $P_{X'Y'ST|XYS'T'}$ satisfies certain constraints, which we specify below.
16 In particular, any operation involving a cause-effect relation between the wings is excluded from the free set.
17 Because the only physical restriction we are imagining is that no cause-effect influences are present between wings, feed-forward of nonclassical information (that is, of arbitrary GPT systems) at each wing is also a free LOSR operation. Without loss of generality, however, we consider only feed-forward of classical systems in this work, because this is already sufficient to generate any conditional probability distribution $P_{X'Y'ST|XYS'T'}$ consistent with the causal structure, i.e., satisfying Eqs. (7)-(8).
Circuit fragments that map processes to processes (such as the ones depicted in blue in Figs. 2 and 3) have been studied extensively in recent years in a variety of frameworks, most notably the quantum combs framework of Refs. [49,50], and the process matrix framework of Refs. [57,58]. If the source and target resources are denoted by $R$ and $R'$, respectively, and the free operation is denoted by $\tau$, we represent Eq. (3) as
$$R' = \tau \bullet R, \tag{4}$$
where $\bullet$ is a particular instance of the link product of Ref. [49].
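The action of an operation on a box, as described above, is a contraction over the source box's settings and outcomes. The sketch below illustrates this in plain Python for binary variables. The dictionary layout, the name `apply_operation`, and the choice of a Popescu-Rohrlich box as the source are assumptions made for illustration only; the example operation merely flips the left outcome and passes the settings through unchanged.

```python
from itertools import product

BITS = (0, 1)

def apply_operation(op, box):
    """Return the target box P(x2,y2|s2,t2) obtained by summing
    op[(x2, y2, s, t, x, y, s2, t2)] * box[(x, y, s, t)] over x, y, s, t."""
    return {
        (x2, y2, s2, t2): sum(
            op[(x2, y2, s, t, x, y, s2, t2)] * box[(x, y, s, t)]
            for x, y, s, t in product(BITS, repeat=4)
        )
        for x2, y2, s2, t2 in product(BITS, repeat=4)
    }

# Illustrative source resource: P(xy|st) = 1/2 when x XOR y equals s AND t.
pr_box = {
    (x, y, s, t): 0.5 if (x ^ y) == (s & t) else 0.0
    for x, y, s, t in product(BITS, repeat=4)
}

# Illustrative operation: s = s2, t = t2, x2 = 1 - x, y2 = y (deterministic).
flip_left = {
    (x2, y2, s, t, x, y, s2, t2):
        1.0 if (s == s2 and t == t2 and x2 == 1 - x and y2 == y) else 0.0
    for x2, y2, s, t, x, y, s2, t2 in product(BITS, repeat=8)
}

target = apply_operation(flip_left, pr_box)
```

In this example the target box satisfies the flipped winning condition, i.e. it assigns probability 1/2 exactly when $x' \oplus y' = s't' \oplus 1$.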
On the left wing, the most general local pre-processing takes as input the setting variable of the target resource ($S'$) and the variable originating from the common cause, and it generates as output the setting variable of the source resource ($S$) as well as an arbitrary variable which propagates down the side-channel. The most general post-processing on the left wing takes as input the outcome variable of the source resource ($X$) and the side-channel variable, and it generates as output the outcome variable of the target resource ($X'$). Included as special cases among these pre- and post-processings are maps from $S'$ to $S$ and from $X$ to $X'$ that constitute relabellings, coarse-grainings, or fine-grainings of the variable, where the possibilities are constrained by the cardinalities of these variables. Also included as special cases are instances where the map from $S'$ to $S$ or the map from $X$ to $X'$ is chosen probabilistically, and instances where these two maps are correlated (by making use of the side-channel). The analogous pre- and post-processings at the right wing are also possible. Finally, the choices of maps on the left can also be correlated with the choices of maps on the right, by leveraging the common cause.
Note also that free operations can change the cardinality of a given box, which is reflected in the fact that we have not restricted the cardinalities of $X'$, $Y'$, $S'$, or $T'$ in any way. Thus, free operations can change the type of a resource.
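The free operations are characterized by those $P_{X'Y'ST|XYS'T'}$ which can be achieved via the type of circuit fragment depicted in Fig. 3, namely, those such that
$$P_{X'Y'ST|XYS'T'}(x'y'st|xys't') = \sum_{\lambda_A \lambda_B} P_{X'S|XS'\Lambda_A}(x's|xs'\lambda_A)\, P_{Y'T|YT'\Lambda_B}(y't|yt'\lambda_B)\, P_{\Lambda_A\Lambda_B}(\lambda_A\lambda_B) \tag{5}$$
for some joint distribution $P_{\Lambda_A\Lambda_B}$ and for some $P_{X'S|XS'\Lambda_A}$ and $P_{Y'T|YT'\Lambda_B}$ satisfying the no-retrocausation conditions
$$P_{S|XS'\Lambda_A} = P_{S|S'\Lambda_A}, \qquad P_{T|YT'\Lambda_B} = P_{T|T'\Lambda_B}. \tag{6}$$
One can directly check that any $P_{X'Y'ST|XYS'T'}$ admitting of a decomposition as in Eq. (5) satisfies the operational no-signalling constraints
$$P_{X'S|XYS'T'} = P_{X'S|XS'}, \qquad P_{Y'T|XYS'T'} = P_{Y'T|YT'}, \tag{7}$$
and the operational no-retrocausation conditions
$$P_{S|XS'} = P_{S|S'}, \qquad P_{T|YT'} = P_{T|T'}. \tag{8}$$
The parts of the circuit fragment in Fig. 3 that are associated to $P_{X'S|XS'\Lambda_A}$ and $P_{Y'T|YT'\Lambda_B}$ we refer to as local operations. The part associated to $P_{\Lambda_A\Lambda_B}$ corresponds to a joint distribution on the variables distributed to the two wings and can therefore be conceived of as shared randomness. Consequently, the free operations we are endorsing here can indeed be described as local operations and shared randomness (LOSR), as noted earlier.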

Definition 2. An operation is in the set LOSR (and termed an LOSR operation) if and only if it is associated to a conditional probability distribution $P_{X'Y'ST|XYS'T'}$ that admits of the sort of decomposition specified by Eqs. (5) and (6).
Previous resource-theoretic approaches to Bell-inequality violations have also endorsed the intuitive notion that local operations supplemented with shared randomness should constitute the free operations. Different works, however, have made different proposals for how this notion ought to be formalized. The correct formalization, in our opinion, is the one provided in Geller and Piani [17] and independently in de Vicente [16], which coincides with the one given above.$^{18}$ Therefore, in this article we are endorsing the proposal of Refs. [16,17] to take LOSR as the free operations. On the other hand, Refs. [18,20,21] have formalized the notion of local operations supplemented with shared randomness differently, defining a strict subset of the set LOSR defined above (a subset that can be shown to be nonconvex). Nonetheless, we believe that this discrepancy was an oversight and that it is unlikely anyone would defend taking this subset rather than the full set to define the resource theory. We discuss the issue in depth in Appendix A.2.
As a final comment, note that, without loss of generality, we can take the joint distribution to be
$$P_{\Lambda_A\Lambda_B}(\lambda_A\lambda_B) = \sum_\lambda \delta_{\lambda_A,\lambda}\, \delta_{\lambda_B,\lambda}\, P_\Lambda(\lambda)$$
for some distribution $P_\Lambda$, and hence express Eq. (5) as
$$P_{X'Y'ST|XYS'T'}(x'y'st|xys't') = \sum_\lambda P_{X'S|XS'\Lambda}(x's|xs'\lambda)\, P_{Y'T|YT'\Lambda}(y't|yt'\lambda)\, P_\Lambda(\lambda). \tag{9}$$
As a consequence, the conditional probability distribution $P_{X'Y'ST|XYS'T'}$ can be conceptualized as the more familiar object $P_{\tilde X \tilde Y|\tilde S \tilde T}$ for setting variables $\tilde S$, $\tilde T$ and outcome variables $\tilde X$, $\tilde Y$ that are defined as follows. We take the composite of the outputs of the circuit fragment on the left wing, $X'$ and $S$, as a composite outcome variable $\tilde X$, so that $\tilde X := (X', S)$. Similarly, we take the composite of the inputs on the left wing, $X$ and $S'$, as a composite setting variable $\tilde S$, so that $\tilde S := (X, S')$. Making the analogous definitions for $\tilde Y$ and $\tilde T$ in terms of $Y, T, Y', T'$ on the right wing, Eq. (9) can be rewritten as
$$P_{\tilde X \tilde Y|\tilde S \tilde T}(\tilde x \tilde y|\tilde s \tilde t) = \sum_\lambda P_{\tilde X|\tilde S \Lambda}(\tilde x|\tilde s \lambda)\, P_{\tilde Y|\tilde T \Lambda}(\tilde y|\tilde t \lambda)\, P_\Lambda(\lambda). \tag{10}$$
Recalling Eq. (2), it is clear that $P_{\tilde X \tilde Y|\tilde S \tilde T}$ satisfies all of the Bell inequalities. This illustrates the consistency of our proposal for the free operations, for we have just shown that the free operations on a resource $P_{XY|ST}$ are those that are achieved by taking a link product [49]
18 The definition of LOSR given in Geller and Piani [17] is very similar to the one provided here (see Fig. 4 therein), while the one provided in de Vicente [16] is much more cumbersome.

Locally deterministic operations and local symmetry operations
It is valuable to consider two special finite-cardinality subsets of LOSR operations: those that are deterministic and those that are invertible. Note that the invertible LOSR operations are included among the deterministic ones because any indeterminism in the operation would be an obstacle to invertibility.

it is a locally deterministic operation) if and only if the conditional probabilities
Deterministic LOSR operations (i.e., LDO operations) factorize in the sense that every LDO operation can be expressed as the product of two local deterministic operations such that
$$P^{\rm det}_{X'Y'ST|XYS'T'} = P^{\rm det}_{X'S|XS'}\, P^{\rm det}_{Y'T|YT'}. \tag{11}$$
This follows from the fact that the deterministic dependences preclude any dependence on the shared random variables $\lambda_A$ and $\lambda_B$ in Eq. (5), which then reduces to Eq. (11). Furthermore, the no-retrocausation assumption of Eq. (8) implies that these deterministic dependencies are of the following form:
$$P^{\rm det}_{X'S|XS'}(x's|xs') = \delta_{s, f_A(s')}\, \delta_{x', g_A(x,s')}, \qquad P^{\rm det}_{Y'T|YT'}(y't|yt') = \delta_{t, f_B(t')}\, \delta_{y', g_B(y,t')}, \tag{12}$$
for some functions $f_A$, $g_A$, $f_B$ and $g_B$. Specifically, on the left wing, $S$ is generated deterministically as a function of $S'$ (the pre-processing) and $X'$ is generated deterministically as a function of $X$ and $S'$ (the post-processing, which is setting-dependent), and similarly for the right wing. A generic bipartite locally deterministic operation is depicted in Fig. 4.
The cardinality of the set LDO for a given type can be easily deduced. Let $|S|, |X|, \ldots$ denote the cardinalities of the variables $S, X, \ldots$. The total number of possibilities for the function $g_A$ is $|X'|^{|X| \cdot |S'|}$, and the total number of possibilities for the function $f_A$ is $|S|^{|S'|}$, so that the total number of possibilities for a deterministic operation on the left wing is $\left( |S| \cdot |X'|^{|X|} \right)^{|S'|}$. An analogous decomposition holds for the deterministic operations on the right wing, and the total number of possibilities for these is $\left( |T| \cdot |Y'|^{|Y|} \right)^{|T'|}$. Consequently, the cardinality of the set LDO in this bipartite case is
$$|{\rm LDO}| = \left( |S| \cdot |X'|^{|X|} \right)^{|S'|} \left( |T| \cdot |Y'|^{|Y|} \right)^{|T'|}. \tag{13}$$
The other important subset of LOSR are those type-preserving operations which are invertible (and hence also deterministic). We refer to this subset of LOSR operations as the local symmetry operations and denote it LSO. Every local symmetry operation, $P^{\rm sym}_{X'Y'ST|XYS'T'}$, has the form of a locally deterministic operation, $P^{\rm det}_{X'Y'ST|XYS'T'}$, specified in Eqs. (11)-(12).
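That is,
$$P^{\rm sym}_{X'Y'ST|XYS'T'} = P^{\rm sym}_{X'S|XS'}\, P^{\rm sym}_{Y'T|YT'}, \tag{14}$$
where
$$P^{\rm sym}_{X'S|XS'}(x's|xs') = \delta_{s, f_A(s')}\, \delta_{x', g_A(x,s')}, \qquad P^{\rm sym}_{Y'T|YT'}(y't|yt') = \delta_{t, f_B(t')}\, \delta_{y', g_B(y,t')}, \tag{15}$$
but where $f_A$, $g_A$ are such that $P^{\rm sym}_{X'S|XS'}$ defines an invertible map from $(X, S')$ to $(X', S)$, and where $f_B$ and $g_B$ are such that $P^{\rm sym}_{Y'T|YT'}$ defines an invertible map from $(Y, T')$ to $(Y', T)$. Unlike general LDO operations, LSO operations are always type-preserving, and hence the type $\begin{pmatrix} |X'| & |Y'| \\ |S'| & |T'| \end{pmatrix}$ always matches the type $\begin{pmatrix} |X| & |Y| \\ |S| & |T| \end{pmatrix}$. Note that an exchange of the parties is a symmetry operation (i.e., invertible), but it cannot be implemented by local operations, and so it is not part of LSO.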
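The counting argument for locally deterministic operations given above can be verified by brute-force enumeration for small types. The sketch below is a plain-Python illustration; the names `local_deterministic_ops` and `wing_count`, and the encoding of $f$ as a tuple and $g$ as a dictionary, are choices made here rather than anything fixed by the text.

```python
from itertools import product

def local_deterministic_ops(n_s, n_x, n_s2, n_x2):
    """Yield every pair (f, g) describing a deterministic operation on one wing.

    f is a tuple with f[s2] = s   (pre-processing: target setting -> source setting).
    g is a dict with g[(x, s2)] = x2   (post-processing: source outcome and
    target setting -> target outcome).
    Arguments are the cardinalities |S|, |X|, |S'|, |X'|.
    """
    g_inputs = list(product(range(n_x), range(n_s2)))
    for f in product(range(n_s), repeat=n_s2):
        for g_vals in product(range(n_x2), repeat=len(g_inputs)):
            yield f, dict(zip(g_inputs, g_vals))

def wing_count(n_s, n_x, n_s2, n_x2):
    """Closed-form count (|S| * |X'|^|X|)^|S'| for a single wing."""
    return (n_s * n_x2 ** n_x) ** n_s2

# Binary settings and outcomes on both the source and the target box.
left = sum(1 for _ in local_deterministic_ops(2, 2, 2, 2))
total = left * left  # the right wing contributes an identical factor here
print(left, total)   # 64 4096
```

Because the count grows exponentially in the cardinalities, explicit enumeration of this kind is only practical for small types.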
As a final remark, notice that the set of LSO operations forms a group. This follows from the fact that the properties of being deterministic and invertible persist under composition, and that the inverse of every LSO operation is in LSO. This group is generated by the permutations of the value of a setting variable, and the permutations of the value of an outcome variable, where the choice of the latter permutation might depend also on the value of the setting variable on the same wing.
In the bipartite case, the LSO group is a finite group of order$^{19}$
$$|S|!\, \left( |X|! \right)^{|S|}\; |T|!\, \left( |Y|! \right)^{|T|}, \tag{16}$$
corresponding to the $|S|!$ relabelings for the settings of the left wing, multiplied by the $|X|!$ relabelings of outcomes for each of the $|S|$ different settings, and similarly for the right wing. The group can be generated by the relabelings of only adjacent settings or outcomes, and hence the LSO group admits a natural representation in terms of $(|S|-1) + |S|(|X|-1) + (|T|-1) + |T|(|Y|-1)$ generators. For a concrete example, consider the operations transforming type $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$ into type $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$. Throughout this work, we index the values a variable $X$ can take as $x \in \{0, \ldots, |X| - 1\}$. Accordingly, in the $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$ scenario, $X, Y, S, T$ take values in $\{0, 1\}$. Using this notation, the group of LSO can be generated explicitly by the four operations which interconvert where $\oplus$ denotes summation modulo two.$^{20}$ One can readily verify [60] that the order of this group is 64.
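The order of the LSO group can be cross-checked by enumeration. The sketch below, in plain Python, counts the type-preserving deterministic operations on one wing whose induced map from $(X, S')$ to $(X', S)$ is a bijection, and compares the result with the counting argument above (setting relabelings times outcome relabelings per setting). The function names are illustrative, and `lso_order_formula` encodes our reading of that counting argument rather than a formula quoted from the text.

```python
from itertools import product
from math import factorial

def local_symmetry_ops(n_s, n_x):
    """Yield pairs (f, g) on one wing for which (x, s') -> (g(x, s'), f(s'))
    is a bijection. n_s and n_x are the setting and outcome cardinalities,
    which are the same for source and target since LSO is type-preserving."""
    inputs = list(product(range(n_x), range(n_s)))  # all pairs (x, s')
    for f in product(range(n_s), repeat=n_s):
        for g_vals in product(range(n_x), repeat=len(inputs)):
            g = dict(zip(inputs, g_vals))
            images = {(g[(x, s2)], f[s2]) for x, s2 in inputs}
            if len(images) == len(inputs):  # injective on a finite set
                yield f, g

def lso_order_formula(n_s, n_x, n_t, n_y):
    """|S|! (|X|!)^|S| * |T|! (|Y|!)^|T|."""
    return (factorial(n_s) * factorial(n_x) ** n_s
            * factorial(n_t) * factorial(n_y) ** n_t)

per_wing = sum(1 for _ in local_symmetry_ops(2, 2))
print(per_wing, per_wing ** 2, lso_order_formula(2, 2, 2, 2))  # 8 64 64
```

For binary settings and outcomes the enumeration gives 8 invertible operations per wing, and hence 64 for the bipartite group, in agreement with the order stated above.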
Suppose that a resource $R$ is represented as a real-valued vector $\vec{R}$ of conditional probabilities $P_{XY|ST}(xy|st)$, or any linear transformation thereof (such as the representation in terms of correlators used in Section 6). LSO operations act as invertible linear maps on such a representation. Assuming $f$ is a linear function over $\vec{R}$, then its action can be represented as $f(\vec{R}) = \vec{f} \cdot \vec{R}$ for some $\vec{f}$. Hence, it is equally meaningful to speak about $\vec{f}$ being transformed under LSO group elements as it is to speak about $\vec{R}$ being so transformed. The action of an LSO operation on $\vec{f}$ can be thought of as applying the inverse transformation to $\vec{R}$. Note that many type-changing LOSR operations are equally well-defined as transformations on linear functions. The critical requirement is that the operation be left-invertible, i.e., it should act as an injective function on the set of conditional probabilities. See Refs. [59,61,62] for discussions on the topic of converting linear functions (and Bell inequalities in particular).
19 The order of a group is the cardinality of the set of group elements, i.e., the order of the LSO group quantifies the total number of invertible LDO operations.
20 A second generating set of operations for this group is given by $\tau_1, \ldots, \tau_6$ defined in Proposition 11.

Convexity of the set of free operations
We now show that the set of free operations is convex, and that the extremal elements are deterministic, and enumerable for fixed type of the source resource and of the target resource. This implies that the set of free operations mapping from a given source resource type to a given target resource type is a polytope. We begin by proving convexity.
This follows from the fact that the resources required to achieve such a mixing are achievable using LOSR. Suppose $\beta$ is a binary variable that decides whether $\tau_0$ or $\tau_1$ will be implemented. It suffices to imagine that $\beta$ is sampled from a distribution $P_\beta$ where $P_\beta(0) = w$, that it is copied and distributed to both wings (with a copy sent down the side-channel at each wing), and that the local processings that are implemented on each wing are made to depend on $\beta$ (chosen so that if $\beta = b$, then $\tau_b$ is implemented overall). Because $\beta$ can be incorporated into the definition of the shared randomness, the procedure just described is itself achievable using LOSR.
The convexity of the set of LOSR operations is crucial for the technique we develop in the next section to answer questions about resource conversion. Recognizing the full potential of this convexity is one of the key contributions of our work. In Appendix A.2, we discuss convexity further, in particular noting that previous formulations of LOSR did not seem to recognize the physical realizability of convex mixing within LOSR, but rather imposed convexity mathematically.
Next, we highlight features of the extremal free operations. This proposition is a minor generalization of Fine's argument, since the latter states that locally deterministic models can generate any conditional distribution that arises in a locally indeterministic model [63]. As in Fine's argument, here too any indeterminism in the local operations can be absorbed into the shared randomness, and hence allowing indeterministic local operations provides no more generality than considering only deterministic local operations.
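Proof. It suffices to run Fine's argument for the composite variables $\tilde S$, $\tilde T$, $\tilde X$ and $\tilde Y$. To see this explicitly, note that the constituent factors in the expression for an LOSR operation in Eq. (10) can be rewritten as
$$P_{\tilde X|\tilde S \Lambda}(\tilde x|\tilde s \lambda) = \sum_{\lambda_A} P^{{\rm det},\lambda_A}_{\tilde X|\tilde S}(\tilde x|\tilde s)\, P_{\Lambda_A|\Lambda}(\lambda_A|\lambda), \qquad P_{\tilde Y|\tilde T \Lambda}(\tilde y|\tilde t \lambda) = \sum_{\lambda_B} P^{{\rm det},\lambda_B}_{\tilde Y|\tilde T}(\tilde y|\tilde t)\, P_{\Lambda_B|\Lambda}(\lambda_B|\lambda), \tag{17}$$
where for each value of $\lambda_A$, the conditional $P^{{\rm det},\lambda_A}_{\tilde X|\tilde S}$ describes a deterministic operation on the left wing specifying the value of $\tilde X = (X', S)$ for every value of $\tilde S = (X, S')$, and similarly for $P^{{\rm det},\lambda_B}_{\tilde Y|\tilde T}$ on the right wing. Plugging these back into Eq. (10), we have that
$$P_{\tilde X \tilde Y|\tilde S \tilde T}(\tilde x \tilde y|\tilde s \tilde t) = \sum_{\lambda_A \lambda_B} P^{{\rm det},\lambda_A}_{\tilde X|\tilde S}(\tilde x|\tilde s)\, P^{{\rm det},\lambda_B}_{\tilde Y|\tilde T}(\tilde y|\tilde t)\, P_{\Lambda_A\Lambda_B}(\lambda_A\lambda_B), \tag{18}$$
with $P_{\Lambda_A\Lambda_B}(\lambda_A\lambda_B) = \sum_\lambda P_{\Lambda_A|\Lambda}(\lambda_A|\lambda)\, P_{\Lambda_B|\Lambda}(\lambda_B|\lambda)\, P_\Lambda(\lambda)$. Eq. (18) shows that a generic indeterministic LOSR operation can always be decomposed into a convex combination of products of deterministic operations on each wing, and hence the convexly extremal LOSR operations are precisely the LDO operations.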
What we have shown above is that any element of LOSR admits of a convex decomposition into elements of LDO. This implies the following useful geometric fact: The set of all free operations of a given type is a polytope whose vertices are the locally deterministic operations of that type. The number of vertices of this polytope corresponds to the cardinality of the set of LDO operations, as given in Eq. (13).

Resource theory preliminaries
A central question in any resource theory is whether one resource can be converted to another via the free operations. Many notions of conversion are studied: single-copy deterministic conversion, single-copy indeterministic conversion (where the probability of success need only be nonzero), multi-copy conversion (where one is given more than one copy of the resource), asymptotic conversion (where one is given arbitrarily many copies), and catalytic conversion (where one has access to another resource that must be returned intact after the conversion). We here focus on single-copy deterministic conversion.
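As noted earlier, we denote the application of an operation $\tau$ to a resource $R$ by $\tau \bullet R$. If $R_1$ can be converted to $R_2$ by free operations, one writes $R_1 \longrightarrow R_2$. If one can determine, for any pair of resources $R_1$ and $R_2$, whether $R_1$ can be converted to $R_2$ using a free operation, then one can determine the pre-order over all resources that is induced by the conversion relation. A pre-order, by definition, is a transitive and reflexive binary relation between resources. The conversion relation is reflexive because the identity operation is free and maps a resource to itself, while it is transitive because if $R_1 \longrightarrow R_2$ and $R_2 \longrightarrow R_3$, then $R_1 \longrightarrow R_3$, since the composition of two free operations is itself free. There are four possible ordering relations that might hold between a pair of resources: $R_1$ is strictly above $R_2$ if $R_1 \longrightarrow R_2$ and $R_2 \not\longrightarrow R_1$; $R_1$ is strictly below $R_2$ if $R_2 \longrightarrow R_1$ and $R_1 \not\longrightarrow R_2$; $R_1$ is equivalent to $R_2$ if both conversions are possible; and $R_1$ is incomparable to $R_2$ if neither conversion is possible.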
If R 1 is either strictly above or strictly below R 2 , we say that R 1 and R 2 are strictly ordered.
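The four ordering relations can be expressed as a small decision procedure over any conversion predicate. The sketch below assumes the usual convention that a resource is strictly above another when it can be freely converted to it but not conversely. The toy pre-order (sets ordered by inclusion, where discarding elements plays the role of a free operation) is a hypothetical stand-in and not the conversion relation of common-cause boxes.

```python
def ordering_relation(converts, r1, r2):
    """Classify the pair (r1, r2) given a predicate converts(a, b) that is
    True exactly when a can be converted to b by a free operation."""
    forward = converts(r1, r2)
    backward = converts(r2, r1)
    if forward and backward:
        return "equivalent"
    if forward:
        return "strictly above"
    if backward:
        return "strictly below"
    return "incomparable"

def toy_converts(r1, r2):
    """Hypothetical toy relation: r1 converts to r2 iff r2 is a subset of r1."""
    return r2 <= r1

a, b, c = frozenset({1, 2}), frozenset({1}), frozenset({2})
print(ordering_relation(toy_converts, a, b))  # strictly above
print(ordering_relation(toy_converts, b, c))  # incomparable
```

Any algorithm that decides convertibility between two boxes could be passed in as `converts` in place of the toy predicate.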
We pause to comment on the notion of equivalence of resources. By definition, if $R_1$ is equivalent to $R_2$ then the conversion from one to the other is free in both directions, that is, $R_1 = \tau_1 \bullet R_2$ and $R_2 = \tau_2 \bullet R_1$ for some free operations $\tau_1$ and $\tau_2$. It need not be the case, however, that either of the free operations $\tau_1$ or $\tau_2$ is invertible, nor that one is the inverse of the other. For instance, if $R_1$ and $R_2$ are both free resources, then $\tau_1$ can be the operation which discards $R_2$ and prepares $R_1$, while $\tau_2$ can be the operation which discards $R_1$ and prepares $R_2$.
The conversion relation between resources implies a corresponding conversion relation between equivalence classes of resources (relative to the equivalence relation defined above), wherein for any two equivalence classes, they are either strictly ordered or incomparable. The conversion relation between equivalence classes is therefore antisymmetric and describes a partial order relation rather than a pre-order relation. One can therefore conceptualize the project of characterizing the pre-order as a characterization of the equivalence classes and of the partial order that holds among these. In this work, we do not provide a characterization of the equivalence classes, and so our focus will be on directly characterizing features of the pre-order of resources.

Global features of a pre-order
To have a complete understanding of deterministic single-copy conversion in a resource theory, one must have an understanding of the pre-order that this conversion relation defines. In this section, we describe some of the basic features that characterize pre-orders.
Perhaps the most basic question about a pre-order of resources is whether or not it is totally preordered, meaning that every pair of elements in the pre-order is strictly ordered or equivalent (i.e., the pre-order has no incomparable elements). Equivalently, we say that a pre-order is totally pre-ordered if and only if the partial order over equivalence classes that it defines is totally ordered (i.e., has no incomparable elements).
If there do exist incomparable resources, one can ask if the binary relation of incomparability is transitive, in which case the pre-order is termed weak.
A chain is a subset of the pre-order in which every pair of elements is strictly ordered. The height of a pre-order is the cardinality of the largest chain contained therein. An antichain is a subset of the pre-order in which every pair of elements is incomparable. The width of a pre-order is the cardinality of the largest antichain contained therein.
Other important properties of the pre-order refer to the interval between a pair of resources, where R is in the interval of R 1 and R 2 if and only if both R 1 −→ R and R −→ R 2 . If the number of equivalence classes which lie in the interval between a pair of resources is finite for every pair of inequivalent resources, then the pre-order is said to be locally finite, otherwise it is said to be locally infinite.

Features of resource monotones
A resource monotone is a function over resources whose value cannot increase under any free operation in the resource theory. Formally,

Definition 8. A function $M$ from resources to the reals is called a resource monotone if and only if

$$R_1 \longrightarrow R_2 \quad\text{implies}\quad M(R_1) \ge M(R_2).$$

In other words, a resource monotone is an order-preserving map from the pre-order of resources to the total order of real numbers. Whenever some monotone $M$ and a pair of resources $R_1$ and $R_2$ satisfy $M(R_1) < M(R_2)$, we will say that the monotone $M$ witnesses the fact that $R_1 \not\longrightarrow R_2$.
If the pre-order is not totally pre-ordered (i.e., if there exist incomparable resources), then no single monotone can completely characterize the pre-order. A complete characterization may be achieved, however, by a family of monotones. Specifically, a family of monotones $\{M_i\}_i$ is said to be complete if it completely characterizes the pre-order, that is, if

$$R_1 \longrightarrow R_2 \quad\text{if and only if}\quad M_i(R_1) \ge M_i(R_2) \ \text{ for all } i. \tag{21}$$

A complete set of monotones is therefore an alternative way of describing the pre-order.

Strictly speaking, monotones should be functions from resources of any type in the resource theory to the reals. However, many natural functions are only defined for particular types of resources. For instance, the function $P_{XY|ST}(00|00)\,P_{XY|ST}(11|01) + P_{XY|ST}(20|02)$ is only defined for common-cause boxes where the cardinalities of $X$ and $T$ are three. To accommodate this, we define the notion of a monotone relative to a set $S$: $M$ is a monotone relative to a set $S$ of resources if and only if for all $R_1, R_2 \in S$,

$$R_1 \longrightarrow R_2 \quad\text{implies}\quad M(R_1) \ge M(R_2).$$

If $S$ is any set of resources all of which are of a particular type, a monotone relative to $S$ is said to be type-specific.

Monotone constructions for any resource theory
Here we review a variety of approaches to constructing resource monotones. We will make use of these versatile constructions to define an especially useful pair of monotones for the resource theory of common-cause boxes in Section 6.

Cost and yield monotones
It is possible to upgrade a type-specific monotone to a type-independent monotone using either a cost construction or a yield construction. In fact, a cost or yield construction takes any function (monotone or not) together with a set of resources and induces a type-independent monotone from it, as follows.
Given any function $f$ which maps some set $S$ of resources to real numbers, one can define associated monotones which are applicable to all resources, as follows:

$$M[f\text{-yield}, S](R) := \max_{R^\star \in S} \{\, f(R^\star) : R \longrightarrow R^\star \,\}, \tag{23}$$

$$M[f\text{-cost}, S](R) := \min_{R^\star \in S} \{\, f(R^\star) : R^\star \longrightarrow R \,\}.$$

If there does not exist any $R^\star \in S$ such that $R \longrightarrow R^\star$, then the yield is defined to be $-\infty$. Similarly, if there does not exist any $R^\star \in S$ such that $R^\star \longrightarrow R$, then the cost is defined as $\infty$ [64].

In words, $M[f\text{-yield}, S]$ is a monotone which asks for the most valuable resource in the set $S$ (as measured by the function $f$) that one can create from the given resource $R$.$^{21}$ Likewise, $M[f\text{-cost}, S]$ is a monotone which asks for the least valuable resource in the set $S$ (as measured by the function $f$) that one can use to create the given resource $R$. Note that in both cases, many different functions may yield the same monotone, so there is a conventional element to one's choice of function. Note also that $S$ may be restricted to resources of a particular type (in which case $f$ need only be defined on resources of that type), and yet the type of the resource $R$ for which the monotones may be evaluated is unrestricted.

$^{21}$ The maximum of a function $f$ over the set of boxes to which $R$ can be converted can also be thought of as the performance of $R$ over the so-called 'nonlocal game' defined by the 'payoff function' $f$. Since the set of boxes to which $R$ can be converted (of any given type) is a polytope, it follows that all forbidden conversions (those from $R$ to a resource outside the polytope) can be witnessed by a suitable set of payoff functions, namely, whatever linear functions pick out the facets of $R$'s polytope (for any given target type). In other words, any resource outside the polytope will attain a higher value on at least one of these functions. It follows, then, that the set of yield monotones induced by all possible linear functions constitutes a complete set of monotones. While this observation may not be useful in practice, it does pose an interesting contrast with the findings of Ref. [65]: for common-cause boxes, we find that 'nonlocal games' constitute a complete set of monotones, whereas Ref. [65] shows that for the resource theory of quantum states under LOSR it is semiquantum games instead of nonlocal games that form a complete set of monotones.
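The two constructions can be sketched over a finite candidate set. The snippet below is only a toy illustration under stated assumptions: the names `yield_monotone`, `cost_monotone`, and the `converts` oracle are hypothetical, and the toy "resource theory" takes resources to be integers with a → b if and only if a ≥ b. The function `f` is deliberately not a monotone, yet both induced functions are non-increasing under conversion.

```python
import math

def yield_monotone(f, candidates, converts):
    """M[f-yield, S](R): largest f-value among candidates reachable from R."""
    def monotone(resource):
        values = [f(c) for c in candidates if converts(resource, c)]
        return max(values) if values else -math.inf
    return monotone

def cost_monotone(f, candidates, converts):
    """M[f-cost, S](R): smallest f-value among candidates that can reach R."""
    def monotone(resource):
        values = [f(c) for c in candidates if converts(c, resource)]
        return min(values) if values else math.inf
    return monotone

# Toy model (an assumption made for illustration): a -> b iff a >= b.
def converts(a, b):
    return a >= b

S = [2, 4, 6]                      # hypothetical candidate set

def f(r):
    return (r - 4) ** 2            # f(2) = 4, f(4) = 0, f(6) = 4: not a monotone

m_yield = yield_monotone(f, S, converts)
m_cost = cost_monotone(f, S, converts)

# Along the chain 7 -> 5 -> 3 -> 1 neither induced function ever increases.
chain = [7, 5, 3, 1]
yields = [m_yield(r) for r in chain]   # [4, 4, 4, -inf]
costs = [m_cost(r) for r in chain]     # [inf, 4, 0, 0]
```

The empty-set conventions (−∞ for the yield, ∞ for the cost) are what keep both functions monotone at the ends of the chain.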

Weight and robustness monotones
Various functions have been used as measures of the distance of a resource from the set of classical common-cause boxes in previous work [16,17,22,66-69]. In what follows, we highlight some of these which are monotones in our resource theory.

The nonlocal fraction, which we denote here by $M_{\rm NF}$, is the minimum weight of the nonfree fraction in any convex decomposition of the resource,

$$M_{\rm NF}(R) := \min \{\, \lambda \in [0,1] : R = \lambda N + (1-\lambda) L,\ N \in \mathbf{S}_{[R]},\ L \in \mathbf{L}_{[R]} \,\}.$$

The nonlocal fraction was proven to be a resource monotone relative to (a superset of) the LOSR free operations in Ref. [16, Sec. 5.2], though it is there termed the 'EPR2' measure.

Next, there is the case of robustness measures,$^{22}$ which quantify the minimum weight of a resource from some particular class that must be added convexly with the original resource for the mixture to be free. The two robustness measures that we consider differ by the class of resources that are mixed with the original resource. The first, which we denote by $M_{{\rm RBST},L}(R)$, considers mixing the original resource $R$ with any element in the set $\mathbf{L}_{[R]}$ of free resources of the same type:

$$M_{{\rm RBST},L}(R) := \min_{L \in \mathbf{L}_{[R]}} \{\, \lambda \in [0,1] : \lambda L + (1-\lambda) R \in \mathbf{L}_{[R]} \,\}.$$

This robustness measure was shown to be a resource monotone relative to LOSR in Ref. [17, Sec. 3]. The second robustness measure, which we denote simply by $M_{\rm RBST}(R)$, considers mixing the original resource $R$ with any element in the set $\mathbf{S}_{[R]}$ of all resources of the same type:

$$M_{\rm RBST}(R) := \min_{R' \in \mathbf{S}_{[R]}} \{\, \lambda \in [0,1] : \lambda R' + (1-\lambda) R \in \mathbf{L}_{[R]} \,\}.$$

The unified resource theory formalism of Ref. [64] implies that all three of these distance measures are resource monotones in any resource theory wherein the free operations act convexly,$^{23}$ including our resource theory here. Additionally, in Corollary 13, we show that each of these three distance measures can be explicitly related to a monotone for which we provide a closed-form expression relative to ( 2 2 2 2 ) -type resources. By extension, we therefore also provide closed-form expressions for these three distance measures relative to ( 2 2 2 2 ) -type resources.

$^{22}$ Note that in Ref. [69] these were termed 'visibilities'.

A linear program for determining the ordering of any pair of resources
Next, we provide a linear program which allows one to determine the ordering relation that holds between any two resources in our enveloping theory.
To do so, it is convenient to set up some useful notation.

Definition 9.
Let the bold symbol S refer to any set of resources. We use subscripts to specify the type of the resources in the set, such as S ( |X| |Y | |S| |T | ) or S [R] . We use superscripts to specify further properties of a set. For example, the set of all GPT-realizable common-cause boxes is denoted by S G , the set of all nonfree resources is denoted by S nonfree , and the set of all free resources is denoted by S free . Whenever we wish to emphasize that a specific set is discrete, we denote it V, and whenever we wish to emphasize that a specific set is a polytope, we denote it P.
From Propositions 5 and 7, and the finite cardinality of the set of extremal LOSR operations of any given type, the set of resources obtainable from a given resource is a convex set with a finite number of vertices, and hence is a polytope:

Proposition 10 (The polytope of resources obtainable from a given resource by LOSR). The set of all resources of type $[R_2]$ obtainable from $R_1$ by LOSR forms a polytope, namely the convex hull of the images of $R_1$ under the extremal LOSR operations of type $[R_1] \to [R_2]$.

We can express the content of Proposition 10 equivalently as follows: $R_1 \longrightarrow R_2$ if and only if $R_2$ lies in this convex hull. Therefore, to determine whether $R_1$ is higher than $R_2$ in the pre-order of resources, it suffices to implement the following computational test: decide, by means of a linear program, whether $R_2$ can be written as a convex combination of the finitely many images of $R_1$ under the extremal operations.

To determine which of the four possible ordering relations holds for a given pair of resources, $R_1$ and $R_2$, it suffices to determine whether $R_1 \longrightarrow R_2$ or not and whether $R_2 \longrightarrow R_1$ or not. This requires just two instances of the linear program.$^{24}$

According to Proposition 10, the image of a resource under the set of all LOSR free operations is equivalent to the convex closure of the image of the resource under only the extremal operations. Replacing the set of all operations with only the extremal ones is a dramatic shortcut.

$^{23}$ An operation $\tau$ acts convexly if the image $\tau \circ (R_3)$ is a given mixture of $\tau \circ (R_1)$ and $\tau \circ (R_2)$ whenever the preimage $R_3$ is the same mixture of $R_1$ and $R_2$. All linear operations act convexly. Convex action should not be confused with convex closure of the operations, which was the subject of Proposition 5.
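The enumeration step behind Proposition 10 can be sketched for ( 2 2 2 2 ) -type resources. The sketch below rests on an assumption that is not spelled out in this section: that the extremal LOSR operations between ( 2 2 2 2 ) -type boxes are products of deterministic one-wing maps, each of which picks the old setting as a function of the new setting and relabels the outcome as a function of the new setting and the old outcome. All function names are hypothetical. The code only produces the finite list of candidate vertices; deciding whether a target box lies in their convex hull is then a linear feasibility problem that can be handed to any LP solver (not shown).

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
KEYS = list(product(BITS, repeat=4))  # (x, y, s, t): outcomes x, y; settings s, t

def pr_box():
    """Canonical PR box: x XOR y = s AND t, with uniform marginals."""
    return {(x, y, s, t): Fraction(int((x ^ y) == (s & t)), 2) for x, y, s, t in KEYS}

def local_maps():
    """All 64 deterministic one-wing maps (f, g): the new setting s2 selects
    the old setting f[s2], and the old outcome x becomes g[s2][x]."""
    return [(f, (g0, g1))
            for f in product(BITS, repeat=2)
            for g0 in product(BITS, repeat=2)
            for g1 in product(BITS, repeat=2)]

def apply_ops(box, alice, bob):
    """Image of `box` under a product of two deterministic one-wing maps."""
    (fa, ga), (fb, gb) = alice, bob
    image = dict.fromkeys(KEYS, Fraction(0))
    for x, y, s2, t2 in KEYS:
        image[(ga[s2][x], gb[t2][y], s2, t2)] += box[(x, y, fa[s2], fb[t2])]
    return image

def as_tuple(box):
    """Hashable representation of a box, in the fixed order of KEYS."""
    return tuple(box[k] for k in KEYS)

def images_under_extremal_ops(box):
    """Distinct images of `box`; their convex hull is the polytope of interest."""
    maps = local_maps()
    return {as_tuple(apply_ops(box, a, b)) for a, b in product(maps, repeat=2)}

images = images_under_extremal_ops(pr_box())
```

Not every element of `images` need be a vertex of the resulting polytope; redundant points do no harm in a feasibility test. For a deterministic box the same routine returns only the sixteen deterministic boxes, as one would expect from a free resource.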
In principle, the linear program just described allows one to characterize the pre-order completely. For instance, this linear program defines a complete set of monotones for a given set of resources $S$, namely, $\{M_{R'} : R' \in S\}$, where the monotone $M_{R'}$ is defined as follows: for all $R \in S$, $M_{R'}(R) = 1$ if $R \to R'$ by LOSR and $M_{R'}(R) = 0$ otherwise. $M_{R'}(R)$ reports the answer returned by the linear program for the question of whether $R \to R'$ by LOSR, and if one has the answer for all $R' \in S$, then one has located $R$ within the pre-order. However, such a brute-force characterization of the pre-order requires one to apply the linear program to every pair of resources, which is not possible in practice. Rather, the linear program is primarily useful for answering questions about conversions among pairs (or finite sets) of resources.

$^{24}$ In the language of Ref. [70], these linear programs constitute a complete witness for conversion.
To characterize the full pre-order more generally, one would ideally have a finite set of resource monotones that characterize the pre-order completely. Furthermore, in order to determine certain global properties of the pre-order, such as those described earlier, knowledge of a few carefully chosen resource monotones will typically suffice. This is the strategy we will adopt hereafter in the article. Specifically, over the next few sections, we define a pair of resource monotones and use these to prove that the pre-order of single-copy deterministic conversion is not totally pre-ordered (i.e., there exist incomparable resources), that it is not weak (the incomparability relation is not transitive), that it has both infinite width and infinite height, and that it is locally infinite.

Two useful monotones
We will define two monotones, one a cost construction and the other a yield construction, where the sets of resources relative to which these costs and yields are evaluated (to be described below) contain only resources of type ( 2 2 2 2 ) . It is useful to first review some facts about the set of all common-cause boxes of type ( 2 2 2 2 ) , that is, about S G ( 2 2 2 2 ) .

Preliminary facts regarding CHSH inequalities and PR boxes
We adopt the convention of Ref. [71] of parametrizing common-cause boxes of type ( 2 2 2 2 ) in terms of outcome biases and two-point correlators. The outcome biases are

$$\langle A_s \rangle := \sum_{x} (-1)^{x} P_{X|S}(x|s), \qquad \langle B_t \rangle := \sum_{y} (-1)^{y} P_{Y|T}(y|t),$$

and the two-point correlators are

$$\langle A_s B_t \rangle := \sum_{x,y} (-1)^{x+y} P_{XY|ST}(xy|st).$$

Recalling that the set of common-cause boxes coincides with the set of no-signalling boxes in the Bell scenario, $\mathbf{S}^{G}_{(2\,2\,2\,2)}$ constitutes what is conventionally referred to as the "no-signalling" set for this type.$^{25}$ This set is well-known to be a polytope defined by 16 positivity inequalities [67,72]. The set of classical (free) resources of type ( 2 2 2 2 ) is a subset therein, conventionally termed the "local set", and is defined by the same 16 positivity inequalities together with eight additional facet-defining Bell inequalities, namely, the canonical CHSH inequality and its seven variants [73]. A resource is therefore nonclassical (nonfree) if and only if it violates a facet-defining Bell inequality.
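The eight variants of the canonical CHSH function are

$$\begin{aligned}
\mathrm{CHSH}_0 &:= +\langle A_0B_0\rangle + \langle A_0B_1\rangle + \langle A_1B_0\rangle - \langle A_1B_1\rangle, & \mathrm{CHSH}_4 &:= -\mathrm{CHSH}_0,\\
\mathrm{CHSH}_1 &:= +\langle A_0B_0\rangle - \langle A_0B_1\rangle + \langle A_1B_0\rangle + \langle A_1B_1\rangle, & \mathrm{CHSH}_5 &:= -\mathrm{CHSH}_1,\\
\mathrm{CHSH}_2 &:= +\langle A_0B_0\rangle + \langle A_0B_1\rangle - \langle A_1B_0\rangle + \langle A_1B_1\rangle, & \mathrm{CHSH}_6 &:= -\mathrm{CHSH}_2,\\
\mathrm{CHSH}_3 &:= -\langle A_0B_0\rangle + \langle A_0B_1\rangle + \langle A_1B_0\rangle + \langle A_1B_1\rangle, & \mathrm{CHSH}_7 &:= -\mathrm{CHSH}_3.
\end{aligned} \tag{31}$$

The canonical CHSH function is $\mathrm{CHSH}_0$, which we will sometimes denote simply as CHSH.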
In terms of these, the eight facet-defining Bell inequalities are

$$\mathrm{CHSH}_k(R) \le 2, \qquad k \in \{0, \ldots, 7\}.$$

Note that the regions defined by strict violation of each of the eight inequalities are nonoverlapping [67]. It follows that at most one of the eight CHSH inequalities can be violated by a given resource, i.e., for nonfree $R$ there is precisely one value of $k$ such that $\mathrm{CHSH}_k(R) > 2$.
There are eight extremal nonfree vertices of the full polytope S G ( 2 2 2 2 ) . One of these is the canonical PR box [51,74], denoted R PR and defined explicitly in Table 2; the other seven are variants of this PR box. For each k, we denote the associated variant of the PR-box by R PR,k (so that the canonical PR box is associated to k = 0, R PR = R PR,0 ). R PR,k is the unique resource that maximally violates the kth CHSH inequality, i.e., that achieves its algebraic maximum, CHSH k (R PR,k ) = 4.
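These facts are easy to check numerically. The sketch below evaluates the eight CHSH functions on a box stored as a dictionary of conditional probabilities. Two things are assumptions made here rather than taken from the text: the sign convention (outcome 0 counts as +1) and the labelling of the variants, which is chosen so that the canonical function carries the minus sign on $\langle A_1B_1\rangle$ and the variants transform under the setting swaps and the sign flip as described for Proposition 11. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
KEYS = list(product(BITS, repeat=4))  # (x, y, s, t)

# Signs on (<A0B0>, <A0B1>, <A1B0>, <A1B1>) for CHSH_0 .. CHSH_3 (assumed labelling);
# CHSH_{k+4} is minus CHSH_k.
SIGNS = [(1, 1, 1, -1), (1, -1, 1, 1), (1, 1, -1, 1), (-1, 1, 1, 1)]

def correlator(box, s, t):
    """Two-point correlator <A_s B_t>, with outcome 0 counted as +1."""
    return sum((-1) ** (x ^ y) * box[(x, y, s, t)] for x, y in product(BITS, repeat=2))

def chsh_values(box):
    """List [CHSH_0(box), ..., CHSH_7(box)]."""
    corr = [correlator(box, s, t) for s, t in product(BITS, repeat=2)]
    first_four = [sum(c * e for c, e in zip(signs, corr)) for signs in SIGNS]
    return first_four + [-v for v in first_four]

# Canonical PR box (x XOR y = s AND t) and the maximally mixed box.
pr = {(x, y, s, t): Fraction(int((x ^ y) == (s & t)), 2) for x, y, s, t in KEYS}
uniform = dict.fromkeys(KEYS, Fraction(1, 4))
half = Fraction(1, 2)
noisy = {k: half * pr[k] + half * uniform[k] for k in KEYS}

pr_values = chsh_values(pr)        # [4, 0, 0, 0, -4, 0, 0, 0]
noisy_values = chsh_values(noisy)  # [2, 0, 0, 0, -2, 0, 0, 0]
```

The PR box reaches the algebraic maximum of exactly one of the eight functions, and its equal mixture with the maximally mixed box saturates, but does not violate, the canonical inequality.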
Unsurprisingly, the variants of the facet-defining Bell inequalities are interconvertible under LSO, as are the variants of the extremal vertices. To illustrate this, it is convenient to factorize the ( 2 2 2 2 ) LSO group into a subgroup which stabilizes CHSH 0 and a subgroup which does not, as follows.

$^{25}$ However, as noted in Appendix A.3, for causal structures different from the Bell scenario, the set S of processes that can be realized by a GPT causal model on the causal structure is typically distinct from the no-signalling set.
Proposition 11. Consider the following invertible operations, i.e., elements of the LSO group for ( 2 2 2 2 ) -type resources:

Proof. The first two claims in Proposition 11 are readily verified by standard group theory algorithms [60]. The latter two claims become self-evident by explicitly examining the actions of the operations on expectation values (and hence, their action on resources or functions on resources), per Table 1.
In light of Table 1, the third claim is easily verified. The fourth claim simply captures the fact that the eight CHSH functions are related by LSO, and similarly the eight PR boxes are also interconvertible under LSO. We can explicitly show how the interconversions are accomplished by G 123 by describing the actions of {τ 1 , τ 2 , τ 3 } as permutations on the ordered set of CHSH functions, or equivalently, on the ordered set of PR boxes.
• τ 1 flips the sign of every correlator, so the action of τ 1 on the ordered set of CHSH functions is the permutation (0, 4)(1, 5)(2, 6)(3, 7).
• τ 2 exchanges the roles of A 0 and A 1 , so the action of τ 2 on the ordered set of CHSH functions is the permutation (0, 1)(2, 3)(4, 5)(6, 7).
• τ 3 exchanges the roles of B 0 and B 1 , so the action of τ 3 on the ordered set of CHSH functions is the permutation (0, 2)(1, 3)(4, 6)(5, 7).
Therefore the orbit of CHSH k under G 123 is easily checked to be {CHSH 0 , ..., CHSH 7 }, as claimed.
The ordered set of PR boxes transforms under LSO operations in exactly the same manner as the ordered set of CHSH functions, since the values of the marginals and the correlators for resource R PR,k coincide with the coefficients of the associated terms in the linear function CHSH k (compare, e.g., the expression for CHSH 0 in Eq. (31) with the values of the marginals and correlators for R PR in Table (2)). Hence, the argument just given also establishes that the orbit of R PR,k under G 123 is {R PR,0 , ..., R PR,7 }.

Defining the two useful monotones
Monotone 1: The yield of a resource with respect to the set of resources of type ( 2 2 2 2 ) , as measured by the CHSH function.
To define our first monotone, consider the canonical CHSH function

$$\mathrm{CHSH}(R) := \langle A_0B_0\rangle + \langle A_0B_1\rangle + \langle A_1B_0\rangle - \langle A_1B_1\rangle.$$

The CHSH function is type-specific$^{26}$ and furthermore is not a monotone [16]. However, we can apply the prescription of Eq. (23) to this function, taking the set $S$ to be $\mathbf{S}^{G}_{(2\,2\,2\,2)}$, i.e., the set of all common-cause boxes of type ( 2 2 2 2 ). Doing so, we define the following (type-independent) yield-based monotone, which we will denote by $M_{\rm CHSH}$:

$$M_{\rm CHSH}(R) := \max_{R^\star \in \mathbf{S}^{G}_{(2\,2\,2\,2)}} \{\, \mathrm{CHSH}(R^\star) : R \longrightarrow R^\star \,\}. \tag{33}$$

Note that one can always find some $R^\star \in \mathbf{S}^{G}_{(2\,2\,2\,2)}$ such that $R \longrightarrow R^\star$ regardless of the type or details of $R$, simply because free resources of type ( 2 2 2 2 ) may always be freely generated after discarding $R$. Hence, the value of this monotone is never less than 2, which is the maximum of the CHSH function when applied to the subset of free resources.

$^{26}$ The CHSH function is well-defined only for resources of type ( 2 2 2 2 ).
If one applies this procedure to any of the eight variants of the CHSH functions in Eq. (31), the monotones one thereby obtains all turn out to be equivalent to M CHSH . This follows from the fact that all variants of the CHSH function are interconvertible under LSO and therefore the maximum of any one in an optimization over all LOSR operations is the same as any other, as noted in Proposition 11d.

Monotone 2: The cost of a resource with respect to a set of noisy PR box resources, as measured by the CHSH function.
Our second monotone also involves optimizing the CHSH function, but it is a cost-based monotone, and the set of resources over which one optimizes is restricted to a particular one-parameter family of resources of type ( 2 2 2 2 ) (rather than the full set $\mathbf{S}^{G}_{(2\,2\,2\,2)}$). To define this family, we need to highlight a particular resource in the free set, which we denote $L^{b}_{\rm NPR}$.$^{27}$ $L^{b}_{\rm NPR}$ can be defined as the equal mixture of the PR box with the maximally mixed resource $L_{\emptyset}$ (defined in Table 2), namely $L^{b}_{\rm NPR} = \tfrac12 R_{\rm PR} + \tfrac12 L_{\emptyset}$, as enumerated in Table 2. The superscript $b$ in the notation $L^{b}_{\rm NPR}$ denotes the fact that this resource sits on the boundary of the free set, namely, that it saturates the canonical CHSH inequality, $\mathrm{CHSH}(L^{b}_{\rm NPR}) = 2$. The one-parameter family of resources defining our cost construction are the convex mixtures of $R_{\rm PR}$ and $L^{b}_{\rm NPR}$. We denote the set of these by $\mathbf{C}_{\rm NPR}$. Formally,

$$\mathbf{C}_{\rm NPR} := \{\, C(\alpha) : \alpha \in [0,1] \,\}, \quad\text{where}\quad C(\alpha) := \alpha R_{\rm PR} + (1-\alpha) L^{b}_{\rm NPR}.$$

We use "C" because the set of resources forms a chain (defined in Section 4.1) and "NPR" because each resource in the chain is a noisy version of the PR box. Geometrically, the chain $\mathbf{C}_{\rm NPR}$ describes a line segment of resources with endpoints $R_{\rm PR}$ and $L^{b}_{\rm NPR}$, and $\alpha$ parametrizes the distance from $C(\alpha)$ to $L^{b}_{\rm NPR}$ (the bottom of the chain). To see that the elements of $\mathbf{C}_{\rm NPR}$ do indeed form a chain in the partial order, it suffices to note that one can move downwards (decreasing $\alpha$) starting from any $C(\alpha)$ by mixing $C(\alpha)$ with $L^{b}_{\rm NPR}$, but one cannot move upwards (increasing $\alpha$) from any $C(\alpha)$, as doing so would require increasing the value of the monotone $M_{\rm CHSH}$.

$^{27}$ The use of $L$ instead of $R$ when describing the resource $L^{b}_{\rm NPR}$ is a nod to the conventional terminology wherein the classical common-cause boxes are often called the local boxes. See the discussion in the introduction for why we explicitly avoid the local-nonlocal terminology here.
Table 2 provides an explicit characterization of a generic resource on the chain, as well as its endpoints and the maximally-mixed free resource.
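Using this one-parameter family of resources, we define the following cost-based monotone, which we denote $M_{\rm NPR}$,

$$M_{\rm NPR}(R) := \min_{R^\star \in \mathbf{C}_{\rm NPR}} \{\, \mathrm{CHSH}(R^\star) : R^\star \longrightarrow R \,\}, \tag{36}$$

where if for some $R$ there is no $R^\star \in \mathbf{C}_{\rm NPR}$ such that $R^\star \longrightarrow R$, then we define $M_{\rm NPR}(R) = \infty$. Critically, note that the CHSH function is an injective (one-to-one) mapping from points on the line segment $\mathbf{C}_{\rm NPR}$ to the real numbers, with

$$\mathrm{CHSH}(C(\alpha)) = 4\alpha + 2(1-\alpha) = 2\alpha + 2.$$

Thus, the problem of minimizing the CHSH function over $R^\star \in \mathbf{C}_{\rm NPR}$ such that $R^\star \longrightarrow R$ is exactly the same as minimizing the function $2\alpha+2$ under the constraint $C(\alpha) \longrightarrow R$, that is,

$$M_{\rm NPR}(R) = \min_{\alpha \in [0,1]} \{\, 2\alpha + 2 : C(\alpha) \longrightarrow R \,\}.$$

For each variant $R_{{\rm PR},k}$ of the PR box, where $k \in \{0, \ldots, 7\}$, we can define the chain of noisy versions thereof, that is,

$$\mathbf{C}_{{\rm NPR},k} := \{\, C_k(\alpha) : \alpha \in [0,1] \,\}, \quad\text{where}\quad C_k(\alpha) := \alpha R_{{\rm PR},k} + (1-\alpha)\left(\tfrac12 R_{{\rm PR},k} + \tfrac12 L_{\emptyset}\right).$$

One can of course define a cost-based monotone for each such chain. However, all eight of these chains define the same monotone, because the local symmetry operations allow one to move among these, as a consequence of Proposition 11d and the fact that $L_{\emptyset}$ is stable under all ( 2 2 2 2 ) -type local symmetry operations.$^{28}$

Closed-form expressions for M CHSH and M NPR for ( 2 2 2 2 ) -type resources
The definitions of $M_{\rm CHSH}$ and $M_{\rm NPR}$ both involve an optimization over a continuous set of states. In this section, we derive closed-form expressions for these monotones for resources of type ( 2 2 2 2 ). Consider first $M_{\rm CHSH}$.

Proposition 12. For any resource $R$ of type ( 2 2 2 2 ),

$$M_{\rm CHSH}(R) = \begin{cases} \mathrm{CHSH}_k(R) & \text{if } \mathrm{CHSH}_k(R) > 2 \text{ for some } k \in \{0,\ldots,7\},\\ 2 & \text{otherwise.} \end{cases} \tag{39}$$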
Equivalently, each function CHSH k is a monotone relative to the subset of ( 2 2 2 2 ) -type resources for which CHSH k (R) ≥ 2.
Proof. We already noted in Section 6 that M CHSH (R) = 2 for all resources R that are free, so it suffices to consider the case of nonfree resources. As noted above, the fact that there is precisely one value of k such that CHSH k (R) > 2 for a nonfree resource R follows from the results in Ref. [67]. Thus, we must show that M CHSH (R) = CHSH k (R) for this value of k.
To prove this, we invoke Theorem 2.2 of Ref. [67], which informs us that every resource $R$ which violates the $k$th CHSH inequality admits a convex decomposition in terms of the $k$th variant of the PR box and some free resource that saturates the $k$th CHSH inequality,

$$R = \lambda R_{{\rm PR},k} + (1-\lambda) L^{b}_{k}.$$

Further, $\lambda$ is specified uniquely by the linearity of the CHSH functions and the fact that $\mathrm{CHSH}_k(R_{{\rm PR},k}) = 4$ and $\mathrm{CHSH}_k(L^{b}_{k}) = 2$, which together imply that $\mathrm{CHSH}_k(R) = \mathrm{CHSH}_k\big(\lambda R_{{\rm PR},k} + (1-\lambda) L^{b}_{k}\big) = 4\lambda + 2(1-\lambda)$. Again leveraging this unique decomposition together with linearity of the $\mathrm{CHSH}_k$ function and the linearity of LOSR transformations, it follows that for any LOSR operation $\tau$, we have

$$\mathrm{CHSH}_k(\tau \circ R) = \lambda\, \mathrm{CHSH}_k(\tau \circ R_{{\rm PR},k}) + (1-\lambda)\, \mathrm{CHSH}_k(\tau \circ L^{b}_{k}).$$

Clearly $\mathrm{CHSH}_k(\tau \circ R_{{\rm PR},k}) \le 4$, since four is the algebraic maximum of the $\mathrm{CHSH}_k$ function, and $\mathrm{CHSH}_k(\tau \circ L^{b}_{k}) \le 2$, since every LOSR operation takes a free resource $L^{b}_{k}$ to a free resource $L_k$, for which $\mathrm{CHSH}_k(L_k) \le 2$. Hence $\mathrm{CHSH}_k(\tau \circ R) \le 4\lambda + 2(1-\lambda) = \mathrm{CHSH}_k(R)$. For $R$ such that $\mathrm{CHSH}_k(R) > 2$, then, it follows that free operations on $R$ cannot increase its $\mathrm{CHSH}_k$ value, and hence the maximum in Eq. (33) is achieved by $R$ itself. This proves Eq. (39).

$^{28}$ As an aside, note that, unlike the cost with respect to the chain $\mathbf{C}_{\rm NPR}$, Eq. (36), the cost with respect to the set $\mathbf{S}^{G}_{(2\,2\,2\,2)}$ of all resources of type ( 2 2 2 2 ), as measured by the CHSH function, is utterly uninformative with regards to distinguishing the elements of $\mathbf{S}^{G}_{(2\,2\,2\,2)}$. This is because the resource $R_{{\rm PR},4}$ can be converted to any other ( 2 2 2 2 )-type resource, and yet $\mathrm{CHSH}(R_{{\rm PR},4}) = -4$, the algebraic minimum of the canonical CHSH function. Consequently, the value of this CHSH-cost with respect to the set of all resources of type ( 2 2 2 2 ) is $-4$. Since this monotone is constant on all resources in the scenario, it is completely uninformative.
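The closed-form expression can be sanity-checked by brute force on a few boxes. The sketch below assumes, as is not stated explicitly in this section, that the extremal LOSR operations between ( 2 2 2 2 ) -type boxes are products of deterministic one-wing maps; since the CHSH function is linear, its maximum over all LOSR images is then attained at one of these finitely many operations. The names `m_chsh_brute_force` and `m_chsh_closed_form` are hypothetical, and the closed form is written as the largest of 2 and the eight CHSH values.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
KEYS = list(product(BITS, repeat=4))  # (x, y, s, t)

PR = {(x, y, s, t): Fraction(int((x ^ y) == (s & t)), 2) for x, y, s, t in KEYS}
ANTI_PR = {(x, y, s, t): Fraction(int((x ^ y) != (s & t)), 2) for x, y, s, t in KEYS}
UNIFORM = dict.fromkeys(KEYS, Fraction(1, 4))

def mix(weight, box_a, box_b):
    """Convex mixture weight * box_a + (1 - weight) * box_b."""
    return {k: weight * box_a[k] + (1 - weight) * box_b[k] for k in KEYS}

# 64 deterministic one-wing maps (f, g): new setting s2 -> old setting f[s2],
# old outcome x -> new outcome g[s2][x]. Assumed to generate the extremal operations.
LOCAL_MAPS = [(f, (g0, g1))
              for f in product(BITS, repeat=2)
              for g0 in product(BITS, repeat=2)
              for g1 in product(BITS, repeat=2)]

def m_chsh_brute_force(box):
    """Largest canonical CHSH value over all images under the 64 x 64 operations."""
    return max(
        sum((-1) ** (ga[s2][x] ^ gb[t2][y] ^ (s2 & t2)) * box[(x, y, fa[s2], fb[t2])]
            for x, y, s2, t2 in KEYS)
        for (fa, ga), (fb, gb) in product(LOCAL_MAPS, repeat=2))

def m_chsh_closed_form(box):
    """max(2, CHSH_0, ..., CHSH_7), evaluated on the box itself."""
    corr = [sum((-1) ** (x ^ y) * box[(x, y, s, t)] for x, y in product(BITS, repeat=2))
            for s, t in product(BITS, repeat=2)]
    total = sum(corr)
    # Each variant negates exactly one correlator, up to an overall sign.
    variants = [sign * (total - 2 * c) for c in corr for sign in (1, -1)]
    return max([Fraction(2)] + variants)

L_B_NPR = mix(Fraction(1, 2), PR, UNIFORM)   # saturates the canonical inequality
C_HALF = mix(Fraction(1, 2), PR, L_B_NPR)    # midpoint of the noisy-PR chain

checks = [(m_chsh_brute_force(b), m_chsh_closed_form(b)) for b in (C_HALF, ANTI_PR, UNIFORM)]
```

For these three boxes the two computations agree, giving 3, 4, and 2 respectively. The second case illustrates why the yield, rather than the bare canonical CHSH value (which is −4 there), is the monotone.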
Using the closed-form expression for M CHSH , we can additionally provide closed-form expressions for the weight and robustness monotones introduced in Section 4.3.2 for ( 2 2 2 2 ) -type resources:

Corollary 13. For resources of type ( 2 2 2 2 ) , the nonlocal fraction and the robustnesses to mixing are related to M CHSH as follows:

Proof. The relationship of these distance measures to the extent by which the CHSH inequality is violated was derived in Appendix E of Ref. [69]. We simply recast those results in terms of M CHSH (R) instead of CHSH(R) by means of Proposition 12.
The values of the four monotones M CHSH (R), M NF (R), M RBST,L (R), and M RBST (R) are therefore all expressible as strictly increasing functions of one another when applied to resources of type ( 2 2 2 2 ) . That is, if any one of these monotones increases (respectively decreases) between a given pair of resources of type ( 2 2 2 2 ) , then all of these monotones will similarly increase (respectively decrease) between that pair of resources. As we will focus on the ( 2 2 2 2 ) type below, and the three distance-function monotones are no more informative than M CHSH in this case, we will not discuss them further.
We now turn to providing a closed-form expression for $M_{\rm NPR}$ for resources of type ( 2 2 2 2 ). We first recall some more details of the geometry of $\mathbf{S}^{G}_{(2\,2\,2\,2)}$. Recall that we use the superscript $b$ to denote that a resource lies on the particular boundary of the free set that is defined by the CHSH inequality (and thus that it saturates this inequality). We further use the superscript $bb$ to denote that a resource both saturates the CHSH inequality and additionally lies on the boundary of the full polytope of resources, $\mathbf{S}^{G}_{(2\,2\,2\,2)}$. The set $\mathbf{L}^{b}_{k}$ of $\mathrm{CHSH}_k$-inequality-saturating resources is 7-dimensional, and the set $\mathbf{L}^{bb}_{k}$ of $\mathrm{CHSH}_k$-inequality-saturating resources on the boundary of the full polytope $\mathbf{S}^{G}_{(2\,2\,2\,2)}$ is 6-dimensional.$^{29}$ It follows that $\mathbf{L}^{bb}_{k} \subseteq \mathbf{L}^{b}_{k}$.

Consider now a nonfree resource $R$, which lies in the region defined by $\mathrm{CHSH}_k(R) > 2$ for exactly one value of $k$. Within this region, if $R \in \mathbf{C}_{{\rm NPR},k}$, then we have simply $M_{\rm NPR}(R) = \mathrm{CHSH}_k(R)$. If, on the other hand, $R \notin \mathbf{C}_{{\rm NPR},k}$, we have the following.

Proposition 14. For any nonfree resource $R$ of type ( 2 2 2 2 ),

$$M_{\rm NPR}(R) = 2\alpha + 2,$$

where $\alpha$ is the value appearing in the decomposition $R = \gamma L^{bb}_{R} + (1-\gamma) C_k(\alpha)$, where $C_k(\alpha) \in \mathbf{C}_{{\rm NPR},k}$, $L^{bb}_{R} \in \mathbf{L}^{bb}_{k}$ and $\gamma \in [0, 1]$. This value of $\alpha$ is unambiguous (and computable from simple geometry) because there exists a unique resource $L^{bb}_{R} \in \mathbf{L}^{bb}_{k}$ and a unique choice of $\gamma \in [0, 1]$ and of $\alpha \in [0, 1]$ such that $R = \gamma L^{bb}_{R} + (1-\gamma) C_k(\alpha)$. The (unique) relevant decomposition is shown in Fig. 5 (for the case where $k = 0$). The proof of this proposition is given in Appendix B.1.

Properties of the pre-order of common-cause boxes
We now leverage the two monotones just introduced to prove multiple interesting features of the pre-order of common-cause boxes.

Inferring global properties of the pre-order
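Important properties of the pre-order over all resources can already be learned by considering just these two monotones ($M_{\rm CHSH}$ and $M_{\rm NPR}$) and just resources of type ( 2 2 2 2 ), indeed, just a specific kind of two-parameter family of resources within this set. The kind of two-parameter family that we consider, denoted $\mathbf{S}_{L^{bb}}$, is

$$\mathbf{S}_{L^{bb}} := \{\, R(\alpha,\gamma) : \alpha \in [0,1],\ \gamma \in [0,1] \,\}, \quad\text{where}\quad R(\alpha,\gamma) := \gamma L^{bb} + (1-\gamma)\, C(\alpha),$$

with $C(\alpha) \in \mathbf{C}_{\rm NPR}$. There are many such families, one for each choice of a resource $L^{bb} \in \mathbf{L}^{bb}$. Each such family $\mathbf{S}_{L^{bb}}$ is the convex hull of the chain $\mathbf{C}_{\rm NPR}$ and the associated point $L^{bb}$, i.e.,

$$\mathbf{S}_{L^{bb}} = \mathrm{ConvexHull}\big(\mathbf{C}_{\rm NPR} \cup \{L^{bb}\}\big).$$

Evaluating $M_{\rm NPR}$ for resources in this family is straightforward, thanks to Proposition 14. The proposition directly implies that for any $R(\alpha,\gamma) \in \mathbf{S}_{L^{bb}}$,

$$M_{\rm NPR}\big(R(\alpha,\gamma)\big) = 2\alpha + 2.$$

We now consider the value of $M_{\rm CHSH}$ for resources in this family. Noting that $\mathrm{CHSH}\big(R(\alpha,\gamma)\big) \ge 2$ for all $R(\alpha,\gamma) \in \mathbf{S}_{L^{bb}}$, Proposition 12 gives $M_{\rm CHSH}\big(R(\alpha,\gamma)\big) = \mathrm{CHSH}\big(R(\alpha,\gamma)\big)$. Recalling that the CHSH function is linear and that it satisfies $\mathrm{CHSH}(L^{b}) = 2$ for all $L^{b} \in \mathbf{L}^{b}$ and $\mathrm{CHSH}(R_{\rm PR}) = 4$, it follows that

$$M_{\rm CHSH}\big(R(\alpha,\gamma)\big) = 2\gamma + (1-\gamma)(2\alpha + 2) = 2\alpha(1-\gamma) + 2.$$

In Fig. 6(a), we plot some of the level curves$^{30}$ for $M_{\rm NPR}$ and $M_{\rm CHSH}$ over any such two-parameter family of resources. The level curve defined by $M_{\rm NPR}(R) = 2\alpha+2$ is a diagonal line in Fig. 6(a), extending from the (implicit) point $C(\alpha)$ to the point $L^{bb}$. The level curve defined by $M_{\rm CHSH}(R) = 2\alpha(1-\gamma) + 2$ is a horizontal line in Fig. 6(a), extending between the two implicit points $C(\alpha)$ and $\alpha R_{\rm PR} + (1-\alpha)L^{bb}$. From these level curves, we can immediately deduce a number of features of the pre-order of resources. In particular, we consider those features of the pre-order that were defined in Section 4.1.

$^{30}$ A level curve of a function $f$ is a set of points that yield the same value of $f$; e.g., $\{x \mid f(x)=c\}$.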
First, we see that the pre-order is locally infinite, simply by virtue of the fact that there exist chains which are represented by continuous sets of distinct resources, such as the chain C NPR . The interval between any two resources in such a continuous chain contains a continuous infinity of inequivalent resources.
Second, one can also see that the pre-order of resources is not totally pre-ordered. For instance, the two resources R 1 and R 2 in Fig. 6(a) are incomparable, as witnessed by the fact that R 1 has a larger value of M CHSH than R 2 does, but R 2 has a larger value of M NPR than R 1 does. More generally, the level curves for the two monotones allow one to immediately construct (by inspection) a continuous infinity of such incomparable pairs. Furthermore, the binary relation of incomparability is not transitive, so the partial order is not weak. This can be seen by the example of the three resources in Fig. 6(a): R 1 and R 2 are incomparable (as just argued) and R 3 and R 2 are incomparable (by the same logic), yet R 1 and R 3 are comparable, as evidenced by the fact that one can obtain R 3 from R 1 , by mixing R 1 with any free resource that intersects the line defined by the points R 1 and R 3 .
In addition, one can also see that the height of the pre-order is infinite. It suffices to note that the chain C NPR is totally ordered and contains a continuum of elements. The width of the pre-order is also infinite. Consider, for example, the line segment defined by the points R 1 and R 2 in Fig. 6(a). This subset of resources constitutes an antichain, as every resource in it is incomparable to every other: each resource has a higher M NPR value and lower M CHSH value than any of its neighbors towards the left, and has a lower M NPR value and higher M CHSH value than any of its neighbors towards the right. Because this subset also forms a continuum, it follows that the width of the pre-order is infinite.
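The reasoning from level curves can be mimicked numerically using the closed-form values $M_{\rm CHSH} = 2\alpha(1-\gamma)+2$ and $M_{\rm NPR} = 2\alpha+2$ on the two-parameter family. The sketch below is illustrative only: the function names are hypothetical, and the three sample points are chosen here for convenience and are not the points labelled R 1 , R 2 , R 3 in Fig. 6(a). A verdict of "incomparable" is conclusive, since each monotone forbids one direction of conversion; a verdict of "first above second" only says that the two monotones do not forbid the conversion.

```python
from fractions import Fraction

def monotones(alpha, gamma):
    """(M_CHSH, M_NPR) for R(alpha, gamma); assumes 0 < alpha <= 1 and
    0 <= gamma < 1, i.e. a nonfree member of the two-parameter family."""
    return (2 * alpha * (1 - gamma) + 2, 2 * alpha + 2)

def relation(r1, r2):
    """Ordering relation between r1 and r2 as far as the two monotones can tell."""
    m1, m2 = monotones(*r1), monotones(*r2)
    first_dominates = all(a >= b for a, b in zip(m1, m2))
    second_dominates = all(b >= a for a, b in zip(m1, m2))
    if first_dominates and second_dominates:
        return "not distinguished"
    if first_dominates:
        return "first above second"
    if second_dominates:
        return "second above first"
    return "incomparable"

# Hypothetical sample points (alpha, gamma).
R1 = (Fraction(1, 2), Fraction(0))     # on the chain: monotones (3, 3)
R2 = (Fraction(1), Fraction(3, 4))     # monotones (5/2, 4)
R3 = (Fraction(2, 5), Fraction(0))     # on the chain: monotones (14/5, 14/5)

verdicts = (relation(R1, R2), relation(R3, R2), relation(R1, R3))
```

Here the first and second points are incomparable, as are the third and second, while the first and third are ordered (both lie on the chain, and the third is obtained from the first by mixing with $L^{b}_{\rm NPR}$). This reproduces the failure of transitivity of incomparability described above.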
Also by inspection, for a given nonfree resource, there are a continuum of chains and antichains which contain it. In order to see this, let us first introduce some terminology. Within the plane of the two-parameter family of resources, depicted in Fig. 7(a), we refer to a direction from a given point R as an "antichain direction" relative to that point, if this direction lies strictly clockwise from the direction defined by the M CHSH level curve that passes through R and strictly counterclockwise from the direction defined by the M NPR level curve that passes through R. Otherwise, it is called a "chain direction". Thus an antichain direction relative to R is defined by any vector originating in R and terminating at a point strictly within either yellow region in Fig. 7(a), while a chain direction relative to R is defined by any vector originating in R and terminating in either blue region.

Figure 7: … as were introduced in Fig. 6. We consider a particular resource R. In (a), we depict the level curves of M CHSH (horizontal) and M NPR (angled) which include R. By monotonicity of the two monotones, R cannot be freely converted into any resource in the upper light-blue region or in the pair of yellow regions. As we prove in Section 7.2.1, the two monotones are complete for this subset, which is equivalent to the fact that an arbitrary resource R can be freely converted to any resource in the lower dark-blue region; namely, the entire region wherein M CHSH and M NPR do not have a value greater than the one they have on R. Resources in the upper light-blue region can be converted to R, while resources in the pair of yellow regions are incomparable to R.
A one-dimensional curve of resources in this subset defines a chain (antichain) if and only if at every point on the curve, the tangent to the curve at that point is aimed 31 in a chain direction (antichain direction) relative to that point.
A final lesson we learn from these two monotones is that the set of all monotones induced (via Eq. (33)) by the facet-defining Bell inequalities for a given type does not yield a complete set of monotones for the resources of that type. We have shown that the set of resources is not totally pre-ordered, and as stated in Section 4.3.1, the eight facet-defining Bell inequalities for the ( 2 2 2 2 ) -scenario induce only a single monotone: M CHSH . Since no single monotone can be complete for a pre-order of resources that includes incomparable resources, it follows immediately that the monotones induced by the facet-defining Bell inequalities for the ( 2 2 2 2 ) type are not sufficient for fully characterizing the pre-order of resources of that type. Since such resources trivially can be lifted to any nontrivial Bell scenario (where the lifted resource will violate no facet-defining Bell inequalities other than CHSH), it follows that:

Proposition 15. The pre-ordering of resources relative to LOSR operations cannot be resolved solely using the degree of violations of facet-defining Bell inequalities.

Proof. By definition, any complete set of monotones allows one to compute the values of any other monotone from them.$^{32}$ However, although the …

Proposition 15 shows that the nonclassicality of common-cause processes is not completely characterized by the monotones that are naturally associated to facet-defining Bell functionals, despite the fact that such Bell functionals are sufficient to witness whether or not a resource is nonclassical.

$^{31}$ More precisely: a line defines two opposing directions, and both of these directions will point in a chain direction, or both will point in an antichain direction.

Incompleteness of the two monotones
In this section, we prove that the two-element set of monotones {M CHSH , M NPR } is not a complete set. We do so by showing that it is not complete even for resources of type ( 2 2 2 2 ) . A simple proof is as follows. Consider resources of the form R = ½ L bb + ½ C(½) for different choices of the CHSH-saturating resource L bb that lies in the boundary of S G ( 2 2 2 2 ) . We will show that there are pairs of resources of this form which are strictly ordered, and other pairs of resources of this form which are incomparable. These facts cannot be captured by the two monotones, which see all resources of this form as equivalent, with M NPR = 3 and M CHSH = 2.5.
Consider for example the resources L bb 1 , L bb 2 , and L bb 3 defined in Table 3. Using the pairwise comparison algorithm described in Section 5, one can verify that the resource ½ L bb 1 + ½ C(½) is strictly higher in the order than ½ L bb 2 + ½ C(½), while the two resources ½ L bb 2 + ½ C(½) and ½ L bb 3 + ½ C(½) are incomparable. Note that L bb 1 is a convexly extremal resource, while L bb 2 and L bb 3 are not. As an aside, it is worth noting that because the nonlocal fraction and the two standard robustness measures witness exactly the same ordering relations as M CHSH does (as demonstrated in Sec- the argument presented in Section 7.3.

Completeness of the two monotones for certain families of resources
Although M CHSH and M NPR do not form a complete set of monotones for the set of all resources of type ( 2 2 2 2 ) , it turns out that they do form a complete set of monotones for certain subsets thereof. for any L bb ∈ L bb .
Proposition 16 is proven in Appendix B.2. The logic of the proof is quite simple: we prove that there always exists a free operation τ erase−γ which converts an arbitrary resource R(α 1 , γ 1 ) in the family to some resource R(α 2 , 0) lying on the chain C NPR without changing the value of M CHSH . By convexity, it follows that R(α 1 , γ 1 ) can be converted to any resource in the convex hull of R(α 1 , γ 1 ), R(α 2 , 0), L bb , and L b NPR ; namely, the dark-blue region in Fig. 7. This region corresponds to the set of all resources with a lower value of both M CHSH and M NPR . It follows that if a conversion is not forbidden by consideration of this pair of monotones, then it is achievable. By the definition of completeness for a set of monotones (see Eq. (21)), this implies that the two monotones are indeed a complete set for this family of resources.

At least eight independent measures of nonclassicality
In this section, we tackle the question of how many independent continuous monotones are required to fully specify the partial order of resources. This is the content of Theorem 21. Along the way to proving this result, we also prove a powerful result about the equivalence classes under LOSR for nonfree resources of type ( 2 2 2 2 ) , stated in Proposition 18. We begin by drawing a distinction among resources.

Definition 17. A resource is said to be orbital if its equivalence class under type-preserving LOSR is equal to its equivalence class under LSO.
It follows that if all the resources in a set S are orbital, then the quotient space [75] of S under the group LSO provides a representation of the partial order of LOSR-equivalence classes of resources in S (despite the fact that the LOSR operations do not themselves form a group). 33 This property of resources is pertinent to the discussion here because of the following result:
The proof is provided in Appendix B.3. Note that for free resources, LOSR-equivalence is distinct from LSO-equivalence because the LSO-equivalence class of any resource (including a free resource) is of finite cardinality, while the LOSR-equivalence class of a free resource is the entire set of free resources, which is of infinite cardinality. Thus, free resources are not orbital. Moreover, the coincidence between being nonfree and being orbital does not generalize beyond the ( 2 2 2 2 ) scenario. For instance, note that a pair of ( 2 2 2 2 ) resources, R 1 and R 2 , which are implemented in parallel can be conceptualized as a ( 4 4 4 4 ) resource, R 1⊗2 , by composing the two binary setting variables on the left wing into a single 4-valued setting variable on the left wing, and similarly for the other setting variable and the outcome variables. If R 1 is free and R 2 is nonfree, then R 1⊗2 is nonfree, and yet because R 1 's equivalence class is not generated by LSO, neither is the equivalence class of R 1⊗2 . Thus, R 1⊗2 is a nonfree resource that is not orbital.
33 For practical purposes, Ref. [59,App. B] provides a technical discussion regarding how to efficiently select a representative Bell inequality under a finite symmetry group; the procedure discussed there is equally applicable for the task of efficiently selecting canonical form resources. Note, however, that the LSO symmetry group differs from the Bell-polytope automorphism group considered in Ref. [59], in that LSO does not include the symmetry of exchange-of-parties.
To express the next proposition, we require the following definition.

Proposition 20. For any compact set S of resources that are all orbital, the intrinsic dimension of the set S is a lower bound on the cardinality of a complete set of continuous monotones for S (and for any superset of S).
The proof is provided in Appendix B.4. Recognizing that the set of nonfree resources of type ( 2 2 2 2 ) has intrinsic dimension equal to eight, 34 Propositions 18 and 20 together imply the following theorem:
Theorem 21. For resources of type ( 2 2 2 2 ) , the cardinality of a complete set of continuous monotones is no less than 8.
34 That IntrinsicDim(S nonfree ( 2 2 2 2 ) ) = 8 is evidenced by the characterization of such resources in terms of outcome biases and two-point correlators. If T indicates any type, then IntrinsicDim(S nonfree (think of subtracting one polytope from a circumscribing polytope of the same dimension). See Refs. [59,61,73,76] for discussions on the intrinsic dimension of no-signalling polytopes.
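The eight-parameter characterization mentioned in footnote 34 can be made concrete with a short sketch. It assumes the standard parametrization of a no-signalling box with binary settings and outcomes by two outcome biases per wing and four two-point correlators, with outcomes 0 and 1 identified with +1 and -1; the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

def box_from_correlators(a, b, e):
    """Build P(x, y | s, t) from biases a[s] = <A_s>, b[t] = <B_t> and
    correlators e[s][t] = <A_s B_t> (assumed standard parametrization)."""
    return {
        (x, y, s, t): 0.25 * (1 + (-1) ** x * a[s] + (-1) ** y * b[t]
                              + (-1) ** (x + y) * e[s][t])
        for x, y, s, t in product((0, 1), repeat=4)
    }

def correlators_from_box(box):
    """Recover the eight parameters from the sixteen probabilities."""
    xy = list(product((0, 1), repeat=2))
    # By no-signalling, each marginal can be read off at any fixed remote setting (here 0).
    a = [sum((-1) ** x * box[(x, y, s, 0)] for x, y in xy) for s in (0, 1)]
    b = [sum((-1) ** y * box[(x, y, 0, t)] for x, y in xy) for t in (0, 1)]
    e = [[sum((-1) ** (x + y) * box[(x, y, s, t)] for x, y in xy) for t in (0, 1)]
         for s in (0, 1)]
    return a, b, e

# Example: unbiased marginals with correlators (+1, +1, +1, -1) give a PR-type box.
pr = box_from_correlators([0, 0], [0, 0], [[1, 1], [1, -1]])
```

The sixteen probabilities are thus fixed by eight real numbers, which matches the intrinsic dimension quoted above.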

Properties of the pre-order of quantumly realizable common-cause boxes
The bulk of this article has considered the resource theory which is defined by taking the enveloping theory of resources to be the GPT-realizable common-cause boxes, and the free subtheory of resources to be the classically realizable common-cause boxes. In this section, we consider a slightly different resource theory, wherein the enveloping theory of resources is taken to be the common-cause boxes that are realizable in a quantum causal model, which we term quantumly realizable, while the free subtheory is chosen to be, as before, the common-cause boxes that are classically realizable. Effectively, the new resource theory concerns the nonclassicality of common-cause boxes within the scope of nonclassicality that can be achieved quantumly. In other words, it concerns the intrinsic quantumness of common-cause boxes.
Formally, the conditional probability distribution associated to a quantumly realizable common-cause box is of the same form as Eq. (1), that is, but where the vector s AB is a real vector representation of a quantum state on the bipartite system composed of quantum systems A and B, and the sets of vectors {r A x|s } x and {r B y|t } y are real vector representations of POVMs on A and on B respectively. (See, e.g., Ref. [42].) Although the conclusions we drew in Section 7.1 concerned the pre-order of GPT-realizable common-cause boxes, analogous results hold true for the pre-order of quantumly realizable common-cause boxes. This is because the kind of two-parameter family of GPT-realizable common-cause boxes that was used to establish global features of the pre-order of such boxes in Section 7.1 contains a two-parameter family of quantumly realizable common-cause boxes that can be used for the same purpose. A caricature of one such quantumly realizable family is provided in Fig. 8. Specifically, if one reviews the arguments that were used in Section 7.1 to establish the various global properties of the pre-order of GPT-realizable common-cause boxes, it becomes apparent that these apply equally well to the quantumly realizable common-cause boxes.
It is also straightforward to show that the lower bound on the cardinality of a complete set of monotones, obtained in Section 7.3, also applies to the resource theory of quantumly realizable common-cause boxes. It suffices to consider the case of the quantumly realizable resources of type ( 2 2 2 2 ) , hereafter S Q ( 2 2 2 2 ) , and to note that the set of nonfree resources therein, that is, the set S nonfree , still has intrinsic dimension equal to eight.
In the rest of this section, we consider properties of the pre-order of quantumly realizable common-cause boxes that are particular to the quantum case.
Unlike for the set S G ( 2 2 2 2 ) , where the partial order of equivalence classes has a unique element at the top of the order (the equivalence class of R PR ), for the set S Q ( 2 2 2 2 ) there is no unique element at the top of the order. An easy way to see this is by considering the example of the Tsirelson box (R Tsirelson ) and the Hardy box (R Hardy ), each of which is defined explicitly in We show these two resources in Fig. 8(a), together with an approximate sketch 35 of the extremal quantumly realizable resources which interpolate between them (the light-blue curve). The values of M CHSH and M NPR on all of these resources are plotted in Fig. 8(b). From the figure, one can immediately infer that R Tsirelson and R Hardy are incomparable.
Recall that no quantumly realizable resource can achieve the algebraic maximum of M CHSH , while some GPT-realizable resources (such as R PR ) can achieve the maximum. In contrast to M CHSH , M NPR is such that some quantumly realizable resources (such as R Hardy ) maximize it. Furthermore, whereas R PR maximizes both M CHSH and M NPR , no single quantumly realizable resource maximizes both of those monotones. Therefore, a unique feature of the enveloping theory of quantumly realizable common-cause boxes is that inequivalent resources can simultaneously be maximally nonclassical (according to distinct monotones), even among ( 2 2 2 2 )-type resources.
35 An analytic characterization of the set of all extremal quantumly realizable resources within S Q ( 2 2 2 2 ) is not known. In Fig. 8(a), the endpoints and the slope of the curve at the endpoints are exact, and the rest of the curve is merely an interpolation.
The interpolated curve in Figs. 8(a) and 8(b) furthermore suggests that perhaps all extremal quantum-realizable resources depicted therein are relatively incomparable. The following lemma gives a powerful result regarding maximally nonclassical resources: of quantumly realizable resources of type ( 2 2 2 2 ) , then R is at the top of the preorder among quantumly realizable resources of type ( 2 2 2 2 ) .
Proof. Let R ∈ S Q ( 2 2 2 2 ) be nonfree and extremal in S Q ( 2 2 2 2 ) . Then, to prove the proposition, we need only prove that any quantumly realizable R' ∈ S Q ( 2 2 2 2 ) that can be freely converted to R cannot be higher in the order than R (rather, it must be equivalent). Assume the existence of some quantumly realizable R' such that R' −→ R. Since R is extremal in the image of R' under LOSR, 36 it must be that R' is converted to R through extremal operations: that is, through LDO. But as follows from Lemma 30 in Appendix B.3, or as can be explicitly checked, 37 the image of any ( 2 2 2 2 )-scenario resource is free under any deterministic operation which is not a symmetry. Put another way, there is no preimage of any nonfree ( 2 2 2 2 )-scenario resource among ( 2 2 2 2 )-scenario resources under deterministic nonsymmetry operations. This means that the only τ ∈ LDO ( 2 2 2 2 ) → ( 2 2 2 2 ) such that conceivably τ • R' = R are symmetry operations. As such, if R is a nonfree extremal quantumly realizable resource of type ( 2 2 2 2 ) , the only quantumly realizable resources (of the same type) which can be converted to R are symmetries of R. Since resources related by a symmetry operation are in the same equivalence class, there are no ( 2 2 2 2 )-type quantumly realizable resources strictly above R in the partial order.
are at the top of the pre-order of quantumly realizable ( 2 2 2 2 ) resources.
The fact that one can find a continuous set of such resources follows from the well-known fact that S Q ( 2 2 2 2 ) is not a polytope. By furthermore choosing such a set of extremal resources for which M CHSH takes a distinct value for every resource in the set, one additionally guarantees that no two of these top-of-the-order resources are in the same equivalence class, and hence each must be incomparable to every other in the set. Refs. [69,73,79,80] provide some explicit sets of resources satisfying these criteria.
Figure 8: ... as were introduced in Fig. 6. Here, we provide a caricature of some ordering relations among quantumly realizable common-cause boxes within this 2-parameter family. We depict the Tsirelson and Hardy boxes (with scaled-up values of the monotones, but accurate ordering of these values), together with a guess of what the boundary of the set of quantumly realizable resources within this 2-parameter family might be (dotted blue curves). In (b), we also depict the values of the two monotones for the set of convexly extremal, quantumly realizable resources which are self-tested by the tilted Bell inequalities (smooth black curve).
36 This is justified as follows: from R' −→ R it follows that R ∈ P LOSR [R] (R'), and from the fact that quantumly realizable boxes remain quantumly realizable under LOSR, it follows that P LOSR [R] (R') ⊆ S Q ( 2 2 2 2 ) . Finally, R is by assumption extremal in S Q ( 2 2 2 2 ) ; hence, it is extremal in P LOSR [R] (R') as well.
37 One can explicitly check that all extremal ( 2 2 2 2 )-type resources are mapped to the free set by any deterministic operation which is not a symmetry, which implies by convexity that all ( 2 2 2 2 )-type resources are also mapped to the free set by these operations.
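The explicit check invoked in the proof (that deterministic operations which are not symmetries map extremal resources to the free set) can be sketched by brute force for the PR box. The following is a minimal illustration rather than the authors' procedure. It assumes that a local deterministic operation on one wing consists of a function from the new setting to the old setting plus a function from (new setting, old outcome) to the new outcome, that a symmetry is an operation for which both functions are bijective, and that freeness in this scenario can be tested by checking all eight CHSH inequalities. All function names are hypothetical.

```python
from itertools import product

BITS = (0, 1)

def pr_box():
    """PR box: P(x, y | s, t) = 1/2 if x XOR y == s AND t, else 0."""
    return {(x, y, s, t): (0.5 if (x ^ y) == (s & t) else 0.0)
            for x, y, s, t in product(BITS, repeat=4)}

def local_deterministic_ops():
    """Single-wing deterministic operations (f, g): f[s2] is the old setting used
    for new setting s2; g[2*s2 + x] is the new outcome given s2 and old outcome x."""
    return [(f, g) for f in product(BITS, repeat=2) for g in product(BITS, repeat=4)]

def is_symmetry(op):
    """True if the setting map and each per-setting outcome map are bijections."""
    f, g = op
    return f[0] != f[1] and g[0] != g[1] and g[2] != g[3]

def apply_ops(box, op_a, op_b):
    """Apply independent deterministic operations to the two wings of a box."""
    (fa, ga), (fb, gb) = op_a, op_b
    out = {key: 0.0 for key in product(BITS, repeat=4)}
    for s2, t2, x, y in product(BITS, repeat=4):
        out[(ga[2 * s2 + x], gb[2 * t2 + y], s2, t2)] += box[(x, y, fa[s2], fb[t2])]
    return out

def max_chsh(box):
    """Largest value over the eight CHSH expressions (classical bound: 2)."""
    e = {(s, t): sum((-1) ** (x ^ y) * box[(x, y, s, t)]
                     for x, y in product(BITS, repeat=2))
         for s, t in product(BITS, repeat=2)}
    total = sum(e.values())
    return max(abs(total - 2 * e[st]) for st in e)

ops = list(local_deterministic_ops())
box = pr_box()
# Pairs of local deterministic operations whose image of the PR box is nonfree.
nonfree_pairs = [(a, b) for a in ops for b in ops
                 if max_chsh(apply_ops(box, a, b)) > 2 + 1e-9]
```

Under these assumptions, every pair in `nonfree_pairs` consists of two symmetries, in line with the statement of footnote 37 for this particular extremal resource.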
As one concrete example, consider the one-parameter family of quantumly realizable resources which are self-tested by the tilted Bell inequalities. We denote this family by {R Tilt (θ) : θ ∈ (0, π/2]}. The definition of R Tilt (θ) is given in Table 4. These resources are related to a corresponding family of tilted Bell functionals [77,78,81,82], parametrized by β ∈ [0, 2], namely, TiltedCHSH β (R ) = 2 + β, Note that the only value of β for which the maximum value of this function over the quantumly realizable set S Q ( 2 2 2 2 ) coincides with the maximum value over the free set S free ( 2 2 2 2 ) is β = 2. Whenever β < 2, the resource R Tilt (θ) for θ defined implicitly by the equa- is the unique maximizer over of the corresponding tilted Bell functional. Formally, for any R ∈ S Q ( 2 2 2 2 ) distinct from R Tilt (θ). It follows that every resource R Tilt (θ) is convexly extremal in the set of quantumly realizable resources, and its extremality is exposed by the corresponding tilted Bell functional. In fact, every resource in this family is incomparable to every other in the family, as can be shown directly by considering the values of M CHSH and M NPR . In Fig. 8(b), we show a plot of the values of the two monotones evaluated on this family. The points form a continuous antichain, shown in black. Note that the family of resources {R Tilt (θ) : θ ∈ (0, π/2]} does not lie in any plane in the linear space of resources, and as such we do not attempt to plot the family directly (rather we only plot its valuations with respect to the two monotones).
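The statement that the quantum and free maxima of the tilted functional coincide only at β = 2 can be illustrated numerically. The sketch below is not taken from this article: it assumes the form of the tilted CHSH functional commonly used in the cited literature, β⟨A_0⟩ + ⟨A_0 B_0⟩ + ⟨A_0 B_1⟩ + ⟨A_1 B_0⟩ − ⟨A_1 B_1⟩, and the known quantum maximum sqrt(8 + 2β²) for that form. The free maximum is computed by enumerating deterministic local strategies with outcomes ±1.

```python
from itertools import product
from math import sqrt

def tilted_chsh_deterministic(beta, a0, a1, b0, b1):
    """Value of the (assumed) tilted CHSH expression on a deterministic strategy
    with outcomes a0, a1, b0, b1 in {-1, +1}."""
    return beta * a0 + a0 * b0 + a0 * b1 + a1 * b0 - a1 * b1

def free_max(beta):
    """Maximum over the free set, attained at a deterministic local strategy."""
    return max(tilted_chsh_deterministic(beta, *v)
               for v in product((-1, 1), repeat=4))

def quantum_max(beta):
    """Quantum maximum for this form of the functional (assumed from the literature)."""
    return sqrt(8 + 2 * beta ** 2)

# Gap between the quantum and free maxima for a few values of beta in [0, 2].
gaps = {beta: quantum_max(beta) - free_max(beta) for beta in (0.0, 0.5, 1.0, 1.5, 2.0)}
```

Under these assumptions the free maximum equals 2 + β, and the gap equals sqrt(8 + 2β²) − (2 + β), which vanishes exactly when (β − 2)² = 0.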

Conclusions and outlook
We have conceptualized Bell experiments as common-cause 'box-type' processes: bipartite or multipartite processes with classical variables as inputs and outputs, the internal causal structure of which is a common-cause acting on all of the wings of the experiment. We have argued in favour of this conceptualization by appeal to the fact that Bell's theorem can be regarded as implying the need for nonclassicality in the causal model that underlies the process. We have begun to quantify the nonclassicality of such common-cause box-type processes by developing a resource theory thereof. We have argued in favour of a particular choice of the free operations for this resource theory, namely, those which can be achieved by embedding the resource into a circuit consisting of box-type processes realizable with a classical common cause, and we have shown that this set is equivalent to the set of local operations and shared randomness.
We have focused here on characterizing the preorder defined by single-copy deterministic conversion of resources under the free operations. We have provided a linear program that decides how any two resources are ordered. By leveraging a pair of functions that we have proven to be monotones, we have also established a number of properties of this preorder, such as the fact that it contains incomparable resources, that it has infinite width and height, that it is locally infinite, and that the incomparability relation is not transitive. Moreover, despite the fact that the values of the facet-defining Bell functionals are necessary and sufficient for witnessing the nonclassicality of a common-cause box, we have shown that they are not sufficient for quantifying the nonclassicality of a common-cause box. In other words, there are aspects of the nonclassicality of such boxes relevant to resource conversions that are not captured by the degree of violation of the facet-defining Bell inequalities. For the particular case of resources with two binary inputs and two binary outputs, we moreover showed that at least eight continuous monotones are required to fully specify the pre-order among resources. We have also derived some interesting facts about the pre-order of resources when one restricts attention to common-cause boxes that can be realized in quantum theory. In particular, we have shown that for quantumly realizable resources of type ( 2 2 2 2 ) , all convexly extremal resources are at the top of the pre-order of such resources, and that there are an infinite number of incomparable resources at the top of this pre-order.
There is much scope for advancing and generalizing our work, some examples of which we now describe.
One of the most fundamental problems that is yet to be solved is that of characterizing the equivalence classes of resources in the pre-order induced by single-copy deterministic conversion. That is, one would like a compressed representation of each resource that includes all and only information that is relevant to determining its equivalence class in this pre-order. Finding such a representation would be the analogue within our resource theory of proving that the equivalence classes of pure bipartite entangled states under LOCC [83] are given by the Schmidt coefficients of the state. All resource monotones could then be efficiently expressed in terms of this compressed representation, while all other parameters of a resource could be safely ignored.
Even among resources of type ( 2 2 2 2 ) (much less for resources of arbitrary type), we do not have a complete set of monotones for this pre-order. 38 Another interesting open problem is to connect the existing monotones to figures of merit for interesting operational tasks. E.g., does the value of the monotone M CHSH determine the extent to which a given resource can be used for key distribution or randomness generation [6][7][8][9][10][11][12][13][14]? Since the monotone M NPR is maximized for high-bias boxes from the R Tilt (θ) family (and by the Hardy box) as opposed to by the Tsirelson box, M NPR is likely a figure of merit for operational tasks where the advantage is provided by such correlations [81,84].
Note that in deriving our results about properties of this pre-order, we have not needed to consider any types of resource beyond ( 2 2 2 2 ) , that is, it has sufficed to consider Bell experiments of the CHSH type. It may be that more nuanced features of this pre-order only become apparent for more general types of resources.
An obvious generalization of our work is to consider the pre-order induced by different sorts of conversion relations, such as indeterministic single-copy conversion 39 , multi-copy conversion, asymptotic conversion, and conversion in the presence of a catalyst (see Refs. [23,85,86] for a discussion of these different notions, and Refs. [87][88][89][90][91] for relevant examples of such generalized conversions). Other generalizations require changes to the enveloping theory of resources one is considering. We have noted that our definition of the free operations can easily be extended to define a resource theory of nonclassicality for box-type processes in more general causal structures, distinct from that of a Bell experiment. For example, as discussed in Appendix A.3, it can be extended to a scenario we term the triangle-with-settings scenario [52, Fig. 8], of which the much-studied 'triangle scenario' [92][93][94][95] is a special case. Another example would be to extend our definition to the 'bilocality scenario' [52][96][97][98][99][100]. The analysis of such cases is complicated by the fact that our proposal implies that the set of free operations is not convex for them. Another such generalization would be to causal structures wherein there are cause-effect relations between different parts of the experiment, for instance, experiments involving sequences of nondestructive measurements on parts of a shared resource, such as the causal structure known as the 'instrumental scenario' [42][101][102][103][104][105].
A generalization of our resource theory in a different direction is to consider processes whose inputs and outputs are not classical (i.e., processes that are not 'box-type'), but rather describe quantum or post-quantum systems. For the case of the common-cause structure which we focused on here, a quantum resource theory of this sort would subsume entanglement theory, but where quantum correlation is defined relative to the set of local operations and shared randomness (LOSR) rather than local operations and classical communication (LOCC).
38 Although considerations of the examples given in Section 7.2 might provide the intuition necessary to find such a complete set for resources of type ( 2 2 2 2 ).
39 Indeterministic single-copy conversion is single-copy conversion that makes use of a post-selection. Therefore, to contemplate this notion of conversion for our resource theory is to contemplate expanding the set of free operations from LOSR to LOSR with post-selection. However, LOSR with post-selection can map a correlation P XY |ST that satisfies the Bell inequalities to one that violates them, and even to one that violates the no-signalling condition. (This is in contrast to the situation with LOCC, where allowing post-selection does not change the set of states that one can prepare for free.) Consequently, what sort of correlation is consistent with a classical common cause (and hence what should be deemed free in a resource theory of nonclassicality of common-cause boxes) becomes contingent on what sort of post-selection was implemented. For example, in a Bell experiment wherein detectors are not perfectly efficient, post-selecting on detection can induce Bell inequality violations even in the absence of a nonclassical common cause. However, for a given value of the detection efficiency, this might only be able to explain a particular degree of violation, while any higher violation would still attest to the presence of a nonclassical common cause. In such a context, the boundary between the correlations that are consistent with a classical common cause and those that are not would no longer coincide with the facets of the Bell polytope. Consequently, even defining the free set of resources becomes quite complicated when post-selection is allowed.

Acknowledgments

A Comparing our framework with prior work
Correlations that violate Bell inequalities have become an important object of study, not only for their relevance in foundational aspects of quantum theory, but also for their role as a resource in quantum information-processing tasks [6][7][8][9][10][11][12][13][14]. Hence, particular effort has been devoted to the formulation of a resource theory describing them [15,16,18,19]. Two sets of free operations have previously been proposed to define such a resource theory, namely LOSR [16][17][18][19], which we have developed in the main text, and wirings and prior-to-input classical communication (WPICC) [15].
In this section, we assess WPICC through the lens of our resource theory, and we identify an inconsistency among previous proposals for the definition of LOSR. The primary differences between our approach and previous approaches become most evident when one considers the question of how to develop such a resource theory for more general causal structures, as we discuss further on in Appendix A.3.

A.1 WPICC versus LOSR as the set of free operations
The set of WPICC operations allows for classical causal influences among the wings prior to when the parties receive their inputs. An example of a free operation in the WPICC approach is depicted in Fig. 9. If one seeks to understand the resource as nonclassicality of common-cause processes, as we do here, then it is clear that the free operations should not include any cause-effect influences between the wings, and therefore should not include any classical communication between the wings. In other words, in our approach, WPICC is not a viable choice for the set of free operations.
One might think that the choice to take WPICC or LOSR to be the set of free operations is not a particularly consequential one, since WPICC and LOSR define the same partial order for boxes [18] (see the discussion of this point in Sec. A.2.1). This equivalence breaks down, however, when one considers more general resources, e.g., bipartite quantum states. Since bipartite quantum states have no inputs, allowing classical communication prior to inputs means allowing arbitrary classical communication. Hence, WPICC coincides with LOCC in this case, and LOCC defines a partial order on bipartite quantum states that is distinct from the partial order defined by LOSR [65].

A.2 An oversight in the literature concerning how to formalize LOSR
As we noted in the Introduction and in Section 2, the intuitive notion that the set of free operations should constitute local operations supplemented by shared randomness is widely agreed upon in previous work [15][16][17][18][19][20][21][22]. Nonetheless, some prior work seems to have formalized this intuitive notion incorrectly. Specifically, the set of free operations defined in Ref. [18] (and repeated in Refs. [20,21]) does not coincide with the set of free operations defined in Refs. [16,17] and which we endorse here as the correct choice. Rather, it is a nonconvex subset thereof, as we will show here. (Note that Ref. [18] referred to their set of free operations as "LOSR" but we will here reserve that term for the set of operations described in Definition 2.) We suspect that the discrepancy in the definitions introduced in these papers was merely an oversight, and in particular, that none of the authors of these articles would advocate for this nonconvex subset over the full set. Nonetheless, we think that it is important to highlight this oversight, so that it may be avoided in future work.
It is easiest to see the difference between the definition of the free operations given in Ref. [18] and the one endorsed here (which coincides with the definitions of Refs. [16,17]) by considering the diagrammatic representation of a generic operation in each case. The most general free operation proposed by Ref. [18] is depicted in their Fig. 1(a), which we reproduce here as Fig. 10, which should be compared with Figs. 2 and 3 of our article. The difference is that in Fig. 10, the side-channels on each wing that carry information forward from the pre-processing to the post-processing are limited to carry information only about the setting variables (S, S', T and T'), while in Figs. 2 and 3, they can also carry information about the common cause that acts on the local pre-processings.
Figure 9: An example of a free operation in the WPICC approach, using the diagrammatic conventions of this article. (Compare with Fig. 1(b) of Ref. [18].) Here, we see an example in which there is communication from the left wing to the right wing, which (in contrast to our approach) is allowed for free in the WPICC approach, for all times prior to when the wings receive the inputs S and T.
Figure 10: Fig. 1(a) of Ref. [18], using the diagrammatic conventions of this article. The set of operations having this form is not as general as those depicted in our Fig. 3 because the post-processing does not have complete access to the shared randomness available at the pre-processing. One can explicitly show that the set of operations having this form is not convex. It is only after taking the convex closure of the set of operations depicted here that one recovers LOSR.
This difference is also reflected in the equations. The most general free operation proposed by Ref. [18] is defined via their Eq. (7). In terms of the notation of this article, their Eq. (7) asserts that where P ST |S'T' (denoted I (L) in Ref. [18]) and P X'Y' |XY ST S'T' (denoted O (L) in Ref. [18]) represent, respectively, the pre- and post-processings (depicted in Fig. 10). Consistently with their Fig. 1(a), the expression for the post-processing stipulates that the side-channel between the pre- and post-processings only carries information about the setting variables S, S', T and T'. The analogue of this equation for the proposal endorsed here is where Z A and Z B represent the variables propagated along the side-channels in Fig. 3. This is more general, given that Z A and Z B can encode information about the common cause in the pre-processing.
To see the difference more explicitly, consider how these expressions appear if one includes the common causes. Because the pre- and the post-processings in the proposal of Ref. [18] must depend on independent sources of shared randomness (by virtue of the restriction on the side-channels), we distinguish the common causes notationally using primed and unprimed variables. The post-processing is given by and the pre-processing is given by Putting these together, we have
By contrast, the proposal endorsed here distributes a single source of shared randomness between the pre- and post-processings. If we consider the circuit depicted in Fig. 3 and note that the side-channels can now feed forward not just S, S', T and T', but the common cause as well, we see that we can express the most general free operation as follows (which is equivalent to Eq. (9)) The operational discrepancy between the two proposals is a consequence of the fact that Eq. (51) is strictly less general than Eq. (52).
One can intuitively expect a failure of convex closure for the set of operations depicted in Fig. 10 and described in Eq. (51), since the pre-processing and the post-processing have access to independent sources of shared randomness, and these two sources cannot generally be subsumed into a single source. To explicitly demonstrate the failure of convexity, we consider the following operations: While τ 1 and τ 2 are each free operations which can be realized using the circuit in Fig. 10, the transformation τ 3 defined by their mixture cannot be realized using this circuit. To see this, note that in Fig. 10, any correlations between S and Y can only be mediated by T , since the only variable in the causal past of both S and Y is the variable acting as the common cause of the pre-processing, which we will denote by Λ 1 , and the only means by which the value of Λ 1 could be communicated via the side channel is through T . But in this example, T does not vary and so cannot mediate any correlations; the point distribution on T screens off any correlation between S and Y . Hence, τ 3 , which exhibits perfect correlation between S and Y , cannot be realized in a circuit of the form of Fig. 10. It follows that the set of operations depicted in Fig. 10 is not convexly closed. Despite the definition of the free operations given in Ref. [18], in Appendix A of that article, the authors avail themselves of convex mixtures of operations of the sort described by their Fig. 1(a) and Eq. (7). However, a mixing operation is only allowed if the shared randomness required to implement it is present, and given that the shared randomness available for the pre-selection is independent from that which is available for the post-selection, an arbitrary mixing operation is not allowed under the free operations proposed by Ref. [18]. The use of convex mixtures in Appendix A of Ref. [18] is therefore inconsistent with the definition of the free operations provided therein.
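The screening-off argument above can be illustrated with a toy calculation. The sketch below does not reproduce the article's definitions of τ 1 , τ 2 and τ 3 ; it uses hypothetical stand-ins in which each operation is summarized only by the joint distribution it induces on the pair (S, Y), and it assumes, following the argument in the text, that a circuit of the form of Fig. 10 with T held fixed can only induce product distributions on that pair.

```python
from itertools import product

PAIRS = list(product((0, 1), repeat=2))

def point(s, y):
    """Joint distribution on (S, Y) concentrated on a single pair of values."""
    return {pair: (1.0 if pair == (s, y) else 0.0) for pair in PAIRS}

def mix(p, q, weight=0.5):
    """Convex mixture of two joint distributions."""
    return {pair: weight * p[pair] + (1 - weight) * q[pair] for pair in PAIRS}

def factorizes(joint, tol=1e-12):
    """True if the joint distribution equals the product of its marginals."""
    p_s = {s: sum(joint[(s, y)] for y in (0, 1)) for s in (0, 1)}
    p_y = {y: sum(joint[(s, y)] for s in (0, 1)) for y in (0, 1)}
    return all(abs(joint[(s, y)] - p_s[s] * p_y[y]) <= tol for s, y in PAIRS)

tau1 = point(0, 0)      # hypothetical stand-in: S = 0 and Y = 0 with certainty
tau2 = point(1, 1)      # hypothetical stand-in: S = 1 and Y = 1 with certainty
tau3 = mix(tau1, tau2)  # equal mixture: S and Y uniformly random, perfectly correlated
```

Each stand-in induces a product distribution, while their equal mixture does not, which mirrors the failure of convex closure described in the text.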
The mistake of defining the free operations as this nonconvex subset of LOSR is repeated in Ref. [21]: Fig. 1 and Eq. (13) therein are reproductions of Fig. 1(a) and Eq. (7) of Ref. [18], and, like the latter, limit the side-channels to carry information only about the setting variables. It is also repeated in Ref. [20], where the formalization of a "noncontextual wiring" per Eq. (9) there utilizes a post-processing with randomness independent from that of the pre-processing, again restricting the side-channels to carry exclusively information pertaining to the setting variables.
The above discussion has highlighted the fact that if one wishes the set of free operations to include arbitrary convex mixtures of some smaller set, it is important that it be stipulated precisely how the shared randomness is distributed in order to ensure the possibility of such mixing. In this regard, although de Vicente [16] provided a definition of the free operations that is equivalent to LOSR, the physical justification for this choice was wanting. Specifically, the definition in Ref. [16] proceeds by enumerating a long list of nominally 'elementary' operations and then stating (in Section 4.1 of that article) that any mixture of these operations is also allowed. No discussion is provided of why the type of shared randomness necessary for achieving arbitrary mixtures should be considered freely available. The work of Geller and Piani [17], by contrast, does stipulate the physical structure of the circuit that defines the free operations, thereby providing a physical justification for taking LOSR as the set of free operations.

A.2.1 Previous results in light of this oversight
Given that some previous work [18,20,21] formally defined the set of free operations by Eq. (47), which yields a nonconvex subset of LOSR, one might wonder to what extent the results reached by those works still hold for LOSR proper, as defined in Eq. (48). In the following, we will briefly comment on some results described in Refs. [18] and [21].
Lemma 6 of Ref. [18] purports to demonstrate that if a function is a monotone relative to a set of operations that the authors term "LOSR", then it is also a monotone relative to WPICC. If one interprets the set of operations termed "LOSR" by the authors of Ref. [18] in the manner of the definition stipulated by their Fig. 1(a) or their Eq. (7) (which is equivalent to Eq. (47) above), namely, as a nonconvex subset of LOSR proper, as defined in Eq. (48), then the question would arise as to whether an analogous lemma holds for LOSR proper rather than simply the nonconvex subset thereof. In fact, however, the proof of Lemma 6 in Ref. [18] assumes that the set of free operations can map a resource R to a convex combination of R with any local box. This is not possible if the set of free operations is the one defined by their Fig. 1(a) or Eq. (7) (or equivalently, by Eq. (47) above). Hence, the proof of Lemma 6 holds only if the set of free operations termed "LOSR" in the statement of the lemma is taken to be LOSR proper, as defined in Eq. (48), and not the nonconvex subset of LOSR defined by Eq. (47). This fact provides yet another piece of evidence that the nonconvexity of the formal definition of the free operations in Ref. [18] was merely an oversight. The bottom line is that the proofs provided in Ref. [18] do establish that monotonicity relative to LOSR proper implies monotonicity relative to WPICC.
Kaur et al. [21] state in their Proposition 6 that their proposed "intrinsic non-locality" measure is monotonically nonincreasing under the set of free operations they term "LOSR". But given that their definition of this term is precisely the same as the definition provided in Ref. [18], the set of operations in question is the nonconvex subset of LOSR defined by Eq. (47). This prompts the question of whether this proposition holds if one considers LOSR proper, as defined in Eq. (48), rather than this nonconvex subset thereof.
The answer is that it does. Establishing this is nontrivial, however, as an arbitrary monotone relative to the nonconvex subset of LOSR defined by Eq. (47) need not be a monotone relative to LOSR. Note, however, that if (i) a function f is a monotone relative to LDO, and (ii) f happens to be a convex function, then f is also a monotone relative to LOSR, as a consequence of Proposition 10. Since LDO is contained within the nonconvex subset of LOSR defined by Eq. (47), convex monotones relative to those limited operations are also valid monotones relative to LOSR proper. Finally, we can use this implication to recover Ref. [21]'s Proposition 6 by leveraging Proposition 7 there regarding the convexity of "intrinsic non-locality" over box-type resources.
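The implication from conditions (i) and (ii) to monotonicity under LOSR can be spelled out in one chain of inequalities. As a sketch, assume (per Proposition 10) that an arbitrary LOSR operation $\tau$ can be written as a convex mixture of LDO operations; the symbols $\tau_i$ and $w_i$ are introduced here only for illustration.

```latex
\begin{align*}
f\bigl(\tau(R)\bigr)
  &= f\Bigl(\sum_i w_i\,\tau_i(R)\Bigr)
  && \text{($\tau=\sum_i w_i\tau_i$, $\tau_i\in\mathrm{LDO}$, $w_i\ge 0$, $\sum_i w_i=1$)}\\
  &\le \sum_i w_i\, f\bigl(\tau_i(R)\bigr)
  && \text{(convexity of $f$)}\\
  &\le \sum_i w_i\, f(R) \;=\; f(R)
  && \text{(monotonicity of $f$ under LDO)}
\end{align*}
```

Neither step uses any property of the nonconvex subset beyond the fact that it contains LDO, which is why convexity of the function is the only extra ingredient needed.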

A.3 Generalizing from Bell scenarios to more general causal structures
In the introduction, we contrasted our approach to defining a resource theory, which we termed the causal modelling paradigm, with a pre-existing approach, which we termed the strictly operational paradigm. Considering causal scenarios beyond Bell scenarios helps to clarify the differences between these two approaches.
Consider, for instance, a tripartite box-type process, with setting variables for the three wings denoted $S$, $T$, and $U$, and outcome variables for the three wings denoted $X$, $Y$, and $Z$ respectively. One can distinguish two distinct causal structures that could underlie this sort of process: (i) the tripartite Bell scenario, where there is a common cause acting on all three wings, depicted in Fig. 11, and (ii) the triangle-with-settings scenario [52, Fig. 8], where there is a common cause for each pair of wings, depicted in Fig. 12.

Consider the case of a generic box in the tripartite Bell scenario, depicted in Fig. 11(a), and label the systems distributed to the three wings by $A$, $B$ and $C$ respectively. Let us denote by $r^A_{x|s}$ the GPT representation of the $X = x$ outcome of the $S = s$ measurement on system $A$, and similarly define $r^B_{y|t}$ and $r^C_{z|u}$. If $s^{ABC}$ denotes the GPT state of the composite $ABC$, then the conditional probability distribution associated to this box is When the GPT is classical probability theory, we obtain the classically-realizable box shown in Fig. 11(b), and the conditional probability distribution associated to it is

Now consider a generic box in the triangle-with-settings scenario, depicted in Fig. 12(a). Instead of an arbitrary joint GPT state $s^{ABC}$ on the triple of systems associated to the three wings, each system is composed of two parts ($A$ is composed of $A_1$ and $A_2$, and similarly for $B$ and $C$), and the joint GPT state has the form $s^{A_1B_1} \otimes s^{A_2C_1} \otimes s^{B_2C_2}$. The conditional probability distribution associated to this box is When the GPT is classical probability theory, we obtain the classically-realizable box shown in Fig. 12(b).
Taking $\Lambda_A = (\Lambda_{A_1}, \Lambda_{A_2})$, and similarly for $\Lambda_B$ and $\Lambda_C$, we have

As we see, the form of the GPT-realizable boxes in the tripartite Bell scenario differs from the form of the GPT-realizable boxes in the triangle-with-settings scenario. The same is true of the form of the classically realizable boxes. These differences have consequences when one compares the strictly operational paradigm with our causal modelling paradigm, as we argue next. We begin by considering what each paradigm implies for the definitions of the free and enveloping sets of resources for each scenario.
For the tripartite Bell scenario, the definitions of both the enveloping process theory and the free subtheory of processes that are natural from the perspective of the causal modelling paradigm can also be expressed in a way that is natural within the strictly operational paradigm. Specifically, the boxes in the enveloping theory, which we take to be those that are realizable in a GPT causal model of this scenario (formalized in Eq. (53)), can also be characterized as those that are nonsignalling between the wings. Similarly, the boxes in the free subtheory, which we take to be those that are realizable in a classical causal model of this scenario (formalized in Eq. (54)), can also be characterized as those that are mixtures of deterministic boxes which are nonsignalling between the wings.
For the triangle-with-settings scenario, on the other hand, the set of boxes realizable in a GPT causal model for that scenario (formalized in Eq. (55)) is a strict subset of the boxes that are nonsignalling between the wings, and the set of boxes that are realizable in a classical causal model for that scenario (formalized in Eq. (56)) is a strict subset of the set of boxes that are mixtures of deterministic boxes that are nonsignalling between the wings. In both the enveloping theory and the free subtheory, the set of boxes is characterized via nontrivial inequalities in addition to the equalities that represent the no-signalling constraints. See Ref. [93] for a discussion of these inequalities in the special case of trivial setting variables. Consequently, within the causal modelling paradigm, the resource theory associated to the triangle-with-settings scenario and the resource theory associated to the tripartite Bell scenario differ in both the choice of enveloping theory and the choice of free subtheory. Within the strictly operational paradigm, however, it is unclear whether there is any natural way to pick out the enveloping theory and free subtheory that the causal modelling paradigm dictates for the triangle-with-settings scenario because it is unclear whether there is any natural way of picking these out by referring merely to the input-output functionality of the boxes.

Now, we shift our attention to what each paradigm implies for the definitions of the free operations in each scenario. We will show that the definitions that are natural within the causal modelling paradigm cannot be easily motivated within the strictly operational paradigm.
The free operations prescribed by the causal modelling paradigm for the tripartite Bell scenario are depicted in Fig. 13. They are of the form which is clearly a convex set. This, we believe, is the appropriate definition of local operations and shared randomness for three parties.
This scenario does not show much difference with what would be natural in the strictly operational paradigm, because one can motivate taking this set of operations to be free on the grounds that they take nonsignalling boxes to nonsignalling boxes (even though, as in the case with the bipartite Bell scenario, the set of WPICC operations between the three wings can also be motivated in this way).
It is the free operations in the triangle-with-settings scenario that really distinguish the causal modelling paradigm from the strictly operational paradigm.
The free operations prescribed by the causal modelling paradigm for the triangle-with-settings scenario are depicted in Fig. 14. They are of the form Note that this is not a convex set. Furthermore, since a triple of pairwise common causes can be simulated by a triplewise common cause, the free operations defined in Eq. (58) are a strict subset of the tripartite LOSR operations defined in Eq. (57). It follows that, just as we saw for the free boxes in the triangle-with-settings scenario, one cannot motivate the free operations defined in Eq. (58) by appeal to the no-signalling principle. And, again just as we noted for the free boxes, it is unclear how such a choice could ever be motivated by a principle that appealed only to the input-output functionality of the operation.

The triangle-with-settings scenario also illustrates why one should not mathematically impose convex closure of the set of free operations, as was done in Ref. [18]. Rather, whether or not the set of free operations is convexly closed depends on the causal structure, which specifies precisely how randomness is shared among the parties. For Bell scenarios, the set of free operations is convex by construction, whereas for other causal structures, such as the triangle-with-settings scenario, it is not. Mathematically imposing convex closure in the triangle-with-settings scenario would be equivalent to asserting that there was a common cause for all three wings, which would constitute a change in the causal structure being considered. In other words, imposing convexity in an ad-hoc manner contradicts the foundations of the causal modelling paradigm, where it is the causal structure that specifies how randomness is shared among the parties, and consequently specifies whether or not convexity holds.
Note finally that the lack of convexity in general causal structures (such as the triangle-with-settings scenario) implies that the project of quantifying nonclassicality in these cases will be much more complicated than it was in the Bell scenario.

In this section, we present some arguments that aid in justifying Proposition 14, the proof of which is given at the end of this appendix. Recall Proposition 14: where $\alpha$ is the value appearing in the decomposition $R = \gamma L^{\rm bb}_R + (1-\gamma)C_k(\alpha)$, where $C_k(\alpha) \in C_{{\rm NPR},k}$, $L^{\rm bb}_R \in L^{\rm bb}_k$ and $\gamma \in [0, 1]$. This value of $\alpha$ is unambiguous because there exists a unique resource $L^{\rm bb}_R \in L^{\rm bb}_k$ and a unique choice of $\gamma \in [0, 1]$ and of $\alpha \in [0, 1]$ such that $R = \gamma L^{\rm bb}_R + (1-\gamma)C_k(\alpha)$. The (unique) relevant decomposition is shown in Fig. 5 (for the case where $k = 0$).

We first demonstrate the equivalence of three statements which pertain to the value of $M_{\rm NPR}(R)$ for the subset of resources that satisfy ${\rm CHSH}(R) \ge 2$:
Proposition 24. For any resource $R$ of type $\binom{2\,2}{2\,2}$ such that ${\rm CHSH}(R) \ge 2$, the following definitions are equivalent to $M_{\rm NPR}(R)$:

$$\begin{cases} {\rm CHSH}(R) & \text{if } R \in C_{\rm NPR},\\ 2\alpha+2 & \text{else if } R \notin C_{\rm NPR}, \end{cases} \qquad (59c)$$

where $\alpha, \gamma \ge 0$, and $L^{\rm bb}_R \in L^{\rm bb}$ are all unique in the decomposition $R = \gamma L^{\rm bb}_R + (1-\gamma)C(\alpha)$.

Proof of Eq. (59a). Eq. (59a) is directly equivalent to the definition of the $M_{\rm NPR}$ monotone given in Eq. (36). Hence, we take that as our starting point, and prove the implications of the subequations in Proposition 24.
Proof that Eq. (59a) ⇔ Eq. (59b). Section 5 guarantees that $C(\alpha) \longrightarrow R$ if and only if we can generate $R$ by convex mixtures of $C(\alpha)$ with the images of $C(\alpha)$ under LDO operations. For any $R \notin C_{\rm NPR}$ such that ${\rm CHSH}(R) \ge 2$, we simplify the situation by proving that if $R$ can be generated by mixing $C(\alpha)$ with its images under LDO, then $R$ can alternatively be generated by mixing $C(\alpha)$ with a local point which saturates the CHSH inequality; namely, a point in $L^{\rm b}$, as stated in Eq. (59b). To prove this, it is useful to define the notion of a screening-off inequality. … $\subset S^{\rm free}\binom{2\,2}{2\,2}$.

Screening-off inequalities are useful when making statements about resource convertibility, as follows. Consider the case where we ask whether $R_2 \longrightarrow R_1$: if $R_1$ lies inside some screened-off region, then, given Proposition 10, $R_1 \in P^{[R_1]}_{\rm LOSR}(R_2)$ if and only if $R_1$ is in the convex hull of those images of $R_2$ under LDO inside the screened-off region, together with the boundary (where the inequality is saturated). Formally, if $f(R) \ge b$ is a screening-off inequality for resources of type $[R_1]$, then, given Proposition 10, Since ${\rm CHSH}(R) \ge 2$ is a screening-off inequality whose saturation-boundary is given by $L^{\rm b}$, and since the only image in $V^{\binom{2\,2}{2\,2}}_{\rm LDO}(C(\alpha))$ which violates the CHSH inequality is $C(\alpha)$ itself, it follows that The equivalence Eq. (59a) ⇔ Eq. (59b) follows. As a final comment, notice that this characterization of convertibility in terms of the existence of a geometric decomposition involves arbitrary points which saturate the CHSH inequality, and there are typically many such decompositions.
Proof that Eq. (59b) ⇔ Eq. (59c). Recall that Eq. (59b) involves a minimization under the constraint that $\alpha$ is such that $R = \gamma L^{\rm b}_R + (1-\gamma)C(\alpha)$. We can formally recast it as a constrained optimization problem, as follows: Essentially, this is a constrained optimization problem with a linear objective subject to one nonlinear constraint; namely, that the smallest conditional probability in the expression $L^{\rm b}_R$$^{40}$ must be nonnegative. For such optimization problems, it is always the case that the objective is optimized when the constraint is not merely satisfied but saturated. Put another way, the set of achievable $\alpha$ arise from points $L^{\rm b}_R$ wherein all conditional probabilities are nonnegative, but the optimal $\alpha$ arises for some unique $L^{\rm b}_R = L^{\rm bb}_R$ where the smallest conditional probability in $L^{\rm bb}_R$ is precisely zero.

Proof of Proposition 16. A set of monotones is complete relative to a family of resources if and only if every candidate conversion among resources in the family which is not ruled out by any of the monotones in the set is in fact possible for free, as per Eq. (21). In Fig. 15(a), we depict in blue the set of candidate conversions (from a generic resource $R(\alpha_1, \gamma_1)$ to another resource in the family) which are not ruled out by $\{M_{\rm CHSH}, M_{\rm NPR}\}$; namely, the blue shaded region contains all resources which have a value for each of the two monotones that is equal to or lower than that of $R(\alpha_1, \gamma_1)$. To prove the proposition, we argue that $R(\alpha_1, \gamma_1)$ can indeed be converted to any resource in the blue region. By convexity, it suffices to prove that $R(\alpha_1, \gamma_1)$ can be converted to each of the four extreme points of the blue region. Since $L^{\rm bb}$ and $L^{\rm b}_{\rm NPR}$ are free resources, $R(\alpha_1, \gamma_1)$ can freely be converted to either of them, and the resource $R(\alpha_1, \gamma_1)$ can obviously be 'converted' to itself, as the identity is free.
Our proof, therefore, focuses on demonstrating that $R(\alpha_1, \gamma_1)$ can indeed be converted to the fourth extreme point $R(\alpha_2, 0)$, shown as a green star. We now give the explicit free operation which takes a generic initial resource $R(\alpha_1, \gamma_1)$ and projects it onto the chain leftwards in the two-dimensional coordinate system of Fig. 15(a), i.e., to the target resource $R(\alpha_2, 0)$, where $\alpha_2 = \alpha_1(1 - \gamma_1)$.$^{41}$

Figure 15 (caption): … as were introduced in Fig. 6. We consider a two-parameter family of resources. A generic such resource, specified by $\alpha_1$ and $\gamma_1$, is marked by a red diamond. Also depicted are some of the level curves of the two monotones $M_{\rm CHSH}$ and $M_{\rm NPR}$. The solid dark blue region denotes the set of all resources within this family which have values for both monotones less than or equal to their values for $R(\alpha_1, \gamma_1)$. To prove Proposition 16, one must show that $R(\alpha_1, \gamma_1)$ can be converted to any resource in the solid blue region. The critical step in this proof is the demonstration that it is possible to convert any resource to one lying on the line connecting $R_{\rm PR}$ and $L^{\rm b}_{\rm NPR}$ without changing the value of $M_{\rm CHSH}$. Graphically, this corresponds to converting the generic resource $R(\alpha_1, \gamma_1)$ to the resource $R(\alpha_2, 0)$ marked by a green star.
We denote the free operation which enacts this conversion by $\tau_{\text{erase-}\gamma}$; it is the operation which projects any resource into the subspace of resources that are invariant under the $G_{456}$ subgroup of ${\rm LSO}\binom{2\,2}{2\,2}$ ($G_{456}$ is defined in Proposition 11c on page 21), i.e., onto the chain $C_{\rm NPR}$.$^{42}$ This operation is indeed free, as it can be constructed by a uniform mixture of all the elements of $G_{456}$, each of which is free. Recall that $G_{456}$ is the subgroup of ${\rm LSO}\binom{2\,2}{2\,2}$ which stabilizes ${\rm CHSH}_0$, and therefore clearly does not modify the value of the $M_{\rm CHSH}$ monotone.
It remains only to show that the $G_{456}$-invariant subspace of resources within the set of all $\binom{2\,2}{2\,2}$-type resources for which ${\rm CHSH}(R) \ge 2$ is the chain $C_{\rm NPR}$, i.e., the line of points between $R_{\rm PR}$ and $L^{\rm b}_{\rm NPR}$. This is evident by confirming that $\tau_{\text{erase-}\gamma}$ leaves $R_{\rm PR}$ invariant, but maps each of the 8 deterministic CHSH-saturating boxes to $L^{\rm b}_{\rm NPR}$. Those 1+8 resources are the extreme points of the set of all $\binom{2\,2}{2\,2}$-type resources such that ${\rm CHSH}(R) \ge 2$; since the extreme points map to the line under the action of $\tau_{\text{erase-}\gamma}$, by convex linearity it follows that the chain is the only space invariant under $G_{456}$ within the two-parameter family.
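The two facts used in this step (invariance of the PR box and the collapse of the 8 deterministic CHSH-saturating boxes to a single point) can be confirmed by brute force. The following is a minimal sketch under stated assumptions: the group $G_{456}$ is not constructed from its definition in Proposition 11c; instead the code takes it to be the set of local relabelings of settings and outcomes that leave the PR box invariant, which is the same as stabilizing the CHSH functional used here. The parametrization of the relabelings and all function names are illustrative choices.

```python
import itertools
import numpy as np

IDX = list(itertools.product(range(2), repeat=4))    # index tuples (a, b, s, t)
LOCAL = list(itertools.product(range(2), repeat=3))  # (c, alpha, beta) for one wing

def relabel(P, gA, gB):
    """Local relabeling: Q(a,b|s,t) = P(a^alA^(beA&s), b^alB^(beB&t) | s^cA, t^cB)."""
    (cA, alA, beA), (cB, alB, beB) = gA, gB
    Q = np.zeros_like(P)
    for a, b, s, t in IDX:
        Q[a, b, s, t] = P[a ^ alA ^ (beA & s), b ^ alB ^ (beB & t), s ^ cA, t ^ cB]
    return Q

def chsh(P):
    """CHSH value with sign pattern (-1)^(a XOR b XOR st); equals 4 on the PR box."""
    return sum((-1) ** (a ^ b ^ (s & t)) * P[a, b, s, t] for a, b, s, t in IDX)

PR = np.zeros((2, 2, 2, 2))
for a, b, s, t in IDX:
    if (a ^ b) == (s & t):
        PR[a, b, s, t] = 0.5

# Assumption: this stabilizer plays the role of G_456.
STABILIZER = [(gA, gB) for gA in LOCAL for gB in LOCAL
              if np.allclose(relabel(PR, gA, gB), PR)]

def twirl(P):
    """Uniform mixture over the stabilizer, standing in for tau_erase-gamma."""
    return sum(relabel(P, gA, gB) for gA, gB in STABILIZER) / len(STABILIZER)

def deterministic_boxes():
    """All 16 boxes with a = f(s) and b = g(t) for deterministic f, g."""
    boxes = []
    for f0, f1, g0, g1 in IDX:
        D = np.zeros((2, 2, 2, 2))
        for s, t in itertools.product(range(2), repeat=2):
            D[(f0, f1)[s], (g0, g1)[t], s, t] = 1.0
        boxes.append(D)
    return boxes

saturating = [D for D in deterministic_boxes() if np.isclose(chsh(D), 2.0)]
images = [twirl(D) for D in saturating]
```

Under these assumptions the stabilizer has 8 elements, the twirl leaves the PR box fixed, and every saturating deterministic box is sent to the same box with CHSH value 2, namely the equal mixture of the PR box and uniform noise.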
Before presenting the proof, we introduce some additional concepts and a few lemmas on which our proof relies. Throughout the following, we are focused on sets of resources of fixed type, and on type-preserving operations. Hence, we use slightly abbreviated notation; e.g. $V_{\rm LDO}(R)$ is used as shorthand for $V^{[R]}_{\rm LDO}(R)$, and so on.

We bring up the property of sensitivity because (i) it is straightforward to test if a given resource is sensitive or not by means of a linear program, and (ii) eventually we will argue that if a resource is sensitive, then it is also orbital. Furthermore, we now prove that sensitive resources never appear in isolation. That is, a single sensitive resource can be used to construct a set of sensitive resources, as follows:

Lemma 28. For any resource $R$, every resource $R'$ which is below $R$ in the pre-order and which cannot be generated from $R$ by mixtures of LDTNO operations is sensitive. Formally: the set of resources $S^{\rm sens}_R := P_{\rm LOSR}(R) \setminus {\rm Hull}_{\rm LDTNO}(R)$ is always sensitive.
Proof. First, note two related, useful facts:

(1) The composition of any deterministic operation (invertible or not) followed by some deterministic nonsymmetry operation is precisely some (other) deterministic nonsymmetry operation. Formally, if $\tau_{\rm LDTNO} \in {\rm LDTNO}$ and $\tau_{\rm LDO} \in {\rm LDO}$, and defining $\tau := \tau_{\rm LDTNO} \circ \tau_{\rm LDO}$, then $\tau \in {\rm LDTNO}$. A consequence of this is that the entire set $P_{\rm LOSR}(R)$ is mapped to the set ${\rm Hull}_{\rm LDTNO}(R)$ under LDTNO and convex mixtures thereof. To see this, recall that the image of any convex set of resources under any convex set of operations is identically the convex hull of the images of the extremal resources under the extremal operations (in the respective sets). We use this fact to effectively replace $P_{\rm LOSR}(R)$ with $V_{\rm LDO}(R)$ and to replace convex mixtures of LDTNO with LDTNO itself, without loss of generality. In summary: ${\rm Hull}_{\rm LDTNO}(P_{\rm LOSR}(R)) = {\rm Hull}_{\rm LDTNO}(R)$ by virtue of the fact that ${\rm LDTNO} \circ {\rm LDO} = {\rm LDTNO}$.
(2) The composition of any deterministic nonsymmetry operation followed by some deterministic operation (invertible or not) is some (other) deterministic nonsymmetry operation. Formally, if $\tau_{1\text{-LDTNO}} \in {\rm LDTNO}$ and $\tau_{2\text{-LDO}} \in {\rm LDO}$, and defining $\tau_4 := \tau_{2\text{-LDO}} \circ \tau_{1\text{-LDTNO}}$, then $\tau_4 \in {\rm LDTNO}$. A consequence of this is that the entire set ${\rm Hull}_{\rm LDTNO}(R)$ is mapped to itself under LOSR. To see this, we reuse the shortcut of considering only extremal resources and extremal operations. Specializing to our objects of interest, we effectively replace the operations-set LOSR by its extremal operations, namely LDO, and the resources-set ${\rm Hull}_{\rm LDTNO}(R)$ by $V_{\rm LDTNO}(R)$, without loss of generality. In summary: $P_{\rm LOSR}({\rm Hull}_{\rm LDTNO}(R)) = {\rm Hull}_{\rm LDTNO}(R)$ by virtue of the fact that ${\rm LDO} \circ {\rm LDTNO} = {\rm LDTNO}$.

Now we are in position to prove Lemma 28. The set of resources $R'$ below $R$ in the pre-order is identically $P_{\rm LOSR}(R)$. The set of resources which can be generated from $R$ by mixtures of deterministic nonsymmetry operations is identically ${\rm Hull}_{\rm LDTNO}(R)$. So, a resource $R'$ is below $R$ in the pre-order and cannot be generated from $R$ by mixtures of deterministic nonsymmetry operations if and only if $R' \in P_{\rm LOSR}(R) \setminus {\rm Hull}_{\rm LDTNO}(R) =: S^{\rm sens}_R$. Now, consider any $\tau \in {\rm LDTNO}$ and any $R' \in S^{\rm sens}_R$, and define $R'' := \tau \circ R'$. Since we have established that the entirety of $P_{\rm LOSR}(R)$ is mapped to ${\rm Hull}_{\rm LDTNO}(R)$ under LDTNO, it follows that $R'' \in {\rm Hull}_{\rm LDTNO}(R)$. However, since we have also established that the entirety of ${\rm Hull}_{\rm LDTNO}(R)$ is mapped only to itself under LOSR, and since $R' \notin {\rm Hull}_{\rm LDTNO}(R)$, it further follows that $R'' \not\longrightarrow R'$. Evidently, any $R' \in S^{\rm sens}_R$ is removed from its equivalence class by every deterministic nonsymmetry operation, i.e., $S^{\rm sens}_R$ is sensitive. This proves the lemma.
Note that Lemma 28 implies that if $R$ is sensitive, and $R'$ is equivalent to $R$, then $R'$ is also sensitive.

Lemma 29. If a resource is sensitive, then it is also orbital. That is, if two sensitive resources are interconvertible under type-preserving LOSR, then they are interconvertible under LSO.
Proof. Let $R$ and $R'$ be distinct sensitive resources that are interconvertible under type-preserving LOSR, i.e., $R \neq R'$ but $R \longleftrightarrow R'$ and $[R] = [R']$. Any operation which preserves the equivalence class of a sensitive resource can be expressed as a convex combination of elements of LSO. The assumption of sensitivity thus dictates that $R'$ is in the convex hull of the images of $R$ under LSO, and vice versa. We proceed to show that this sort of relationship must imply that $R \in V_{\rm LSO}(R')$ and $R' \in V_{\rm LSO}(R)$, that is, that $R$ and $R'$ are LSO-equivalent.$^{43}$ This can be seen by recognizing that the 2-norm is a convex function invariant under LSO, meaning $R' \in {\rm ConvexHull}(V_{\rm LSO}(R))$ implies $\|R'\|_2 \le \|R\|_2$.$^{44}$ By symmetry under exchange of $R$ and $R'$, it holds that $\|R\|_2 \le \|R'\|_2$, and hence $\|R\|_2 = \|R'\|_2$. The 2-norm, moreover, strictly decreases under nontrivial stochastic mixing;$^{45}$ hence all interconversions between equivalent sensitive resources must be mediated by deterministic symmetries. Formally: if $R' = \sum_i w_i\, \tau_i \circ R$ and $R = \sum_i w'_i\, \tau_i \circ R'$ with $\tau_i \in {\rm LSO}$, $\sum_i w_i = \sum_i w'_i = 1$ and $\{w_1, \ldots, w_n, w'_1, \ldots, w'_n\} \ge 0$, then $\|R\|_2 = \|R'\|_2$ and $w_i, w'_i \in \{0, 1\}$.
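The two properties of the 2-norm invoked in this proof (invariance under symmetry operations and strict decrease under nontrivial mixing) can be illustrated numerically. The sketch below is an illustration only: the example resource (a mixture of the PR box with one deterministic box) and the parametrization of the local relabelings are assumptions chosen for convenience, not objects defined in the text.

```python
import itertools
import numpy as np

IDX = list(itertools.product(range(2), repeat=4))    # index tuples (a, b, s, t)
LOCAL = list(itertools.product(range(2), repeat=3))  # (c, alpha, beta) for one wing

def relabel(P, gA, gB):
    """Local relabeling of settings and outcomes; it only permutes the 16 entries of P."""
    (cA, alA, beA), (cB, alB, beB) = gA, gB
    Q = np.zeros_like(P)
    for a, b, s, t in IDX:
        Q[a, b, s, t] = P[a ^ alA ^ (beA & s), b ^ alB ^ (beB & t), s ^ cA, t ^ cB]
    return Q

# Illustrative resource: 3/4 PR box plus 1/4 of the deterministic box with a = b = 0.
PR = np.zeros((2, 2, 2, 2))
for a, b, s, t in IDX:
    if (a ^ b) == (s & t):
        PR[a, b, s, t] = 0.5
D = np.zeros((2, 2, 2, 2))
D[0, 0, :, :] = 1.0
R = 0.75 * PR + 0.25 * D

# Every image of R under a relabeling has the same 2-norm as R.
orbit = [relabel(R, gA, gB) for gA in LOCAL for gB in LOCAL]
orbit_norms = [np.linalg.norm(Q) for Q in orbit]

# A nontrivial mixture of two distinct images has strictly smaller 2-norm.
flipped = relabel(R, (0, 1, 0), (0, 0, 0))   # flip the first party's outcome
mixture = 0.5 * R + 0.5 * flipped

print(np.linalg.norm(R), np.linalg.norm(mixture))
```

Because each relabeling permutes the entries of the probability vector, the norm is constant on the orbit, while the mixture lies strictly inside the corresponding hypersphere.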
Lemma 30. $R_{\rm PR}$ is a sensitive resource, and $P_{\rm LOSR}(R_{\rm PR}) \setminus {\rm Hull}_{\rm LDTNO}(R_{\rm PR})$ is the entire eight-dimensional set of all nonfree resources of type $\binom{2\,2}{2\,2}$.

Proof by inspection. One can readily verify that $\tau \circ R_{\rm PR} \in S^{\rm free}\binom{2\,2}{2\,2}$ for all type-preserving LDTNO operations $\tau$.

Proof of Proposition 18.
Lemma 30 together with Lemma 28 immediately imply that all nonfree resources of type $\binom{2\,2}{2\,2}$ are sensitive. Lemma 29 then directly implies that all these resources are orbital.
A final comment: consider generalizing Proposition 18 in light of the discussion just given. If one desires to construct an orbital set of resources beyond $\binom{2\,2}{2\,2}$-type, one needs only to find some single sensitive resource $R$ of the desired type. From Lemmas 28 and 29, it then follows that the set of resources $P_{\rm LOSR}(R) \setminus {\rm Hull}_{\rm LDTNO}(R)$ constitutes an orbital set. It might be the case, for instance, that for any nontrivial choice of resource type, there is at least one convexly extremal resource that is sensitive, analogous to how the PR-box is a sensitive resource for type $\binom{2\,2}{2\,2}$.

B.4 Proof of Proposition 20: lower bound on the number of monotones in any complete set
Recall that a resource is termed orbital if and only if its LOSR-equivalence class of resources of the same type is equal to its LSO-equivalence class. We now prove Proposition 20, recalled below:

Proposition 20. For any compact set $S$ of resources that are all orbital, the intrinsic dimension of the set $S$ is a lower bound on the cardinality of a complete set of continuous monotones for $S$ (and for any superset of $S$).

$^{43}$ The fact that $R'$ is in the convex hull of the images of $R$ under permutations of $R$'s probabilities is equivalent to stating that $R$ vector majorizes $R'$. The relationship is reflexive, however. Readers familiar with vector majorization may recall that two vectors are equivalent under the majorization order if and only if they are related by some reordering, i.e., a (not necessarily physical) symmetry operation.

$^{44}$ Recall that $R$ is shorthand for the representation of the resource in terms of a real-valued vector consisting of all possible conditional probabilities, i.e., $R = \{P_{XY|ST}(xy|st) : x, y, s, t \in \{0, 1\}\}$.

$^{45}$ Consider the hypersphere consisting of all resources with 2-norm in common with $R$. All the images of $R$ under LSO lie on the surface of this hypersphere. Stochastic mixing of symmetry operations (applied to $R$) is equivalent to convexly combining different points from the surface of the hypersphere. Any convex combination of points from the surface of a hypersphere results in a final point strictly interior to the sphere. Strictly interior points are closer to the center, in precisely the sense of having a strictly smaller 2-norm.
Proof. The set of local symmetry operations for a given type has finite cardinality, and hence there are a finite number of resources in the LSO-equivalence class of any resource. For an orbital resource $R$, this implies that the LOSR-equivalence class of $R$ (over resources of type $[R]$) is precisely $V_{\rm LSO}(R)$, which is a finite set. If a compact set $S$ of orbital resources has intrinsic dimension $d$, and the LOSR-equivalence class of every resource in the set is finite and hence zero-dimensional, then it follows that one can find $d$-dimensional compact subsets of resources in $S$ in which no two resources are equivalent.$^{46}$ Hence, no two resources in such a subset are assigned the same tuple of values by any complete set of monotones. In other words, a complete set of $n$ continuous monotones maps the subset of resources injectively to $\mathbb{R}^n$. But this map can only be injective if $n \ge d$, which guarantees that the number of continuous monotones required to identify a resource in the set is at least as large as the intrinsic dimension $d$ of the set $S$. Finally, note that the number of continuous monotones required to identify a resource in any superset of $S$ must be at least as large as for the set $S$ itself, which completes the proof.