Causal influence in operational probabilistic theories

We study the relation of causal influence between input systems of a reversible evolution and its output systems, in the context of operational probabilistic theories. We analyse two different definitions that are borrowed from the literature on quantum theory, where they are equivalent. One is the notion based on signalling, and the other one is the notion used to define the neighbourhood of a cell in a quantum cellular automaton. The latter definition, which we adopt in the general scenario, turns out to be strictly weaker than the former: it is possible for a system to have causal influence on another one without signalling to it. Remarkably, the counterexample comes from classical theory, where the proposed notion of causal influence determines a redefinition of the neighbourhood of a cell in cellular automata. We stress that, according to our definition, it is in any case impossible to have causal influence in the absence of an interaction, e.g. in a Bell-like scenario. We study various conditions for causal influence, and introduce the feature that we call no interaction without disturbance, under which we prove that signalling and causal influence coincide. The proposed definition has interesting consequences on the analysis of causal networks, and leads to a revision of the notion of neighbourhood for classical cellular automata, clarifying a puzzle regarding their quantisation that apparently makes the neighbourhood larger than the original one.


Introduction
In the last two decades, studies on the foundations of Quantum Theory have flourished, nurtured by the wealth of results in quantum information theory and their impact on the understanding of the quantum realm. One of the many facets of quantum theory that were explored in this perspective is causal influence and the consequent analysis of causal structures [1,2,3,4,5,6,7,8,9]. In this line of thought, the main questions regard possible criteria to determine the compatibility of observed data regarding a network of systems and hypothetical relations between them. The techniques adopted for this kind of analysis were initially developed in the context of classical Bayesian networks [10], or in the theory of quantum networks developed by different approaches [6,11,12,13,14,15,16]. A second interesting question analysed in the literature regards quantification of causal influence [9]. However, the need for abstracting the study of causal structures from the specific formalism of the theory adopted was pointed out already in Ref. [17], where the purpose was to build a microscopic theory of gravity that might require a step beyond quantum. In this perspective, a language that allows one to deal with general information processing structures without the burden of using a specific mathematical model is precisely the right tool.
The framework of Operational Probabilistic Theories (OPTs) [18,19,20] is a landscape of theories each of which could in principle provide a mathematical language for representing elementary systems and their processes, different from the classical or quantum one, but bearing important common features with those ones. OPTs are indeed defined as all the possible theories that share with classical or quantum theory some basic structure: in particular, the properties of rules by which one can form composite processes as sequences of other processes, form composite systems, or apply processes independently on subsystems of a composite system, as well as the properties of rules to calculate probabilities of events that can occur as alternative outcomes within a process. The mentioned features are sufficient to prove results that hold for all theories, or for wide classes of them, characterised by some operational property, e.g. purification [18,20], or n-local discriminability [21], and so on. Understanding entanglement [22] and its relation with complementarity [23,24], information and disturbance [25], as well as any other fundamental feature within a general OPT allows us to understand the feature itself beyond the contingent aspects that it acquires within a specific theory.
The language of OPTs is then perfectly suited to the purpose of analysing causal relations and causal structures beyond the classical or quantum realm. In the present study, we consider the relation of causal influence between systems within the OPT framework. We provide here a definition of causal influence in reference to a reversible evolution, based on how the evolution propagates the effects of an intervention, but we do not identify the propagated effect with the ability to signal, i.e. to transmit information. We then discuss the consequences of our definition, and provide a necessary and sufficient condition for its fulfilment. We study the relation between the proposed definition and the property of signalling, and show that in general our definition is strictly weaker than the traditional one based on signalling, i.e. one system can have causal influence on another one without signalling to it. An example of a theory where our definition is strictly weaker than signalling is classical information theory. This fact bears important consequences. As a remarkable example, think of the neighbourhood of a cell C in a cellular automaton: the neighbourhood of C is usually defined as the set of cells to which C can signal in one step. However, if we define the neighbourhood of C as the set of cells on which C can have causal influence in one step instead, the latter neighbourhood will be generally larger than the former one. Remarkably, some structural theorems for quantum cellular automata fail in the classical case, but keep holding if one redefines the neighbourhood according to the definition presented here (see e.g. [26]). We then study conditions under which the two definitions coincide. For this purpose, we introduce the property of no interaction without disturbance, which essentially means that every non-trivial interaction with a system enables signalling to it. As one can intuitively expect, in a theory with no interaction without disturbance, e.g. Quantum Theory or Fermionic Theory, causal influence and signalling coincide. Finally we construct a useful tool for the analysis of the causal structure of a given process. Remarkably, the latter turns out to be very powerful in detecting causal influence relations, to the extent that the case of nonlocal activation of causal influence is excluded, and the analysis of the causal neighbourhood of a system can be carried out locally.
In everyday experience, in order to establish a causal influence relation between two systems one has to perform controlled interventions on one system and then look for consequences on the other one, after a given time interval. As an example, consider a point x_0 in space. If we adopt an inertial reference frame and activate an electromagnetic source at x_0 at time t_0, assuming an isotropic medium, we expect a spherical wave front centred at x_0 to start propagating isotropically at the speed of light. One can then detect a pulse in any direction, e.g. at point x_1 after a time given by the radius of the sphere centred at x_0 and containing x_1, divided by the speed of light in the medium. In this case we claim that the perturbation of the electromagnetic field detected at x_1 is a consequence of the initial intervention at x_0. This way of revealing a causal influence relation between two systems mixes two distinct notions: the first one is based on testing the consequences of a controlled intervention performed on the first system as they propagate to the second system, and the second one is based on identifying those consequences as the possibility of transmitting information. Here we decouple these two aspects, and focus on the first one.
The question underpinning our definition is the following: if we had to simulate the consequences of a hypothetical intervention on system A that occurred before the evolution U, by intervening after U instead, would a non-trivial action on system B be required? If the answer is positive, we will say that the evolution U mediates a causal influence from A to B.

Operational probabilistic theories
In the present section we provide a brief review of the framework of operational probabilistic theories, with focus on those definitions that will be used in the remainder. For a comprehensive account of the subject, the reader is referred to [18,20,27,26].
An operational probabilistic theory provides a language to describe processes occurring on systems along with rules for the composition of processes in parallel or in a sequence, and other rules for the calculation of probabilities of events that might occur in a network of processes. The first set of rules defines an operational theory Θ, which consists in the following elements.

X
, where the label A → B emphasises the type of input and output system for the test, and will be omitted when the latter are manifest from the context, while X represents the finite set of outcomes corresponding to the individual events that might occur within the test. The collection of system types will be denoted by Sys(Θ). The set of tests of type A → B is denoted as A → B 2. An associative rule for sequential composition: the test T X ∈ A → B can be followed by the test S Y ∈ B → C , thus obtaining the sequential composition ST X×Y ∈ A → C . (b) Unit: there is a label I such that IA = AI = A for every A ∈ Sys(Θ).
A corresponding rule for composition of tests ⊗ : Also this rule has its properties (a) Associativity: (d) Braiding: for every pair of system types When S * AB ≡ S AB , the theory is symmetric. All the theories developed so far are symmetric.
All tests of an operational theory are (finite) collections of events: Similarly, for R X ∈ A → B and T Y ∈ C → D ,

The set of all events of all tests in
. By the properties of sequential and parallel composition of tests, one can easily derive associativity of sequential and parallel composition of events, as well as the analogue of Eq. (1). For every test T X ∈ A → B with T X = {T i } i∈X , and every disjoint partition {X j } j∈Y of X = j∈Y X j , one has a coarse graining operation that maps T X to We define T Xj := T j . If an element of the partition is trivial, i.e. X j = {i 0 }, then T Xj = T i0 . The parallel and sequential compositions distribute over coarse graining: Notice that for every test T X ∈ A → B there exists the singleton test T * := {T X }. One can easily prove that the identity test I A is a singleton: The unique event in a singleton test is called a channel, and the collection of channels with input A and output B deserves its own symbol: If the event A Xj belongs to a test that is obtained by coarse graining, for every i ∈ X j we will say that The collection of events of an operational theory Θ will be denoted by Ev(Θ). The above requirements make the collections Test(Θ) and Ev(Θ) the families of morphisms of two braided strict monoidal categories with the same objects-system types Sys(Θ).
An operational theory is an OPT if the tests I → I are probability distributions: 1 ≥ T i = p i ≥ 0, so that i∈X p i = 1, and given two tests S X , T Y ∈ I → I with S i = p i and T i = q i , the following identities hold Every transformation can thus be identified by the infinite family of such linear maps that it induces. In an OPT, coarse graining is represented by the sum: ] are called states, and denoted by lower-case greek letters, e.g. ρ, while events in [[Ā]] are called effects, and denoted by lowercase latin letters, e.g. a. When it is appropriate, we will use the symbol |ρ) to denote a state, and (a| to denote an effect. We will also use the circuit notation, where we denote states, transformations and effects by the symbols For composite systems we use diagrams with multiple wires, e.g.
The identity will be omitted: A I A = A . The swap S AB and its inverse will be denoted as follows In the present paper we will always assume that the theory under consideration is symmetric, however all the results can be straightforwardly generalised to the nontrivially braided case. We will consequently draw the swap S AB as An OPT Θ is specified by the collections of systems and tests, along with the parallel composition rule ⊗ Θ ≡ (Test(Θ), Sys(Θ), ⊗).
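As a concrete illustration of tests, events and coarse graining, the following minimal sketch models a test of classical theory on a bit as a collection of sub-stochastic matrices whose sum is a column-stochastic matrix. This matrix representation, the example test, and the helper names `coarse_grain` and `is_channel` are assumptions made for illustration only; they are not part of the general definition of an OPT.

```python
from fractions import Fraction as F

def add(m1, m2):
    """Entry-wise sum of two matrices given as lists of rows."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def coarse_grain(test, partition):
    """Sum the events of `test` (dict: outcome -> matrix) over each block of `partition`."""
    coarse = {}
    for label, block in partition.items():
        total = None
        for outcome in block:
            total = test[outcome] if total is None else add(total, test[outcome])
        coarse[label] = total
    return coarse

def is_channel(m):
    """A deterministic event (channel) is represented by a column-stochastic matrix."""
    columns = list(zip(*m))
    non_negative = all(entry >= 0 for col in columns for entry in col)
    return non_negative and all(sum(col) == 1 for col in columns)

# Hypothetical three-outcome test on a bit: an unreliable readout.
half = F(1, 2)
test = {
    "zero": [[half, 0], [0, 0]],
    "one": [[0, 0], [0, half]],
    "fail": [[half, 0], [0, half]],
}

# Full coarse graining yields the singleton test, whose unique event is a channel.
full = coarse_grain(test, {"all": ["zero", "one", "fail"]})["all"]
# A partial coarse graining merges only the two "successful" outcomes.
partial = coarse_grain(test, {"click": ["zero", "one"], "fail": ["fail"]})

print(is_channel(full))              # True
print(is_channel(partial["click"]))  # False
```

In this toy model the sum over all outcomes is the identity channel on the bit, while the merged event "click" is still a proper (non-deterministic) event.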

Defining causal influence
The kind of question we would like to address here is under what conditions we are allowed to say that, through the evolution given by C, interventions on system A can influence system B'. In the first place, we remark that the question is meaningful without further discussion only within theories where no information is allowed to flow from the "future" towards the "past". As proved in Ref. [18], the latter requirement is equivalent to uniqueness of the deterministic effect e_A for every system A of the theory. This property will then be assumed in the remainder.
Typically, the answer to this kind of question comes in the first place by stating when there is no influence from A to B', and then defining causal influence by negating the above condition. The no-influence relation normally boils down to the impossibility of using the channel C to send a message from A to B' (the so-called no-signalling condition). Considering the question in the quantum realm allowed various authors to highlight some aspects that warn us about the possible answers. In the first place, when C is not reversible, one cannot trust conditions for no-influence. Indeed, considering that in quantum theory every channel can be viewed as the effective description of a reversible channel on an extended system, once we neglect the "environment", no-influence might be an accident due to the preparation of a special initial state of the environment rather than a structural feature of the channel C itself. The above issue is strictly connected, in the quantum case, with a further one: it may happen that A has no influence on A' and on B' individually, but has influence on the composite system A'B'. A typical example was exhibited in Ref. [28], where the authors show a channel that allows for no influence from A to either A' or B' and no influence from B to either A' or B', but such that both A and B have influence on A'B', and actually a reversible dilation of such channel highlights a finer structure: there are dilations where both A and B influence A' and not B', and others where they both influence B' and not A' [29].
For the reasons listed above, we will define causal influence under reversible channels. For all those channels that can be obtained from reversible ones neglecting the environment as in the following scheme we will adopt only those influence relations compatible with each of its reversible dilations U . As discussed above, in quantum theory there are examples where influence relations depend on the particular dilation, and there is no influence relation that is actually independent of the reversible dilation U . It is probably useful to remind here that in all known theories every channel can be dilated to a reversible one, except for the theory of Popescu-Rohrlich boxes. The discussion of the definition of causal influence for irreversible channels is beyond the purpose of the present study.
We remark that outside the quantum, in particular in theories without local discriminability, the mentioned definition based on no-signalling can have a sort of "dual" issue to the one highlighted above: while neither A nor B signal to, say, A', it may happen that the composite system AB signals to A'.
Finally, we anticipate here that the definition that we will give is not the usual one based on the possibility or impossibility of signalling. Our definition represents a generalisation of the traditional one, and is strictly weaker: we will show examples of nonsignalling channels that however allow for causal influence. In the special case of quantum theory the two definitions coincide.

The definition
We now give the definition of no causal influence from A to B'. The definition differs from the usual one, given in terms of no-signalling conditions, and is taken from the literature on quantum cellular automata, where the neighbourhood of a given system is defined through the image of its local algebra of operators under conjugation by the unitary map representing the cellular automaton [30]. The definition for general OPTs, inspired by the latter, was given in Ref. [26].  Before checking the consequences of the present definition and finding equivalent conditions, we will comment on its relation with other definitions in the literature. The definition based on no-signalling is the following Now, let us consider the special case where E = I, and then discard system A on both sides of Eq. (4). We Then, Eq. (6) becomes where The question one may ask now is whether the two conditions A → U B and A U B are equivalent. The answer is negative, and surprisingly the counterexample comes from the simplest theory we have: classical theory. In classical theory system types are in correspondence with natural numbers n representing the number of distinct pure states of the system, e.g. for a bit the pure states are ψ_0 and ψ_1, thus n = 2. Parallel composition is simply given by the product rule: if A = m and B = n, then AB = mn. It will turn out to be useful to introduce the symbols n := {0, 1, ..., n − 1}. For the sake of simplicity, given x ∈ n, we will write x,y,z=0 where C ∈ [[4 → 4]]_1 is defined by the function g(x, y) = (y, y). It is then clear that condition (2) is violated, and thus A → K B . Indeed, in classical theory it is possible to copy information without disturbing, and thus it is also possible to interact with a system without disturbing. The C-not map K perfectly illustrates this situation: it is possible to extract information from B and leak it to A without affecting the state of system B (this is indeed the meaning of the no-signalling condition).
However, our definition of causal influence also accounts for the presence of a non-disturbing interaction. Looking at condition (4), the interpretation of the causal influence relation is clear: negating Eq. (4) amounts to stating that in order to replicate the effect of an intervention at A after the evolution through U one has to interact with system B. The difference between no-signalling and no causal influence is thus hidden, as far as we understand it from the example of classical theory, in the possibility of interacting without disturbing. We will come back to this observation in section 6, and show that this intuition can be partially turned into a general theorem.
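The classical C-not example can be checked by direct enumeration. The sketch below is only an illustration under stated assumptions: bits are represented as Python integers, system A is taken as the target and system B as the control of K, and the condition of no causal influence is read as "the conjugated map U (intervention ⊗ I_B) U^{-1} acts trivially on B". The helper names (`simulate_after`, `is_local_on_a`, and so on) are hypothetical.

```python
from itertools import product

BITS = (0, 1)

def cnot(x, y):
    """Classical C-not K: A (first bit) is the target, B (second bit) the control."""
    return (x ^ y, y)

cnot_inverse = cnot  # the C-not is its own inverse

def signals_from_a_to_b(u):
    """True if the B output of u depends on the A input."""
    return any(u(0, y)[1] != u(1, y)[1] for y in BITS)

def simulate_after(u, u_inverse, intervention_on_a):
    """Map applied after u that reproduces an intervention on A made before u."""
    def simulated(a, b):
        x, y = u_inverse(a, b)
        return u(intervention_on_a(x), y)
    return simulated

def is_local_on_a(g):
    """True if g(a, b) = (f(a), b) for some f, i.e. g acts trivially on B."""
    keeps_b = all(g(a, b)[1] == b for a, b in product(BITS, repeat=2))
    a_ignores_b = all(g(a, 0)[0] == g(a, 1)[0] for a in BITS)
    return keeps_b and a_ignores_b

def reset(x):
    """Discard the bit and prepare 0."""
    return 0

def flip(x):
    """Reversible bit flip."""
    return x ^ 1

after_reset = simulate_after(cnot, cnot_inverse, reset)
after_flip = simulate_after(cnot, cnot_inverse, flip)

print(signals_from_a_to_b(cnot))                                  # False
print([after_reset(a, b) for a, b in product(BITS, repeat=2)])    # [(0, 0), (1, 1), (0, 0), (1, 1)]
print(is_local_on_a(after_reset))                                 # False
print(is_local_on_a(after_flip))                                  # True
```

The reset intervention is conjugated into the map g(x, y) = (y, y): the value of B is never changed, yet reproducing the intervention after K requires reading B. The bit flip, by contrast, is conjugated into a local flip, which shows that a single intervention is not enough to test the condition.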

Conditions for no causal influence
In the present section we prove a relevant equivalent condition for A → U B , and then discuss some necessary conditions. For our purposes it is useful to introduce the following definition

Definition 3. Let U ∈ [[AB → A B ]] 1 be a reversible transformation. We define the reversible channelT
We state now a very useful and powerful lemma involving the above defined transformationT A (U ).

Lemma 1. Let U ∈ [[AB → A B ]], and A ∈ [[EA → EA]]. Then the following identity holds
The proof of the above result is provided in Appendix A. The implications of Eq. (13) reach far beyond the results of the present paper. Here we exploit it to prove the following key result.

Theorem 1. Let U ∈ [[AB → A B ]] 1 be a reversible transformation. Then A → U B if and only ifT
where Proof. The condition in Eq. (14) is clearly necessary, as it follows from Eq. (2) for the special case where E = A 1 A and A = S is the swap channel for systems A 1 and A. Let us now prove that the condition is sufficient. Consider the result of lemma 1, i.e. Eq. (13). Now, if condition (14) holds, the above Eq. (13) becomes , from which one can easily conclude that , namely condition (2) is satisfied.

Necessary conditions
We already proved that no causal influence implies no-signalling. Now we provide a second necessary condition for no causal influence.

Theorem 2. Let U ∈ [[AB → A B ]] 1 be a reversible transformation. If A → U B , then U can be decomposed as follows
Proof. Let us consider the equivalent condition (14) for no causal influence, and rewrite it in the following form Applying both sides of Eq. (16) to ψ ∈ [[A]] 1 and discarding A 1 on both sides by the deterministic effect e A1 , we obtain the thesis, with E = A , and The last result allows us to provide an alternative, simpler proof of the fact that A → U B implies A U B .
Proof. It is sufficient to consider the decomposition in Eq. (15), then discard A via e A . One immediately realises that condition (3) is satisfied, with The latter proof highlights the following chain of implications that constitutes the general scenario in an OPT, In the following sections, we will analyse cases where the above implications become equivalences.

Classical theory
We discussed some general aspects of the notion of causal influence in classical theory in the introductory section. Now we get back to classical theory in the light of the results that we proved, and will show that, in classical theory, one has the following equivalence.
Proof. Let U be reversible. The transformation U is defined, as from Eq. (9), by two functions f (x, y) and h(x, y), where x ∈ m and y ∈ n, provided that A = m and B = n. Let also A = l. Let us suppose that condition (3) holds. Then This implies that, for every x, y, and for every g such that p(g) > 0, one has g(y) = h(x, y). This implies that p(g) = δ g,g0 , and h(x, y) = g 0 (y). Thus and it is now easy to verify that condition (15) is satisfied with As a consequence, while condition (15) is equivalent to A U B , there are reversible transformations U such that the same condition is satisfied while A → U B .
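The gap between no-signalling and no causal influence in classical theory can be made quantitative for the smallest non-trivial case. The following sketch enumerates all reversible maps on two bits and counts how many satisfy each condition. It assumes that the criterion of Theorem 1 can be read, for deterministic classical maps, as "the swap between A and an ancillary copy A_1, conjugated by U, leaves B' untouched and does not read it"; the function names are hypothetical.

```python
from itertools import permutations, product

BITS = (0, 1)
PAIRS = list(product(BITS, repeat=2))

def reversible_maps():
    """Yield every reversible map on two bits as a dict (x, y) -> (x', y')."""
    for image in permutations(PAIRS):
        yield dict(zip(PAIRS, image))

def no_signalling_a_to_b(u):
    """The B output h(x, y) does not depend on x."""
    return all(u[(0, y)][1] == u[(1, y)][1] for y in BITS)

def no_causal_influence_a_to_b(u):
    """Assumed swap criterion: U S U^{-1} acts trivially on the B output."""
    inverse = {v: k for k, v in u.items()}
    table = {}
    for a1, a, b in product(BITS, repeat=3):
        x, y = inverse[(a, b)]          # undo U on the output pair
        a_out, b_out = u[(a1, y)]       # swap A with the ancilla, then redo U
        table[(a1, a, b)] = (x, a_out, b_out)
    keeps_b = all(out[2] == inp[2] for inp, out in table.items())
    ignores_b = all(
        table[(a1, a, 0)][:2] == table[(a1, a, 1)][:2]
        for a1, a in product(BITS, repeat=2)
    )
    return keeps_b and ignores_b

maps = list(reversible_maps())
ns = [u for u in maps if no_signalling_a_to_b(u)]
nci = [u for u in maps if no_causal_influence_a_to_b(u)]
cnot = {(x, y): (x ^ y, y) for x, y in PAIRS}

print(len(maps), len(ns), len(nci))  # 24 8 4
print(no_signalling_a_to_b(cnot), no_causal_influence_a_to_b(cnot))  # True False
```

Out of the 24 reversible maps, 8 are no-signalling from A to B', and only the 4 product maps among them also have no causal influence; the remaining 4 are controlled operations on A with B as control, of which the C-not is one.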

Quantum and Fermionic theory
In the case of Quantum Theory (QT) and Fermionic Theory (FT), it is possible to prove that the hierarchy of three conditions of Eq. (17) collapses into three equivalent conditions. That Eq. (15) is equivalent to A U B has been proved with various techniques in the literature [1,2,3,5,31]. We will now prove that condition A U B implies A → U B , which is the only missing link to close the circle of implications. The proof uses existence and uniqueness of purification, along with existence of a pure faithful state for every system. We recall here the definition of a faithful state for the convenience of the reader.

Definition 4. Let A ∈ Sys(Θ), and Ψ ∈ [[AA ]] 1 a pure state. We say that Ψ is faithful for A if the following mapping is injective
Let now Θ be a theory with unique purification and with a pure faithful state for every system A. Let also the theory be such that parallel composition of pure states is pure, a requirement often called atomicity of parallel state composition. Under these hypotheses one can prove that every test can be dilated to a reversible interaction of the system with an environment in a pure state, followed by an observation-test on the environment. The proof is not reported here, and can be found in [18,20], along with the proof of the uniqueness results. The precise statement follows.
We call the quadruple (B, D, η, U ) a reversible dilation of C .
One can prove further results about uniqueness of reversible dilations modulo reversible transformations on D (see Ref. [18]); however, we will consider here only the result discussed in the following subsection, which will turn out to be useful in the remainder, and which highlights the main difference between CT, discussed above, on the one hand, and QT and FT on the other hand.

No interaction without disturbance
We start by proving a theorem that holds in Quantum Theory, in Real Quantum Theory, in Fermionic Theory, as well as in any theory satisfying the hypotheses.

Theorem 5 (No interaction without disturbance). Let a theory Θ satisfy the same hypotheses as lemma 2. If a transformation C ∈ [[AB → CB]] 1 satisfies
there must exist a channel D ∈ [[A → C]] 1 such that Proof. The proof is rather straightforward, and invokes lemma 2, the existence of a pure faithful state, as well as essential uniqueness of purification. First of all, by lemma 2 there exists (E, F, η, Let Φ ∈ [[AB(AB) ]] 1 be a pure faithful state for AB. Then condition (21) implies .
By essential uniqueness of purification there must exist
As Φ is faithful, we conclude that Eq. (22) holds with
The above result holds under the hypotheses specified in the statement. However, it can be taken as a principle, which we call no interaction without disturbance. It holds as a theorem in QT, RQT, and FT. However, the results that we prove in the following hold under the general condition of no interaction without disturbance, which can be precisely stated as the requirement that the above theorem 5 holds.

Definition 5 (No interaction without disturbance).
We say that a theory Θ satisfies no interaction without disturbance if for every pair of composite systems AB and CB, and for every channel C ∈ [[AB → CB]], one has that, if the condition in Eq. (21) holds, then the channel C has the form given in Eq. (22).
Before proving the collapse of conditions, we need a further lemma.

Lemma 3. Let U ∈ [[AB → A B ]] 1 be reversible, and
Proof. The thesis follows straightforwardly after applying e A on both sides of the identity We are now ready to prove the main result of the present section.
Proof. Let us consider the following identity, following from the hypothesis of no-signalling from A to B Now, considering that by virtue of lemma 3 one has we can invoke definition 5 to conclude that there must exist The latter is precisely condition (14).
In conclusion, we have thus proved that in a theory where no interaction without disturbance holds, the two conditions of no-signalling and no causal influence are equivalent. This is then true, in particular, in theories where the purification property holds.

Interaction without disturbance
What can we say in the case of violation of no interaction without disturbance? In order to analyse this question, it is useful to prove the following result.

Theorem 7. Let A → U B, and suppose that
Proof. The hypothesis that A → U B can be expressed equivalently by Eq. (16). Combining the latter with Eq. (25), we have , and inverting U on both sides, and composing with an arbitrary deterministic state of A 1 , one has By reversibility of U , one has that also V has to be invertible. Finally, this implies the thesis.
As a consequence of the above result, if a theory Θ has interactions without disturbance, and one of those, say U ∈ [[AB → AB]], is reversible, by definition U satisfies Eq. (25) but not Eq. (26). Then, by virtue of theorem 7, it cannot hold that A → U B.

As a consequence, it is not true in

The T-process
(14), we observe that the (reversible) channel T_{A_i}(U) allows us to immediately identify the systems in N^+_{A_i}. Indeed, by definition of N^+_{A_i}, we have the following result , and consider the input system A_i. Then where X̄ denotes the complement of X in {A_1, A_2, ..., A_M}, and for every A_j ∈ N^+_{A_i} one cannot have In other words, N^+_{A_i} is the largest subset X of By the above arguments it is clear that . Moreover, since the swap operator S_{A_i} for systems A_i and Ã_i commutes with S_{A_j} for A_i ≠ A_j, and conjugation by a reversible transformation U ⊗ I preserves commutation, one has where we implicitly pad the transformations T_{A_k}(U) with identity transformations on suitable systems in order to make them act on the same system. We omit the identity transformations for the sake of a light notation.
We call the transformation T_{A_i}(U) the T-process of U relative to A_i. The advantage of defining the T-process is that it is easily calculated through Eq. (12), and it contains all the information needed to determine N^+_{A_i} through the condition given by lemma 4. Moreover, by a calculation identical to that of Eq. (13), one can easily show that if one considers the composite system A_i A_j, one has N^+_{A_i A_j} = N^+_{A_i} ∪ N^+_{A_j}, and T_{A_i A_j}(U) = T_{A_i}(U) T_{A_j}(U) = T_{A_j}(U) T_{A_i}(U). In other words, the T-process detects all causal influence relations system-wise, and this excludes activation of causal influence by nonlocal interventions.
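For classical reversible maps the T-process can be tabulated explicitly, which gives a simple way to compare the causal neighbourhood with the signalling neighbourhood. The sketch below is an illustration under assumptions: the three-bit map `u` is a hypothetical example, the T-process relative to input i is taken to be the swap of input i with an ancillary bit conjugated by U, and the causal neighbourhood is read as the set of outputs on which this map acts non-trivially (the output is either changed or read).

```python
from itertools import product

N = 3
CONFIGS = list(product((0, 1), repeat=N))

def u(x):
    """Hypothetical reversible map: the middle bit is added onto both neighbours."""
    return (x[0] ^ x[1], x[1], x[2] ^ x[1])

U = {x: u(x) for x in CONFIGS}
U_INV = {v: k for k, v in U.items()}

def t_process(i):
    """Table of U S_i U^{-1}, with keys and values of the form (ancilla, outputs)."""
    table = {}
    for anc, z in product((0, 1), CONFIGS):
        w = list(U_INV[z])
        w[i], anc_out = anc, w[i]       # swap input i with the ancilla
        table[(anc, z)] = (anc_out, U[tuple(w)])
    return table

def flipped(z, j):
    """Copy of z with bit j flipped."""
    return z[:j] + (z[j] ^ 1,) + z[j + 1:]

def acts_trivially_on(table, j):
    """True if the map leaves output j unchanged and never reads it."""
    others = [k for k in range(N) if k != j]
    for (anc, z), (anc_out, z_out) in table.items():
        if z_out[j] != z[j]:
            return False
        anc_alt, z_alt = table[(anc, flipped(z, j))]
        if anc_alt != anc_out or any(z_alt[k] != z_out[k] for k in others):
            return False
    return True

def causal_neighbourhood(i):
    table = t_process(i)
    return {j for j in range(N) if not acts_trivially_on(table, j)}

def signalling_neighbourhood(i):
    return {j for j in range(N) for x in CONFIGS if U[x][j] != U[flipped(x, i)][j]}

for i in range(N):
    print(i, signalling_neighbourhood(i), causal_neighbourhood(i))
```

For this map the outer bits signal only to themselves, while their causal neighbourhoods also contain the middle bit, in line with the claim that the causal neighbourhood can be strictly larger than the signalling one.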

Conclusion
In this study we analysed the notion of causal influence in the context of operational probabilistic theories. More precisely, we studied under what circumstances we can say that a reversible process involving two (or more) input systems and two (or more) output systems allows for causal influence from one of the inputs to one of the outputs. We started by generalising two alternative definitions taken from the literature on quantum information processing. In the general scenario, the two notions turned out to be different, and we adopted as a definition of causal influence the weaker one, taking signalling as a sufficient but not necessary condition for causal influence. We then discussed necessary and sufficient conditions and the related definition of T-process of a reversible transformation U relative to one of its input systems A. A special instance of T-process was used for quantum cellular automata in Ref. [32] for the purpose of decomposing a special family of QCAs in local gates.
While in Quantum Theory the two notions coincide, we discussed the case of classical theory, which provides an instructive example of the case where no-signalling is not sufficient for no causal influence. The analysis allowed us to identify an intermediate condition which is necessary for no causal influence and sufficient for no-signalling, which can be summarised by requiring that the process has to have the structure of a channel with memory. While in the classical case this condition coincides with no-signalling, and is thus strictly weaker than no causal influence, at the time of writing we do not have examples of theories where the no-signalling condition is strictly weaker than the memory channel structure.
We also identified a feature of a theory that is sufficient for the three conditions to become equivalent: this feature was introduced as no interaction without disturbance. Its failure in the case of a reversible process determines causal influence without signalling. While the relation of no interaction without disturbance with no information without disturbance [25] is beyond the scope of the present paper, we plan to study it in the near future.
Interestingly, the example that we spotted of a reversible process that allows for interaction without disturbance is the classical controlled-not. This gate allows indeed the agent that controls the target to copy the content of the control without affecting it. We deem such an intervention to have a causal influence on the control, even if it does not affect its state. Indeed, it is clear that one cannot copy the state of a system without an effective interaction. Secondarily, we claim that e.g. copying information from a system has a causal influence on it, even though its local state might be unaffected, as one can figure out thinking of many examples in everyday life. The influence consists in creating correlations between the system and its environment, that were absent before occurrence of the interaction.
The definition of causal influence that we propose here has a consequence on the notion of neighbourhood of a cell in a cellular automaton. Indeed, while the definition based on signalling causes puzzling questions in the case of classical cellular automata (such as seemingly local cellular automata whose inverse is non-local), a definition based on causal influence sheds some light on the mentioned issues, showing that the actual neighbourhood can be much larger than the signalling neighbourhood. This phenomenon lies at the basis of the effect that was noticed in Ref. [33], where the "quantised" version of a classical cellular automaton was observed to produce a "speedup" in the propagation of information.
In the perspective of our definition, the effect is due to the fact that the process of "quantising" a classical gate turns its neighbourhood of causal influence (which might be strictly larger than its signalling neighbourhood) into the neighbourhood of the quantised version (in the quantum case there is no distinction between the two neighbourhoods). As an example, the quantum controlled-not exhibits a kickback on the control, highlighting that the latter is subject to a causal influence from the target.
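The kickback of the quantum controlled-not can be verified with a few lines of matrix algebra. The sketch below is a minimal illustration, assuming the usual matrix representation with the control as the first qubit and the target as the second, and reading an intervention as a unitary that is conjugated by the gate; the helpers `matmul`, `kron` and `conjugate` are written out in plain Python and are not taken from any library.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

# Control = first qubit, target = second qubit; basis order |00>, |01>, |10>, |11>.
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

def conjugate(op):
    """CNOT op CNOT^{-1}; the gate is real, symmetric and self-inverse."""
    return matmul(matmul(CNOT, op), CNOT)

z_on_target = conjugate(kron(I2, Z))
x_on_target = conjugate(kron(I2, X))

print(z_on_target == kron(Z, Z))   # True: a Z on the target also acts on the control
print(x_on_target == kron(I2, X))  # True: an X on the target stays on the target
```

A phase flip applied to the target before the gate is equivalent to a phase flip on both qubits after it, so the intervention on the target cannot be reproduced after the gate without acting on the control.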
Proof. By definition ofT A (U ), one has where for the third equality we used property 3d of parallel composition in section 2.