Interference as an information-theoretic game

The double slit experiment provides a clear demarcation between classical and quantum theory, while multi-slit experiments demarcate quantum and higher-order interference theories. In this work we show that these experiments belong to a broader class of processes, which can be formulated as information-processing tasks, providing a clear cut between classical, quantum and higher-order theories. The tasks involve two parties and communication between them with the goal of winning certain parity games. We show that the order of interference is in one-to-one correspondence with the parity order of these games. Furthermore, we prove the order of interference to be additive under composition of systems both in classical and quantum theory. The latter result can be used as a (semi)device-independent witness of the number of particles in the quantum setting. Finally, we extend our game formulation within the generalized probabilistic framework and prove that tomographic locality implies the additivity of the order of interference under composition. These results can be important for the identification of the information-theoretic principles behind second-order interference in quantum theory.


I. INTRODUCTION
As Richard Feynman famously put it, "the double slit experiment is absolutely impossible to explain in any classical way and has in it the heart of quantum mechanics. In reality, it contains the only mystery" [1]. Indeed, the most common way of introducing quantum theory and its basic tenets is via the double-slit experiment, in which a single particle is shown to produce an interference pattern when sent through two slits, in contrast to classical theory in which this effect is missing. Much later, Sorkin [2] analyzed multi-slit experiments (i.e. generalizations of the double slit experiment to three or more slits) and noticed that quantum mechanics exhibits second-order interference only, meaning that any measurement pattern produced by a quantum system is reducible to the combination of double-slit interferences. The latter served as a motivation for introducing higher-order interference theories which are defined with respect to the order of interference produced in generalized multi-slit experiments [3]. Such theories are usually formulated within the framework of generalized probabilistic theories (GPTs) [4,5], with quantum theory being only a particular example. This further motivated the search for a set of intuitive physical principles which could explain why Nature should exhibit at most second-order interference [6][7][8][9]. In this work we study higher-order interference from the information-theoretic perspective. Firstly, we reconsider the definitions of multi-slit experiments and Sorkin's classification and notice that: (a) the framework is defined only for single particles (or single systems), (b) the interference order is defined with respect to simple operations of blocking/opening the slits, and (c) the (final) measurement refers exclusively to intensity measurements (average number of particles at a particular location on the screen).
* sebastian.horvat@univie.ac.at † borivoje.dakic@univie.ac.at
We proceed by generalizing this setting to an arbitrary number of particles (systems), arbitrary set of (local) operations and arbitrary final measurements. The framework for higher-order interference is formulated as an information-theoretic game where we analyse generic scenarios involving classical and quantum systems. Our generalization involves many particles (systems) and their ability to win certain parity games. The order of these games and their winning probabilities are directly related to the order of interference (in Sorkin's classification) and to the number of systems involved. This generalization offers a shift in the perspective: higher order interference theories are not defined by which phenomena are allowed (e.g. by the structure of interference patterns), but by which tasks can or cannot be accomplished within the theory (i.e. their information-processing capacity). Furthermore, our study provides a direct relation between the order of interference and number of systems used in the protocol, and can thus provide a (semi) device-independent way of witnessing the number of particles (systems) present in the process.
In the final section we provide the information-theoretic formulation within the GPT formalism and we study how the order of interference behaves under the composition of systems. We construct a lower bound on the interference order which is additive under composition. In classical and quantum theory, this lower bound coincides with the upper bound, which in turn shows an interesting connection between the order of interference and the composition law. Finally, we prove that in generic local-tomographic theories [10], with the restriction to single-system operations, the order of interference is additive. Together with an information-theoretic perspective, our findings can be seen in light of paving an alternative way towards understanding the physical principles behind the order of interference of quantum theory: we suspect that an important clue might be provided by the composition of systems and by the tensor-product structure. The latter would thus contribute to the operational and physical understanding of quantum theory [10][11][12][13] and provide indications for potential developments of post-quantum theories. Moreover, our work deepens the connection between interference and information processing which has already been alluded to in various contexts involving two-way communication [14][15][16], information speed [17], quantum acausal processes [18], superposition of orders [19] and directions [20], quantum combs [21], quantum switch [22] and quantum causal models [23].
Figure 1. Alice and Bob are located at opposite sides of a pierced plate with two slits, each of which can be either open or blocked depending on Alice's input bits. Alice sends her particle through the slits towards Bob, who, upon receiving the particle, generates an output bit b.

II. INFORMATION-THEORETIC FORMULATION: PARITY GAMES
In the standard double-slit experiment a particle is sent on a plate pierced by two parallel slits. After passing through the plate, the particle can be detected on a screen. Each slit can be either open (which we denote with 0), or blocked (which we denote with 1). The figure of merit is the interference term
$$I_2 = P_{00} - P_{01} - P_{10} + P_{11}, \tag{1}$$
where $P_{x_1 x_2}$ denotes the probability of detecting the particle at a point on the screen given that the slits are in states $x_1$ and $x_2$. Notice that we explicitly included the term which corresponds to the situation in which both slits are closed, despite it being necessarily zero (i.e. $P_{11} = 0$). Classical mechanics predicts $I_2 = 0$, while quantum theory allows the particle to be in spatial superposition, thereby enabling the possibility of generating a non-vanishing $I_2$.
In order to reformulate this experiment as an information-theoretic task, let us consider the scenario involving two parties, Alice and Bob, located at opposite sides of the plate, as shown in Figure 1. Alice possesses a single particle that she can send towards Bob and has control of the slits, i.e. she can decide whether to block them or not. On the other side, Bob receives (or not) the particle, performs an arbitrary measurement and outputs a bit $b \in \{0, 1\}$. We introduce the redefined second-order interference term as follows
$$\tilde{I}_2 = \frac{1}{4} \sum_{x_1, x_2} P(x_1 \oplus x_2 \,|\, x_1 x_2) - \frac{1}{2}, \tag{2}$$
where $P(b|x_1 x_2)$ is the probability that Bob outputs $b$ given that the two slits are in states $x_1$ and $x_2$. Notice that the term corresponding to both slits being closed can now be non-zero, since Bob's measurement is arbitrary, i.e. it does not necessarily refer to measuring the probability of the particle impinging on a point of the screen (intensity measurement). The probability $P(b|x_1 x_2)$ is thus a generalization of $P_{x_1 x_2}$ from (1) and reduces to the latter only in the case in which Bob performs the intensity measurement. The redefined interference term $\tilde{I}_2$ measures the probability of successfully accomplishing the following task (parity game), offset by the value 1/2 achieved by a random guess: (a) in each run, Alice prepares the slits in state $\vec{x} \equiv \{x_1, x_2\}$ and sends her particle through the slits towards Bob; (b) in order to complete the task, Bob must output the parity $s_x = x_1 \oplus x_2$. Bob does not have any prior information about Alice's inputs, which is why a uniform average is taken. Notice that expression (2) is formulated in a device-independent way [24], relating only inputs and outputs, without any mention of their underlying physical realization. It is therefore natural to generalize the scenario by replacing slit operations with generic black boxes, which implement arbitrary local operations depending on their inputs. Alice can thus perform any operation (e.g. in quantum theory, general completely-positive maps): the only restriction stems from the operations being local.
Moreover, instead of constraining Alice to sending only single particles (as it is also the case for standard interference experiments), we can generalize her resources to an arbitrary number of systems and analyse how the winning probability depends on the number of systems used in the process (as we will see in the next section, the game can be won even in classical theory, by using two particles). Additionally, the particles/systems sent by Alice can have any internal structure (e.g. spin) that can be potentially accessed by the black boxes.
Analogously to the generalization of the standard double-slit experiment to multi-slit experiments, we can further generalize the scenario to an arbitrary number of boxes m. In this case, Alice sends her resources, which consist of k systems, towards Bob, whose task is to output the overall parity of the boxes' inputs, as depicted in Figure 2. The figure of merit is thus
$$\tilde{I}_m(k) = \frac{1}{2^m} \sum_{\vec{x}} P(s_x | \vec{x}) - \frac{1}{2}, \tag{3}$$
where $\vec{x} = \{x_1, ..., x_m\}$ are input bits encoded in the m boxes and $s_x \equiv \oplus_{i=1}^{m} x_i$ is the overall parity. Since $P(1|\vec{x}) = 1 - P(0|\vec{x})$, expression (3) is equal to $2^{-m} \sum_{\vec{x}} (-1)^{s_x} P(0|\vec{x})$. In the case in which Alice is constrained to sending only single particles, the boxes are implemented as slits and Bob's final operation consists of intensity measurements, the generalized higher-order interference term $\tilde{I}_m(k)$ reduces to the standard higher-order term defined in [2] (up to normalization):
$$I_m = \sum_{\vec{x}} (-1)^{\sum_j x_j} P_{\vec{x}}, \tag{4}$$
where $x_j = 0 (1)$ corresponds to the j-th slit being open (closed) and $P_{\vec{x}}$ is the probability of detecting the particle on the screen given that the slits are in state $\vec{x}$.
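The binary game term can be evaluated directly from a table of Bob's output probabilities. The following is a minimal sketch, assuming the normalization in which the term equals the winning probability minus 1/2 (so that perfect play gives 1/2); the function names and the two example strategies are illustrative and not taken from the original text. It also checks numerically that the game form and the signed-sum form over $P(0|\vec{x})$ agree.

```python
from itertools import product

def win_probability(p0, m):
    """Average probability that Bob outputs the parity of the m input bits.
    p0 maps each input tuple x to P(b=0 | x); inputs are uniformly distributed."""
    total = 0.0
    for x in product((0, 1), repeat=m):
        parity = sum(x) % 2
        total += p0[x] if parity == 0 else 1.0 - p0[x]
    return total / 2**m

def game_term(p0, m):
    """Game form (assumed normalization): winning probability minus 1/2."""
    return win_probability(p0, m) - 0.5

def signed_sum_term(p0, m):
    """Equivalent form: (1/2^m) * sum_x (-1)^{s_x} P(0|x)."""
    xs = product((0, 1), repeat=m)
    return sum((-1) ** (sum(x) % 2) * p0[x] for x in xs) / 2**m

# Hypothetical strategy 1: Bob always outputs the true parity of all three inputs.
perfect = {x: 1.0 if sum(x) % 2 == 0 else 0.0 for x in product((0, 1), repeat=3)}
# Hypothetical strategy 2: Bob only learns x_1 and x_2 and outputs their parity.
partial = {x: 1.0 if (x[0] ^ x[1]) == 0 else 0.0 for x in product((0, 1), repeat=3)}

print(game_term(perfect, 3), game_term(partial, 3))
```

The second strategy illustrates the "global" character of the task: missing a single input already reduces Bob to a random guess.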
In contrast to the standard definition of m-th order interference theories, which involves the structure of interference patterns produced by single particles in m-slit experiments, our information-theoretic extension provides a definition in terms of the probability of winning the parity game involving m boxes by using a finite amount of resources. In addition, the latter formulation enables an analysis of the relation between the order of interference and the number of systems involved in the process (e.g., as we will see in the next section, the m-th order parity game can be won using m classical particles). We therefore introduce the characterization of higher-order theories in terms of functions n(k), where n refers to the maximum order of interference that can be exhibited using k systems, i.e. $\tilde{I}_{m=n(k)}(k) \neq 0$ and $\tilde{I}_{m>n(k)}(k) = 0$.
To recapitulate, the modified multi-slit experiment consists of the following generalizations: (a) instead of one particle, Alice can possess arbitrarily many particles/systems of arbitrary internal structure, (b) the slits are replaced by black boxes which implement generic local operations depending on Alice's input bits, (c) the screen is replaced by Bob who is supposed to generate an output corresponding to the overall parity of the inputs encoded by Alice. Finally, we can generalize Alice's input bits $\{x_1, ..., x_m\}$ to arbitrary dits, i.e. elements of a set with cardinality d. In this case we introduce a class of games in which Bob is supposed to decode one of the weighted sums modulo d of the inputs, i.e. $s^{(d)}_{x,\vec{\nu}} \equiv (\sum_i \nu_i x_i) \bmod d$, where $\vec{\nu} = (\nu_1, ..., \nu_m)$ is a dit-string of weights. Given that Alice is using k systems, the generalized interference terms are then
$$\tilde{I}^{(d)}_{m,\vec{\nu},f}(k) = \frac{1}{d^m} \sum_{\vec{x}} P\big(f(s^{(d)}_{x,\vec{\nu}}) \,\big|\, \vec{x}\big) - \frac{1}{d}, \tag{5}$$
which are defined for all reversible functions f that map dits into dits. Operationally, these functions take into account the potential relabelling of Bob's outputs. Notice that for d = 2 there was no need for specifying this, since relabelling Bob's output in Eq. (3) introduces only a minus sign. The games defined in (5) offer a definition of higher order interference theories for arbitrary d. Namely, we define an n(k)-theory to satisfy the following: (i) all processes produce $\tilde{I}^{(d)}_{m>n(k),\vec{\nu},f}(k) = 0$ for all f and all $\vec{\nu}$, and (ii) for each dit-string $\vec{\nu}$, there exists at least one process that produces $\tilde{I}^{(d)}_{m=n(k),\vec{\nu},f}(k) \neq 0$. Notice that the (non)existence of the required process for a specific labelling of the outputs fixed by f implies the (non)existence of the analogous process for any other labelling f' (since Bob can always relabel his outcomes independently of the process). Therefore, throughout this manuscript, we will only construct proofs of existence of processes which exhibit $\tilde{I}^{(d)}_{m=n(k),\vec{\nu},f}(k) \neq 0$ for f fixed to be the identity map (i.e. no final relabelling) and we will thereby omit the index f. Moreover, whenever we omit the index $\vec{\nu}$, we refer to the unit dit-string, i.e. $\nu_i = 1, \forall i$. Therefore, the term $\tilde{I}^{(d)}_m(k)$ will correspond to the specific game in which Bob's goal is to output $s_x = (\sum_i x_i) \bmod d$. As we will show later, all conclusions regarding the order of interference of classical and quantum theory hold for arbitrary $d \geq 2$.
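The modulo games can be evaluated in the same way as the binary ones. The sketch below assumes that the game term is the average winning probability minus the random-guess value 1/d; the helper `modulo_game_term` and the two example strategies are hypothetical and serve only to illustrate the definition.

```python
from itertools import product

def modulo_game_term(p, d, nu, relabel=None):
    """Winning probability of the modulo-d game minus 1/d (assumed normalization).
    p[x] is a list of length d holding P(b|x); nu is the dit-string of weights;
    relabel is an optional reversible map on dits (identity if omitted)."""
    m = len(nu)
    f = relabel if relabel is not None else (lambda s: s)
    win = 0.0
    for x in product(range(d), repeat=m):
        target = sum(n * xi for n, xi in zip(nu, x)) % d
        win += p[x][f(target)]
    return win / d**m - 1.0 / d

d, nu = 3, (1, 1)
inputs = list(product(range(d), repeat=len(nu)))

# Hypothetical strategy: Bob always outputs the correct weighted sum.
perfect = {x: [1.0 if b == (x[0] + x[1]) % d else 0.0 for b in range(d)] for x in inputs}
# Hypothetical strategy: Bob only learns x_1 and outputs it.
one_input = {x: [1.0 if b == x[0] else 0.0 for b in range(d)] for x in inputs}

print(modulo_game_term(perfect, d, nu), modulo_game_term(one_input, d, nu))
```

As in the binary case, knowing all inputs but one gives no advantage over a random guess.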
Winning the parity games can be seen as a truly "global" task, in the sense that Bob needs to produce an output that depends on all the local inputs in each run. For suppose that Alice manages to communicate to Bob all but one input, say the j-th one: in this case, the overall sum $s_x = (\sum_i x_i) \bmod d$ is completely unknown to Bob. The same conclusion holds if Alice omits to send more inputs or even if she does not send any information to Bob. Therefore, the interference term $\tilde{I}^{(d)}_{m,\vec{\nu},f}(k)$ is zero irrespective of whether Bob lacks knowledge about one or about all inputs. In the subsequent sections, we are going to focus on classical and quantum theory and investigate their power to win the parity games. We will show that the order of interference n(k) satisfies the relation
$$n(k) = k\, n(1),$$
where n(1) is the order of single-system interference (for classical theory, n(1) = 1, while for quantum theory, n(1) = 2) and n(k) is the interference order achievable using k systems. Moreover, we will dwell on possible generalizations of the latter relation to generic higher-order-interference theories within the GPT framework. Before proceeding further, we will first show an important mathematical property that relates the order of interference to the algebraic order of the probability distributions $P(b|\vec{x})$.

III. ALGEBRAIC ORDER OF THE PROBABILITY DISTRIBUTIONS
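In this section we are going to show that any distribution which does not exhibit higher than n-th order interference can be written as a linear combination of functions of at most n different inputs, or, in other words, that the algebraic order of $P(b|\vec{x})$ is at most n. Here we will consider the d = 2 case, while the general case is addressed in Appendix A. Let us consider the parity game involving m boxes and assume that Alice uses one system of order n < m. By definition, this means that the interference terms (3) vanish for all parity games involving more than n boxes, which can be written in a concise form as follows:
$$\sum_{\vec{x}} (-1)^{\sum_j \sigma_j x_j} P(0|\vec{x}) = 0, \quad \forall \vec{\sigma} \ \text{with} \ \sum_j \sigma_j > n. \tag{7}$$
For compactness, we introduced the m-component bit string $\vec{\sigma}$, that specifies which interference term the latter equation refers to: if $\sigma_j = 1$ for $j = i_1, ..., i_l$, the equation states that the order of interference involving boxes $i_1, ..., i_l$ is equal to zero (l can be any integer between n + 1 and m). Let us regard $P(0|\vec{x})$ as one component of a vector $\vec{P}$ in a $2^m$-dimensional vector space formed by the tensor product of m two-dimensional spaces, i.e. $P(0|\vec{x}) = (e_{x_1} \otimes ... \otimes e_{x_m}) \cdot \vec{P}$, where $e_{x_j = 0,1}$ span the j-th two-dimensional space. Equations (7) then imply:
$$\lambda_{\vec{\sigma}} \equiv (f_{\sigma_1} \otimes ... \otimes f_{\sigma_m}) \cdot \vec{P} = 0, \quad \forall \vec{\sigma} \ \text{with} \ \sum_j \sigma_j > n, \tag{8}$$
where we introduced the rotated vectors
$$f_{\sigma_j} \equiv \frac{1}{\sqrt{2}} \left( e_0 + (-1)^{\sigma_j} e_1 \right).$$
The terms $\lambda_{\vec{\sigma}}$ are components of the vector $\vec{P}$ in the new basis spanned by the rotated vectors $f_{\sigma_j}$. Equation (8) then states that the components $\lambda_{\vec{\sigma}}$ are zero if $\sum_j \sigma_j > n$. The probabilities $P(0|\vec{x})$ can thus be expressed as
$$P(0|\vec{x}) = \sum_{\vec{\sigma}:\, \sum_j \sigma_j \leq n} \lambda_{\vec{\sigma}} \prod_{i=1}^{m} \left( e_{x_i} \cdot f_{\sigma_i} \right) = 2^{-m/2} \sum_{\vec{\sigma}:\, \sum_j \sigma_j \leq n} \lambda_{\vec{\sigma}} (-1)^{\sum_i \sigma_i x_i},$$
where we used $e_{x_i} \cdot f_{\sigma_i} = 2^{-1/2} (-1)^{\sigma_i x_i}$. Therefore, $P(0|\vec{x})$ is a linear combination of functions of at most n different inputs:
$$P(0|\vec{x}) = \sum_{\vec{\sigma}:\, \sum_j \sigma_j \leq n} c_{\vec{\sigma}}\, g_{\vec{\sigma}}(\vec{x}), \tag{12}$$
where we introduced the functions $g_{\vec{\sigma}}(\vec{x}) \equiv (-1)^{\sum_i \sigma_i x_i}$, each of which depends only on the inputs $x_i$ with $\sigma_i = 1$, and the coefficients $c_{\vec{\sigma}} \equiv 2^{-m/2} \lambda_{\vec{\sigma}}$. Next, in order to tackle the problem for arbitrary d, for each set of generalized interference terms $\tilde{I}^{(d)}_{m,\vec{\nu},f}$ we introduce the dual terms
$$\tilde{J}^{(d)}_{m,\vec{\nu},b} \equiv \frac{1}{d^m} \sum_{\vec{x}} \omega_d^{\sum_i \nu_i x_i} P(b|\vec{x}),$$
where $\omega_d = e^{i 2\pi/d}$ is the d-th root of unity. For the d = 2 case, the dual terms coincide with the ones defined in terms of the game formulation (3); however, this is not the case for d > 2. Nevertheless, in Appendix A we show that the generalized interference term $\tilde{I}^{(d)}_{m,\vec{\nu},f}$ from (5) vanishes for all relabellings f and all dit-strings $\vec{\nu}$ if and only if $\tilde{J}^{(d)}_{m,\vec{\nu},b} = 0$ for all b = 0, ..., d − 1 and for all $\vec{\nu}$. This allows us to characterize the order of interference by analysing the behaviour of its dual, which can often be more tractable. Indeed, by this method we show in Appendix B that the conclusion derived in (12) holds for arbitrary d, i.e. that any distribution that exhibits at most m-th order interference can be written as a linear sum of functions which depend on at most m dits.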
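The change of basis used in this section is, for d = 2, a Walsh-Hadamard transform of the table $P(0|\vec{x})$. The following sketch computes the transformed components (normalized here by $2^{-m}$ rather than $2^{-m/2}$, which does not affect which of them vanish) and reads off the algebraic order as the largest Hamming weight of a bit string with a non-zero component. The function names and the example distribution are illustrative assumptions.

```python
from itertools import product

def walsh_coefficients(p0, m):
    """Components proportional to sum_x (-1)^{sigma.x} P(0|x), one per bit string sigma."""
    xs = list(product((0, 1), repeat=m))
    return {
        sigma: sum((-1) ** sum(s * xi for s, xi in zip(sigma, x)) * p0[x] for x in xs) / 2**m
        for sigma in xs
    }

def algebraic_order(p0, m, tol=1e-12):
    """Largest Hamming weight of sigma whose component does not vanish."""
    coeffs = walsh_coefficients(p0, m)
    return max((sum(s) for s, c in coeffs.items() if abs(c) > tol), default=0)

# Hypothetical distribution on three boxes built from a function of (x_1, x_2)
# and a function of x_3 only, hence of algebraic order 2.
order_two = {
    x: 0.5 + 0.25 * (-1) ** (x[0] + x[1]) + 0.1 * (-1) ** x[2]
    for x in product((0, 1), repeat=3)
}
# A Bob who always outputs the exact parity of three inputs has order 3.
order_three = {x: 1.0 if sum(x) % 2 == 0 else 0.0 for x in product((0, 1), repeat=3)}

print(algebraic_order(order_two, 3), algebraic_order(order_three, 3))
```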

IV. CLASSICAL RESOURCES
Now we are going to analyse the generalized interference terms achievable within classical theory. Single systems. For a start, let us focus on the simplest scenario, i.e. the generalization of the double slit experiment, in which Alice sends a single particle through two boxes towards Bob (for now we stick to binary inputs, i.e. d = 2). Recall that the two boxes can implement arbitrary local operations (not only blocking/opening the slits) labelled by $x_1$ and $x_2$. If the particle is classical, i.e. has a definite trajectory, Bob's output can depend on the state of at most one box and can be decomposed as follows:
$$P(b|x_1 x_2) = q_1 P_1(b|x_1) + q_2 P_2(b|x_2),$$
where $q_{1,2}$ are probabilities and $P_{1,2}(b|x_{1,2})$ each depend on the state of at most one box. The modified interference term therefore vanishes:
$$\tilde{I}_2(k=1) = \frac{1}{4} \sum_{x_1, x_2} \left[ q_1 P_1(x_1 \oplus x_2|x_1) + q_2 P_2(x_1 \oplus x_2|x_2) \right] - \frac{1}{2} = 0,$$
since for fixed $x_1$ the sum over $x_2$ runs over both values of the output, so that $\sum_{x_2} P_1(x_1 \oplus x_2|x_1) = 1$ (and analogously for $P_2$). Intuitively, the knowledge about the state of one box does not increase the probability of correctly guessing the inputs' parity with respect to a random guess. The same reasoning holds for arbitrary d, implying that $\tilde{I}^{(d)}_2(k = 1) = 0$ for one classical particle.
Multiple systems. Now, what if Alice possesses two classical particles? Let us for the moment assume that the boxes are implemented by slits (as in the original double-slit experiment) and that Alice in each run sends deterministically one particle towards the first slit and the other towards the second slit. If the parity of the slits' states is $x_1 \oplus x_2 = 0$, then Bob either receives both particles on his side or he receives no particles at all; alternatively, if the parity is 1, Bob receives exactly one particle. Thus, by simply counting the number of received particles, Bob can in each run determine the inputs' parity and can perfectly accomplish the required task, thereby generating $\tilde{I}^{(2)}_2(k = 2) = 1/2$. On the other hand, the standard interference definition (1) remains null also for two particles, since the average number of particles received by Bob (or impinging on the screen) is equal to 1 regardless of the inputs' parity. This shows the difference between the standard formulation and our game formulation, as the latter allows Bob to measure coincidences, and not only the average particle number (i.e. intensity). We proceed by analysing the fully general scenario with m boxes that implement arbitrary local transformations and assume that Alice's resources consist of k classical objects, be it particles, conglomerates of particles or localized wave packets. Since each classical object follows a definite trajectory, Alice's resources cannot interact with more than k boxes, which means that Bob's output can depend on at most k inputs: if k < m, Bob's output is equivalent to a random guess and $\tilde{I}^{(d)}_{m,\vec{\nu},f}(k) = 0$. Therefore, classical theory satisfies n(k) = k, meaning that k classical systems can exhibit at most k-th order interference.
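The counting strategy with classical particles can be checked by brute-force enumeration. The sketch below is a simplified model, not the authors' construction: one classical particle is sent at each of the m slits, a particle is transmitted if and only if its slit is open, and the "intensity" is taken to be the total number of particles reaching Bob. All function names are illustrative.

```python
from itertools import product

def count_transmitted(x):
    """Particles reaching Bob: slit j is open for x[j] == 0 and blocked for x[j] == 1."""
    return sum(1 for xj in x if xj == 0)

def bob_guess(n_received, m):
    """Bob infers the parity of the inputs from the number of missing particles."""
    return (m - n_received) % 2

def coincidence_game_term(m):
    """Game term when Bob counts particles (winning probability minus 1/2)."""
    wins = sum(
        bob_guess(count_transmitted(x), m) == sum(x) % 2
        for x in product((0, 1), repeat=m)
    )
    return wins / 2**m - 0.5

def intensity_term(m):
    """Standard Sorkin-type signed sum built from the particle number only."""
    return sum((-1) ** sum(x) * count_transmitted(x) for x in product((0, 1), repeat=m))

print(coincidence_game_term(2), intensity_term(2))
```

In this model the coincidence-based game is won perfectly with m particles, while the intensity-based term stays at zero, in line with the distinction drawn in the text.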

V. QUANTUM RESOURCES
In contrast to classical mechanics, quantum theory allows spatial superpositions of physical objects, which can generate a non-zero interference term $I_2$, even with a single particle. On the other hand, higher order interference terms $I_{j>2}$ defined in multi-slit experiments remain null even in quantum theory (see for instance [2]). In this section we show that analogous statements hold for the generalized interference terms $\tilde{I}^{(d)}_{m,\vec{\nu},f}(k = 1)$ and provide an extension to more particles (systems). Let us first look into the simplest case, i.e. one particle and two boxes.

V.I. Two boxes
We start by considering the case d = 2. Let us suppose that Alice possesses a single quantum particle and sends it in spatial superposition towards the two boxes, which implement binary inputs. The quantum state is given by
$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |1\rangle + |2\rangle \right),$$
where $|1\rangle$ and $|2\rangle$ are states corresponding to the two trajectories. Next, suppose that each box interacts with the particle in a simple way, e.g. by providing a local phase-shift $\phi_i = x_i \pi$. After passing through the boxes, the state is
$$|\psi_{x_1 x_2}\rangle = \frac{1}{\sqrt{2}} \left( (-1)^{x_1} |1\rangle + (-1)^{x_2} |2\rangle \right).$$
Therefore, up to a global sign, Bob receives the particle in the following states
$$|\psi_{\pm}\rangle = \frac{1}{\sqrt{2}} \left( |1\rangle \pm |2\rangle \right),$$
with the plus sign for $x_1 \oplus x_2 = 0$ and the minus sign for $x_1 \oplus x_2 = 1$. These states are orthogonal and thus perfectly distinguishable, thereby enabling Bob to deterministically decode the parity $x_1 \oplus x_2$ and to produce $\tilde{I}_2(k = 1) = 1/2$. The previous result holds for binary inputs; now suppose that the parties are playing the modulo game (5) specified by the dit string $\vec{\nu} = (\nu_1, \nu_2)$, i.e. Bob's goal is to output $(\nu_1 x_1 + \nu_2 x_2) \bmod d$ (we assume there is no final relabelling, i.e. that f is the identity; as we already argued, all other cases follow automatically). The players can employ the following strategy. For inputs $(x_1, x_2) \in \{(0, 0), (0, 1), (1, 0)\}$, Alice uses the same encoding as in the binary case (i.e. by applying local phases $x_i \pi$); for any other combination of inputs, Alice does not send the particle. On the other side, Bob performs the same measurement as in the binary case. If he receives the particle, he can infer the parity $x_1 \oplus x_2$ with unit probability. If the parity is zero, Bob outputs 0, since $\nu_1 x_1 + \nu_2 x_2 = 0$. If the parity is one, then the desired output is either $\nu_1$ or $\nu_2$ and Bob can thus produce the correct outcome with 50% probability. On the other hand, if he does not receive the particle, he outputs a random dit. The interference term is then
$$\tilde{I}^{(d)}_{2,\vec{\nu}}(k=1) = \frac{1}{d^2} \left[ 1 + \frac{1}{2} + \frac{1}{2} + \frac{d^2 - 3}{d} \right] - \frac{1}{d} = \frac{2d - 3}{d^3} > 0.$$
We have therefore shown that a single quantum particle can be used to exhibit second order interference for any d.
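The binary two-box protocol can be verified with a two-component state vector. The sketch below assumes that Bob measures in the basis $(|1\rangle \pm |2\rangle)/\sqrt{2}$ and outputs 0 for the plus outcome and 1 for the minus outcome; the function names are illustrative, and the game term is taken as the winning probability minus 1/2.

```python
import cmath
import math
from itertools import product

def encoded_state(x1, x2):
    """(|1> + |2>)/sqrt(2) after the local phase shifts phi_i = x_i * pi."""
    a = 1 / math.sqrt(2)
    return (a * cmath.exp(1j * math.pi * x1), a * cmath.exp(1j * math.pi * x2))

def overlap(u, v):
    """Inner product <u|v> of two path-state vectors."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

PLUS = (1 / math.sqrt(2), 1 / math.sqrt(2))
MINUS = (1 / math.sqrt(2), -1 / math.sqrt(2))

def prob_output(b, x1, x2):
    """Born-rule probability that Bob's measurement yields b (0 for +, 1 for -)."""
    basis = PLUS if b == 0 else MINUS
    return abs(overlap(basis, encoded_state(x1, x2))) ** 2

def quantum_game_term():
    """Winning probability of the two-box parity game minus 1/2."""
    win = sum(prob_output(x1 ^ x2, x1, x2) for x1, x2 in product((0, 1), repeat=2))
    return win / 4 - 0.5

print(quantum_game_term())
```

States with opposite parity are orthogonal, so Bob's measurement decodes the parity in every run.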

V.II. General case
Here we analyse the fully general scenario involving arbitrary local operations and arbitrary measurements. We start by constraining Alice's resources to a single quantum system and afterwards we extend our considerations to the case of multiple systems. Single systems. For a start, let us fix the resources to one quantum system, without restricting its internal degrees of freedom. The resource can thus be an electron, an atom or any localized quantum system that can be prepared in coherent spatial superposition using a beam splitter or some more sophisticated mechanism. The local operations implemented by the m boxes are completely arbitrary, examples being local unitary operators and CP maps acting on the internal degrees of freedom of the system. The most general state Alice can prepare is thus given by
$$\rho_0 = \sum_{i,j,k,l} c_{ijkl}\, |i\rangle\langle j| \otimes |\phi_k\rangle\langle\phi_l|,$$
where $\{|i\rangle, i = 1, ..., m\}$ denote states representing defined paths directed towards the m boxes, while $\{|\phi_k\rangle\}$ span the Hilbert space of the internal degrees of freedom of the system. The matrix elements $c_{ijkl}$ need to satisfy certain conditions in order for $\rho_0$ to be a legitimate quantum state, but we will not specify them since this constraint is not relevant for what follows. After the system passes through the boxes, the state is transformed to
$$\rho_{\vec{x}} = \sum_{i,j,k,l} c_{ijkl}\, M_{x_i,x_j}\{ |i\rangle\langle j| \otimes |\phi_k\rangle\langle\phi_l| \},$$
where $M_{x_i,x_j}\{...\}$ are arbitrary local maps, i.e. they depend only on their respective inputs $x_i$ and $x_j$. For instance, in case of local unitary operations, they are given by
$$M_{x_i,x_j}\{ |i\rangle\langle j| \otimes |\phi_k\rangle\langle\phi_l| \} = |i\rangle\langle j| \otimes U^{(i)}_{x_i} |\phi_k\rangle\langle\phi_l| U^{(j)\dagger}_{x_j}.$$
To simplify the expression, let us introduce the set of tensors
$$C_{ij}(x_i, x_j) \equiv \sum_{k,l} c_{ijkl}\, M_{x_i,x_j}\{ |i\rangle\langle j| \otimes |\phi_k\rangle\langle\phi_l| \}, \quad \text{so that} \quad \rho_{\vec{x}} = \sum_{i,j} C_{ij}(x_i, x_j). \tag{23}$$
Next, let us suppose that Alice and Bob are playing the modulo game (5) specified by a dit-string $\vec{\nu}$ and with no final relabelling (all other cases follow automatically). Notice that for m > 2 and for any $\alpha \in \{1, ..., d − 1\}$, the following property holds
$$\sum_{\vec{x}} \omega_d^{\alpha \sum_k \nu_k x_k} \rho_{\vec{x}} = \sum_{i,j} \sum_{\vec{x}} \omega_d^{\alpha \sum_k \nu_k x_k} C_{ij}(x_i, x_j) = 0, \tag{24}$$
where $\omega_d \equiv e^{i 2\pi/d}$. This is so because the tensors $C_{ij}$ depend on at most two inputs and because $\sum_{x_k=0}^{d-1} (\omega_d)^{\alpha \nu_k x_k} = 0$.
Let us define the average taken over states with equal modulo as follows
$$\rho^{(S)} \equiv \frac{1}{d^{m-1}} \sum_{\vec{x}:\, s^{(d)}_{x,\vec{\nu}} = S} \rho_{\vec{x}}.$$
As we show in Appendix C, equations (24) then imply $\rho^{(S)} = \rho^{(S')} \equiv \bar{\rho}$, $\forall S, S' = 0, ..., d − 1$.
Since all average states $\rho^{(S)}$ are equal, Bob cannot distinguish them by any means. Formally, Bob performs a measurement consisting of d outcomes and represented by a generic POVM $\{\Pi_0, ..., \Pi_{d-1}\}$, where $\Pi_S$ is associated with the output S. If Alice sends k = 1 quantum systems, the generalized interference term specified by the dit-string $\vec{\nu}$ necessarily vanishes for m > 2:
$$\tilde{I}^{(d)}_{m,\vec{\nu}}(k=1) = \frac{1}{d^m} \sum_{\vec{x}} \mathrm{Tr}\left[ \Pi_{s^{(d)}_{x,\vec{\nu}}}\, \rho_{\vec{x}} \right] - \frac{1}{d} = \frac{1}{d} \sum_{S} \mathrm{Tr}\left[ \Pi_S\, \rho^{(S)} \right] - \frac{1}{d} = \frac{1}{d} \mathrm{Tr}\Big[ \sum_S \Pi_S\, \bar{\rho} \Big] - \frac{1}{d} = 0.$$
A resource consisting of one quantum system is enough to produce non-vanishing second order interference (for binary inputs, it can even raise the interference term to its maximum possible value); on the other hand, for m > 2, there is no difference between sending one quantum system and sending no resource at all. This drastic difference can be traced back to (24), where the tensors $C_{ij}$ couple at most two inputs $x_i$ and $x_j$. This is because quantum states are described by (1, 1) tensors, i.e. density matrices. The latter constraint arises through the Born rule, which essentially sets quantum interference to the second order [3]. Multiple systems. Next, let us suppose that Alice's resources consist of more than one quantum system, say, in general, k of them. We assume that the systems are distinguishable and thus described by the tensor product of the single-system Hilbert spaces$^1$. The most general state prepared by Alice is then
$$\rho_0 = \sum_{\vec{i},\vec{j},\vec{r},\vec{s}} c_{\vec{i}\vec{j}\vec{r}\vec{s}} \bigotimes_{p=1}^{k} |n_{i_p}\rangle\langle n_{j_p}| \otimes |\phi^{(p)}_{r_p}\rangle\langle\phi^{(p)}_{s_p}|,$$
where $\vec{i}$ is short for $(i_1, ..., i_k)$ and analogously for the other indices. The vectors $|n_{i_p}\rangle, \forall i_p = 1, ..., m$ span the spatial Hilbert space of the p-th system, while $|\phi^{(p)}_{r_p}\rangle, \forall r_p$ span its internal degrees of freedom (the dimensionality of which is arbitrary). After passing through the boxes, the state is transformed to
$$\rho_{\vec{x}} = \sum_{\vec{i},\vec{j}} C_{\vec{i}\vec{j}}(x_{i_1}, ..., x_{i_k}, x_{j_1}, ..., x_{j_k}),$$
where $\{C_{\vec{i}\vec{j}}\}$ are tensors depending on at most 2k inputs (defined in the same fashion as in the single-system case, see Eq. (23)). The crucial difference with respect to the single system case is that the expressions (24) are now valid only if m > 2k.
Analogously to the single-system case, one comes to the following conclusion
$$\tilde{I}^{(d)}_{m,\vec{\nu},f}(k) = 0, \quad \forall m > 2k.$$
We therefore showed that k quantum systems cannot produce more than 2k-th order interference. Now we are going to show that k systems can produce 2k-th order interference. For binary inputs (i.e. d = 2), Alice and Bob can partition the 2k boxes into pairs $\{(x_1, x_2), ..., (x_{2k−1}, x_{2k})\}$ and for each pair use the protocol described in Section V.I; Bob can thus perfectly decode the parity of each pair, which enables him to win the game with unit probability. For d > 2, the proof is not so straightforward, since the protocol involving one particle and two boxes does not raise the interference term to its maximum value as it does for binary inputs. Nevertheless, in Appendix D we derive a general result which shows that k systems of single-system orders $\{n_1, ..., n_k\}$ can produce $(\sum_{i=1}^{k} n_i)$-th order interference, for any $d \geq 2$ (this statement is independent of the underlying physical theory). Therefore, k quantum systems (i.e. systems of order two) can produce 2k-th order interference. According to our classification, the latter results imply that quantum theory satisfies n(k) = 2k, meaning that k (distinguishable) quantum systems can produce at most 2k-th order interference.
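The vanishing of third-order interference for a single quantum system can be illustrated numerically in a restricted setting. The sketch below assumes d = 2, a particle without internal degrees of freedom, and boxes that apply arbitrary input-dependent local phases (a special case of the local maps considered in the text); the random state, the phases and all function names are illustrative. It evaluates the largest entry of the signed sum of the encoded states over all inputs: since Bob's probabilities are linear in the state, a vanishing signed sum means a vanishing interference term for every measurement.

```python
import cmath
import random
from itertools import product

def random_density_matrix(m, rng):
    """Random m x m density matrix A A^dagger / Tr(A A^dagger) over the path states."""
    a = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)] for _ in range(m)]
    rho = [[sum(a[i][k] * a[j][k].conjugate() for k in range(m)) for j in range(m)]
           for i in range(m)]
    trace = sum(rho[i][i] for i in range(m)).real
    return [[rho[i][j] / trace for j in range(m)] for i in range(m)]

def signed_state_sum(m, seed=0):
    """Largest |entry| of sum_x (-1)^{s_x} rho_x for random local phase encodings."""
    rng = random.Random(seed)
    rho0 = random_density_matrix(m, rng)
    # theta[i][x] is the phase applied by box i on input x (arbitrary but box-local)
    theta = [[rng.uniform(0, 2 * cmath.pi) for _ in range(2)] for _ in range(m)]
    total = [[0j] * m for _ in range(m)]
    for x in product((0, 1), repeat=m):
        sign = (-1) ** sum(x)
        for i in range(m):
            for j in range(m):
                # entry (i, j) of rho_x depends on the inputs x_i and x_j only
                phase = cmath.exp(1j * (theta[i][x[i]] - theta[j][x[j]]))
                total[i][j] += sign * phase * rho0[i][j]
    return max(abs(total[i][j]) for i in range(m) for j in range(m))

print(signed_state_sum(2), signed_state_sum(3))
```

For two boxes the signed sum is generically non-zero, whereas for three boxes every matrix element is independent of at least one input and the sum cancels.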

VI. HIGHER-ORDER INTERFERENCE THEORIES
In the preceding sections we saw that if Alice's resources consist of k classical systems, the interference terms $\tilde{I}^{(d)}_{m,\vec{\nu},f}(k)$ vanish for m > k (for any d). On the other hand, by using k quantum systems, one can produce at most 2k-th order interference, essentially due to the locality of the encoding operations and Born's rule. It is tempting to conjecture that the analogous statement holds for generic higher order theories, i.e. n(k) = kn(1), where n(1) is the single-system order of interference.

VI.I. Single systems
Our information-theoretic formulation given in Section II can be studied within the GPT framework, as definition (5) does not contain any information about the structure of the underlying theory. However, the subtle point here is to define the notion of local operations specified by the action of the boxes. In what follows, we will provide this definition for single systems. The latter reduces to the standard GPT definition given in [4] if the sets of transformations and measurements are restricted to projectors (i.e. opening/closing the "slits" and detecting the particle). We adopt the standard formalism of GPTs involving states, transformations and effects [4]. The state space of Alice's system will be denoted with St(A), the set of effects with Eff(A) and the set of transformations with Transf(A). The transformations are linear operators on the state space, mapping states into states, while the effects are linear functionals on the state space that map states into probabilities$^2$. We first define the set of "box" effects $\{e_1, ..., e_m\}$, where $e_j$ produces the probability of "finding the system at box j" (as in the definition provided in [4]). These effects naturally induce a decomposition of the state space into subsets $S_j$ of states localized at box j. With this partition, we are ready to define the set of transformations generated by the boxes. We denote the set of transformations representing the actions of the i-th box with $T^{(i)} \equiv \{T^{(i)}_{x_i}\}$, where $x_i$ is the input of the i-th box. In order to formalize the notion of locality of the operations, it is reasonable to assume that the action of box i leaves invariant all states localized at boxes $j \neq i$, i.e. $\forall s \in S_{j \neq i}: T^{(i)}_{x_i} s = s$. Furthermore, we assume that the ordering of the boxes' actions is irrelevant, i.e. $T^{(i)}_{x_i} T^{(j)}_{x_j} = T^{(j)}_{x_j} T^{(i)}_{x_i}$. In quantum (field) theory, if "locality" of the boxes' actions refers to real space, the latter condition is ensured via the axiom of microcausality. A similar condition within the GPT framework is known as the "branch locality" assumption [25].
The total set of transformations generated by the boxes is thus
$$\tau \equiv \{ T_{\vec{x}} \equiv T^{(1)}_{x_1} \cdots T^{(m)}_{x_m}, \ \forall \vec{x} \}. \tag{32}$$
The interference term is defined with respect to the following process: Alice prepares an arbitrary state $s \in$ St(A) and sends the system through the boxes which implement the transformation $T_{\vec{x}} \in \tau$; Bob receives the system and performs an arbitrary measurement with d outcomes represented by the following effects: $F = \{f_1, ..., f_d\} \subset$ Eff(A), $\sum_j f_j = u$, where u is the unit effect. Higher order interference theories for single systems are then defined as follows.
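Definition. We say that the theory is of n-th order if and only if it satisfies the following: (a) $\forall d \geq 2$, $\forall \vec{\nu}$, $\forall f$, the following holds: the interference term (5), evaluated on the probabilities generated by the process described above, vanishes for all m > n, i.e. $\tilde{I}^{(d)}_{m,\vec{\nu},f} = 0$ for all states s, all transformations $T_{\vec{x}} \in \tau$ and all measurements F; (b) for m = n, there exists at least one process (s, $T_{\vec{x}}$, F) for which $\tilde{I}^{(d)}_{n,\vec{\nu},f} \neq 0$. (34)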

VI.II. Multiple systems
In Appendix D we constructed a proof that $k$ generic systems of orders $\{n_1, ..., n_k\}$ can exhibit $\sum_i n_i$-th order interference for any $d \geq 2$ (under the condition that the composition of systems is well defined). The latter provides a lower bound which is additive in the interference order. On the other hand, we saw that in classical and quantum theory this lower bound coincides with the upper bound, i.e. these theories satisfy $n^{(k)} = k\, n^{(1)}$. In what follows, we discuss whether this relation extends to a broader class of theories.
Let us start by assuming the principle of Local Tomography [10], which states that a physical state is fully characterized by the marginals of its subsystems and the correlations among them. Under this assumption, the state space of a composite system is isomorphic to the tensor product of the subsystems' state spaces (as holds, for example, in quantum and classical theory [10]). We label the $k$ subsystems with $A_1, ..., A_k$; the composite state space is then $\mathrm{St}(A) \cong \mathrm{St}(A_1) \otimes ... \otimes \mathrm{St}(A_k)$. The space of effects is a subset of the tensor product of the single-system spaces of effects, i.e. $\mathrm{Eff}(A) \subseteq \mathrm{Eff}(A_1) \otimes ... \otimes \mathrm{Eff}(A_k)$ (under the no-restriction hypothesis [26], the space of effects is isomorphic to the tensor product structure; however, this assumption is not necessary in our proof). On the other hand, the space of transformations is generally larger than the tensor product of the subsystems' sets of transformations. However, here we focus exclusively on boxes which act independently on the single systems (thereby excluding e.g. entangling gates in quantum theory). We want to show that, under these assumptions, $k$ systems of orders $\{n_1, ..., n_k\}$ cannot exhibit more than $\sum_i n_i$-th order interference.
Let us first focus on binary inputs, i.e. $d = 2$. An arbitrary state $s \in \mathrm{St}(A)$ prepared by Alice can be written as a linear combination of tensor products of single-system states: $s = \sum_i \lambda_i\, s_i^{(1)} \otimes ... \otimes s_i^{(k)}$, with real coefficients $\lambda_i$. As we already said, we assume that the set of transformations is generated by the single-system transformations defined in Section VI.I, i.e. $\tau_{\vec{x}} = T^{(1)}_{\vec{x}} \otimes ... \otimes T^{(k)}_{\vec{x}}$, $\forall \vec{x}$, where the locality and no-signaling conditions hold for transformations acting on each subsystem. For $d = 2$, the $m$-th order interference term for an arbitrary process is then: where $P_{i_p}$ are probabilities arising from the single-system processes involving the $p$-th system (for simplicity we omitted indices $i, j$). Therefore, we see that due to Local Tomography and the restriction to single-system operations, the interference term decouples into a linear combination of products of single-system processes. As was shown in Section III, the probability distribution pertaining to any process involving a system of order $n_p$ can be written as a linear combination of functions of at most $n_p$ inputs: Therefore, if $N \equiv \sum_p n_p < m$, the interference term necessarily vanishes: where we introduced for simplicity the coefficients $a_{i_1,...,i_N}$ and functions $g(x_{i_1}, ..., x_{i_N})$, the exact forms of which are of no relevance.
In order to show that the latter result holds for arbitrary $d$, we just need to look at the dual interference terms defined in Section III. Given our assumptions, one arrives at the generalization of (35): By assumption, the probabilities $P_{i_p}$ pertaining to the $p$-th system do not exhibit more than $n_p$-th order interference and can thus be expressed as a linear combination of functions depending on at most $n_p$ inputs, as we showed in Appendix B. Therefore, if $N \equiv \sum_p n_p < m$, then $J^{(d)}_{m,\vec{\nu},b}(k)$ vanishes for all $\vec{\nu}, b$. Since the dual formulation is equivalent to the game formulation (see Appendix A), this implies that the $k$ systems cannot be used to achieve more than $\sum_j n_j$-th order interference for any $d$.
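The vanishing of the interference term for $N < m$ rests on a short factorization step, which can be spelled out as follows. This is a sketch that assumes the $d = 2$ interference term has the alternating-parity form $\sum_{\vec{x}} (-1)^{x_1 + ... + x_m} P(b|\vec{x})$; the index set $S$ is notation introduced here for illustration only.

```latex
% Assumption: for d = 2 the m-th order term is the alternating-parity sum of P(b|x).
% S = {i_1, ..., i_N} labels the inputs on which g depends (notation introduced here).
\begin{align}
\sum_{\vec{x} \in \{0,1\}^m} (-1)^{\sum_{j=1}^{m} x_j}\, g(x_{i_1}, \dots, x_{i_N})
  &= \Bigg[ \sum_{x_{i_1}, \dots, x_{i_N}} (-1)^{\sum_{j \in S} x_j}\, g(x_{i_1}, \dots, x_{i_N}) \Bigg]
     \prod_{j \notin S} \Bigg[ \sum_{x_j \in \{0,1\}} (-1)^{x_j} \Bigg] \\
  &= 0 \qquad \text{whenever } N < m .
\end{align}
% Each factor in the product equals 1 - 1 = 0, and for N < m at least one index j lies outside S.
```

By linearity, the same holds for any linear combination of such functions $g$, which is the form the interference term takes once it decouples into products of single-system processes.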
To summarize, under the assumptions of (i) Local Tomography and (ii) independence of single-system transformations, we proved that a system composed of $k$ systems of orders $(n_1, ..., n_k)$ is a $(\sum_j n_j)$-th order system. This holds trivially in classical theory, while in quantum theory we proved a stronger statement: additivity follows without assumption (ii). It might be the case that this assumption is redundant for any GPT, since the multipartite operations are still restricted to being local; however, in order to prove this, one ought to introduce a generalization of the Definition in Section VI.I, which we leave for future considerations.
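The additivity statement can be illustrated numerically for $d = 2$. The sketch below is not the construction of Appendix D: it assumes the alternating-parity form of the interference term and models each single-system process by a hypothetical table of probabilities (for one fixed outcome) that depends on at most $n_p$ of the inputs. The names `parity_term`, `product_process` and the table values are illustrative.

```python
from itertools import product

def parity_term(prob, m):
    """Alternating-parity sum of prob(x) over all binary strings x of length m."""
    return sum((-1) ** sum(x) * prob(x) for x in product((0, 1), repeat=m))

def product_process(tables, supports):
    """Product of single-system probabilities; system p only sees the inputs in supports[p]."""
    def prob(x):
        value = 1.0
        for table, support in zip(tables, supports):
            value *= table[tuple(x[i] for i in support)]
        return value
    return prob

# Hypothetical single-system processes: system 1 has order 1, system 2 has order 2.
table_1 = {(0,): 0.9, (1,): 0.2}
table_2 = {(0, 0): 0.8, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.6}
process = product_process([table_1, table_2], [(0,), (1, 2)])

# m = 3 = n_1 + n_2: every input is covered, so the term can be non-zero.
term_m3 = parity_term(process, 3)
# m = 4 > n_1 + n_2: the fourth input is seen by no system, so the term vanishes.
term_m4 = parity_term(process, 4)

print(term_m3, term_m4)
```

For these tables the $m = 3$ term factorizes into $(0.9 - 0.2)(0.8 - 0.3 - 0.1 + 0.6)$, while the $m = 4$ term is zero up to floating-point rounding.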

VII. CONCLUSION AND OUTLOOK
In this work we introduced a class of information-theoretic games which generalize standard multi-slit interference experiments. The order of interference of a theory is then seen as the (im)possibility of accomplishing certain information-processing tasks using finite resources. These games essentially characterize how much information can be decoded from a physical system given that the information was encoded in a global property (parity or modulo of the inputs) of local pieces of information (local inputs). We showed that within quantum theory, the order of interference of $k$ systems is $2k$; it would be interesting to inspect potential connections of this result to superdense coding, which states that $k$ qubits can be used to send at most $2k$ bits. Moreover, the game formulation can provide a (semi)device-independent witness of the particle number. So far there have been several attempts at explaining the order of interference of quantum theory; however, a physically intuitive explanation has not yet been obtained. The reason for this might be that the question itself is leading us in the wrong direction. Instead of asking why quantum theory behaves in a particular way in multi-slit experiments (or GPT generalizations thereof), we could ask why quantum theory restricts the amount of globally-encoded information (parity/modulo of locally encoded bits/dits) that can be retrieved from a system. Moreover, what is the relation between the (im)possibility of such a decoding and the number of systems used for encoding the information? Is there any physical argument for why interference should be additive under composition (e.g. violation of the no-signaling principle)? Ultimately, how does one even define the notion of number of systems in a device-independent scenario?
We believe that our work provides a framework fit for potentially finding answers to these and similar questions, which can consequently lead us to a fuller understanding of quantum and post-quantum theories.
is equivalent to its dual formulation in terms of the dual interference terms (A2), where $\omega_d = e^{i 2\pi/d}$ is the $d$-th root of unity. For convenience, let us first introduce the dual terms with an added index $\alpha \in \{1, ..., d-1\}$ (A3). We will show that for any dit-string $\vec{\nu}$, the equivalence (A4) holds. Since the order of interference is defined as the vanishing of the interference terms (A1) for all $\vec{\nu}$, (A4) would imply that the dual formulation of interference (A2) is equivalent to the game formulation (A1) (this is so because if the RHS of equivalence (A4) holds for all $\vec{\nu}$, then the dual conditions (A2) also hold, since the index $\alpha$ becomes redundant). In what follows, we prove that (A4) does indeed hold. In order to simplify the notation, let us introduce the $d \times d$ matrix $P$, whose elements are defined as The normalization of probabilities implies that $P$ is a stochastic matrix, i.e. $\sum_b P_{bs} = 1$, $\forall s$ (A6).
The game formulation equations on the left side of the equivalence in (A4) can be written as which we rewrite succinctly as $\mathrm{Tr}[\Pi_f P] = 1$, $\forall f$ (A8), where $\Pi_f \equiv \sum_s |f(s)\rangle\langle s|$ ranges over all $d$-dimensional permutation matrices (the most general finite-dimensional reversible transformations are indeed permutations).
On the other hand, the dual conditions on the right side of equivalence (A4) assume the form (A9), where $F$ is the $d$-dimensional Fourier matrix with elements $F_{kl} = \frac{1}{\sqrt{d}} (\omega_d)^{kl}$. We will first show that the dual conditions (A9) imply the information-theoretic conditions (A8). Assuming that the dual conditions (A9) hold, we evaluate the trace $\mathrm{Tr}[\Pi_f P]$. The first step follows from the unitarity of $F$ and the cyclicity of the trace, the third step is due to the dual conditions (A9), and the last step follows from $(F^\dagger \Pi_f)_{0l} = 1/\sqrt{d}$. Therefore, conditions (A9) imply (A8).
In order to prove the converse statement, let us introduce the following new basis $\beta = \{|E\rangle, |e_1\rangle, ..., |e_{d-1}\rangle\}$, where $|E\rangle = \frac{1}{\sqrt{d}} \sum_{k=0}^{d-1} |k\rangle$ and the remaining vectors span the orthogonal subspace (they can for instance be chosen as the rows/columns of the Fourier matrix). The normalization conditions (A6) can then be written as $\langle E| P = \langle E|$, which implies that the matrix $P$ in the new basis has the form $P = |E\rangle\langle E| + \sum_j q_j |e_j\rangle\langle E| + \bar{E}$, where $q_j$ are arbitrary coefficients and $\bar{E}$ is a matrix that has support only on the subspace orthogonal to $|E\rangle$. In the new basis $\beta$, all permutation matrices have the form $\Pi_f = |E\rangle\langle E| + \Delta_f$, where $\Delta_f$ is again a matrix with no support on $|E\rangle$. The latter follows from the fact that the representation of the permutation group that we are using is reducible to the direct sum of a one-dimensional representation (spanned by the vector $|E\rangle$, which is invariant under all permutations) and a $(d-1)$-dimensional irreducible representation (here given by $\Delta_f$). Assuming that the game conditions (A8) hold, we obtain $1 = \mathrm{Tr}[\Pi_f P] = 1 + \mathrm{Tr}[\Delta_f \bar{E}] \;\rightarrow\; \mathrm{Tr}[\Delta_f \bar{E}] = 0$, $\forall f$ (A16).
Since the matrices $\Delta_f$ provide an irreducible representation of the permutation group, the Burnside Theorem [27] implies that they span the set of all $(d-1) \times (d-1)$ matrices; therefore, Equation (A16) implies that $\bar{E}$ is necessarily zero. The matrix $P$ is thus $P = |q\rangle\langle E|$, where we introduced the vector $|q\rangle \equiv |E\rangle + \sum_j q_j |e_j\rangle$. Therefore, the sought matrix $PF$ is $PF = |q\rangle\langle E| F = |q\rangle\langle 0|$, where we used $\sum_{k=0}^{d-1} (\omega_d)^k = 0$, and $\{|k\rangle; k = 0, ..., d-1\}$ are the original basis vectors. Hence, equivalence (A4) holds, and thus the game formulation (A1) is equivalent to the dual formulation (A2). Besides being an interesting mathematical and conceptual result, the exact equivalence between these two formulations will be useful for showing that any distribution that exhibits at most $m$-th order interference can be written as a linear combination of functions with at most $m$ inputs. This will in turn be the necessary ingredient for proving that Local Tomography implies the additivity of the order of interference under the composition of systems.
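The two sides of the equivalence proved above can be checked numerically for a small case. The following is a minimal sketch for $d = 3$. It assumes the conventions $\mathrm{Tr}[\Pi_f P] = \sum_s P_{s f(s)}$ (which follows from $\Pi_f = \sum_s |f(s)\rangle\langle s|$) and $F_{kl} = (\omega_d)^{kl}/\sqrt{d}$, and it takes the dual conditions to mean that the columns $\alpha \neq 0$ of $PF$ vanish. The example matrices are hypothetical.

```python
import cmath
from itertools import permutations

d = 3
omega = cmath.exp(2j * cmath.pi / d)

def game_values(P):
    """Tr[Pi_f P] = sum_s P[s][f(s)] for every permutation f of {0, ..., d-1}."""
    return [sum(P[s][f[s]] for s in range(d)) for f in permutations(range(d))]

def dual_components(P):
    """Entries (P F)[b][alpha] for alpha != 0, with F[s][alpha] = omega**(s*alpha)/sqrt(d)."""
    return [sum(P[b][s] * omega ** (s * alpha) for s in range(d)) / d ** 0.5
            for b in range(d) for alpha in range(1, d)]

# <E|F for the uniform vector |E>: only the component along |0> should survive.
E_times_F = [sum(omega ** (k * l) for k in range(d)) / d for l in range(d)]

# Each column is a distribution over the output b (sum_b P[b][s] = 1).
q = [0.5, 0.3, 0.2]
P_flat = [[q[b] for _ in range(d)] for b in range(d)]            # identical columns
P_generic = [[0.7, 0.1, 0.2], [0.2, 0.6, 0.3], [0.1, 0.3, 0.5]]  # input-dependent columns

flat_game = game_values(P_flat)
flat_dual = dual_components(P_flat)
generic_game = game_values(P_generic)
generic_dual = dual_components(P_generic)
```

A matrix with identical columns satisfies both sets of conditions, whereas the input-dependent matrix violates both, in line with the equivalence.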
For compactness, we allow the dit-string $\vec{\nu}$ to range over all dit-strings with fewer than $(m - n)$ null components. This notation specifies which interference terms the equation refers to: if $\nu_j \neq 0$ for $j = i_1, ..., i_l$, the equation states that the order of interference involving boxes $i_1, ..., i_l$ is equal to zero ($l$ can be any integer between $n + 1$ and $m$). Let us regard $P(b|\vec{x})$ as one component of a vector $\vec{P}_b$ in a $d^m$-dimensional vector space formed by the tensor product of $m$ $d$-dimensional spaces, i.e. $P(b|\vec{x}) = \left( \bigotimes_{i=1}^{m} \vec{e}_{x_i} \right) \cdot \vec{P}_b$, where $\vec{e}_{x_j}$, $x_j = 0, ..., d-1$, span the $j$-th $d$-dimensional space. Equations (B1) then imply (B2), where we introduced the rotated vectors $\vec{f}_{\nu_i}$ (rows/columns of the $d$-dimensional Fourier matrix) and the components $\lambda_{\vec{\nu},b}$ of $\vec{P}_b$ in the corresponding basis. Equation (B2) states that the components $\lambda_{\vec{\nu},b}$ vanish for all $\vec{\nu}$ with more than $n$ non-zero components. The probabilities $P(b|\vec{x})$ can thus be expressed as $P(b|\vec{x}) = d^{-m/2} \sum_{\vec{\nu}} \lambda_{\vec{\nu},b}\, (\omega_d)^{\sum_i \nu_i x_i}$, where the sum runs over strings $\vec{\nu}$ with at most $n$ non-zero components and we used $\vec{e}_{x_i} \cdot \vec{f}_{\nu_i} = \frac{1}{\sqrt{d}} (\omega_d)^{\nu_i x_i}$. Since the sum runs only over strings $\vec{\nu}$ with at most $n$ non-zero components, $P(b|\vec{x})$ can be written as a linear combination of functions of at most $n$ inputs, for every $b$. Therefore we have proved that the $n$-th order interference dual conditions (B1), or equivalently, the information-theoretic formulation (5), provide a set of necessary and sufficient conditions for a distribution to be decomposable into a linear combination of functions of at most $n$ inputs. Hence, the algebraic order of the distribution $P(b|\vec{x})$ is at most $n$.
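The statement that the components $\lambda_{\vec{\nu},b}$ vanish for strings with more than $n$ non-zero entries can be checked directly on a small example. The sketch below uses $d = 3$, $m = 3$, $n = 2$ and two hypothetical functions standing in for $P(b|\vec{x})$ at a fixed $b$. The sign convention of the exponent is an assumption and does not affect which components vanish.

```python
import cmath
from itertools import product

d, m, n = 3, 3, 2
omega = cmath.exp(2j * cmath.pi / d)

def fourier_component(func, nu):
    """Component of func along the Fourier vector labelled by the dit-string nu."""
    return sum(func(x) * omega ** sum(v * xi for v, xi in zip(nu, x))
               for x in product(range(d), repeat=m)) / d ** (m / 2)

def low_order(x):
    """Hypothetical P(b|x): a sum of functions that each depend on at most n = 2 inputs."""
    return 0.1 + 0.02 * x[0] * x[1] + 0.03 * ((x[1] + 2 * x[2]) % d) + 0.01 * x[2]

def full_order(x):
    """Hypothetical P(b|x) that depends jointly on all three inputs (their sum modulo d)."""
    return 0.1 + 0.05 * ((x[0] + x[1] + x[2]) % d)

# Dit-strings with more than n non-zero components.
high = [nu for nu in product(range(d), repeat=m) if sum(v != 0 for v in nu) > n]

max_low = max(abs(fourier_component(low_order, nu)) for nu in high)
max_full = max(abs(fourier_component(full_order, nu)) for nu in high)
print(max_low, max_full)
```

The first function has no weight on the high-order strings (up to rounding), while the second has a non-zero component at $\vec{\nu} = (1, 1, 1)$, so it cannot be decomposed into functions of at most two inputs.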