How dynamics constrains probabilities in general probabilistic theories

We introduce a general framework for analysing general probabilistic theories, which emphasises the distinction between the dynamical and probabilistic structures of a system. The dynamical structure is the set of pure states together with the action of the reversible dynamics, whilst the probabilistic structure determines the measurements and the outcome probabilities. For transitive dynamical structures whose dynamical group and stabiliser subgroup form a Gelfand pair we show that all probabilistic structures are rigid (cannot be infinitesimally deformed) and are in one-to-one correspondence with the spherical representations of the dynamical group. We apply our methods to classify all probabilistic structures when the dynamical structure is that of complex Grassmann manifolds acted on by the unitary group. This is a generalisation of quantum theory where the pure states, instead of being represented by one-dimensional subspaces of a complex vector space, are represented by subspaces of a fixed dimension larger than one. We also show that systems with compact two-point homogeneous dynamical structures (i.e. every pair of pure states with a given distance can be reversibly transformed to any other pair of pure states with the same distance), which include systems corresponding to Euclidean Jordan Algebras, all have rigid probabilistic structures.

The aim of this paper is to provide tools to systematically explore the space of non-classical systems. Rather than generating examples of non-classical systems we can give full classifications of families of non-classical systems which share a common dynamical structure (pure states and reversible dynamics) but different probabilistic structures (measurements and measurement outcome probabilities); as done in [51] for systems which share the dynamical structure of quantum systems. We can thus obtain a richer picture of the space of non-classical systems, of which quantum systems are just one example.
We provide a general framework for convex systems and use it to study transitive systems, that is to say systems for which any two pure states are related by a reversible transformation. This is a generalisation of the OPF (outcome probability function) framework of [51][52][53], where the pure states and dynamical group no longer have to be those of quantum theory. We restrict ourselves to systems with reversible dynamics given by finite or compact groups, noting that all the examples of GPT systems mentioned previously are transitive systems with finite or compact dynamical groups. The assumption of transitivity has played an important role in derivations of quantum theory from operational/information theoretic principles, for instance in Hardy's original derivation [14] as well as subsequent derivations by other authors [16][17][18][54]. It is worth mentioning that many derivations of the second law of thermodynamics from more fundamental principles (see for example [55][56][57][58][59]) use as the central premise the reversibility of the underlying dynamics (both in the classical and quantum frameworks). Also, when all the transformations that can be implemented on a system are generated by reversible dynamics, all the achievable states of the system form a transitive space. Hence, there is a connection between transitivity and the second law of thermodynamics.
We show that for a given dynamical structure (pure states and dynamical group) every possible probabilistic structure (measurements and outcome probabilities) is in correspondence with a representation of the dynamical group. Moreover we find necessary and sufficient conditions on the dynamical structure (the dynamical group and subgroup form a Gelfand pair) which make this correspondence one-to-one. We find that certain probabilistic structures cannot be infinitesimally deformed and call these rigid. We show that all dynamical structures which are Gelfand pairs do not have any probabilistic structures which can be infinitesimally deformed. We apply the methods developed to classify generalisations of quantum systems, with pure states given by Grassmann manifolds and unitary dynamics. We introduce the family of systems with compact two point homogeneous dynamical structures and show that they all have rigid probabilistic structures.

Structure of the paper
In Section 2 we introduce the OPF framework used for studying transitive systems and present relevant known results (or slight generalisations thereof). In Section 3 we give the main theorem of this work (the classification theorem), establishing a correspondence between probabilistic structures of transitive systems and group representations, as well as the conditions under which this correspondence is one-to-one. In Section 4 we introduce the notion of deformation of probabilistic structures, and show that the only dynamical structures which admit probabilistic structures which can be infinitesimally deformed are those corresponding to non-Gelfand pairs. We also give an explicit example of deformations of a non-rigid probabilistic structure. In Section 5 we introduce the family of compact dynamical structures which are two point homogeneous and show that they are all rigid. In Section 6 we apply the classification theorem to systems with dynamical structures given by complex Grassmann manifolds (a generalisation of complex projective space). In Section 7 we discuss the results of this paper in light of existing work as well as comment on the implications of new concepts and results of the present work. Lastly we close with some concluding remarks in Section 8. A glossary of notation is given in Appendix A and an introduction to some of the representation theory used in this paper can be found in Appendices B and C.

Single system state spaces
We provide a characterisation of single systems within the GPT framework which emphasises the pure states and reversible dynamics. This will allow us to consider families of systems with the same pure states and reversible dynamics, but different measurements. This is a generalisation of [51] where all systems with the same pure states and reversible dynamics as quantum theory were classified and their informational properties studied. We first describe quantum systems in this framework.

A characterisation of finite dimensional quantum systems
Quantum systems are often characterised directly in terms of mixed states, that is to say their convex representation. States are positive semi-definite operators on a finite dimensional complex space $\mathbb{C}^d$ equipped with a sesquilinear inner product, transformations are CPTP maps and measurements are associated to POVMs, with the probability of an outcome occurring being given by the usual trace rule. Here we provide a characterisation of quantum systems which separates their dynamical structure from the probabilistic structure. In this characterisation the mixed state representation described above is derived, rather than postulated. Moreover this distinction between dynamical and probabilistic structures will provide us with a way of classifying families of more general systems which share a common dynamical structure. A quantum system $\mathcal{S}^{\mathrm{Quant}}_{\mathrm{P}\mathbb{C}^d}$ is given by: I. Pure states: the points of $\mathrm{P}\mathbb{C}^d$, the complex projective space with elements corresponding to one-dimensional subspaces of $\mathbb{C}^d$. II. Reversible dynamics: the group $\mathrm{PU}(d)$, the projective unitary group constructed from $\mathrm{U}(d)$ by taking equivalence classes of unitaries under multiplication by a complex phase: $U_0 = e^{i\theta} U_1 \iff U_0 \cong U_1$, where $U_0, U_1 \in \mathrm{U}(d)$ and $\theta \in \mathbb{R}$.
We assume that any subset $\{Q_i\}_{i=1}^{n}$ such that $\sum_i Q_i(\psi) = 1$ for all $\psi \in \mathrm{P}\mathbb{C}^d$ forms a valid measurement. This implies that a measurement consists of positive semi-definite operators $\hat{Q}_i$ such that $\sum_i \hat{Q}_i = \mathbb{1}$. Here the pure states and the reversible dynamics (items I. and II.) form the dynamical structure, whilst the measurements and their outcome probabilities (item III.) form the probabilistic structure. The mixed state representation (density operators) is derived from the dynamical and probabilistic structures. We observe that the probability assignment (Born rule) is not given in terms of the trace, since this already presumes the structure of mixed states. We now define general non-classical systems in terms of dynamical and probabilistic structures and show how to derive the convex representation.
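The separation above can be illustrated with a small numerical sketch. It assumes the standard Born-rule form $Q(\psi) = \langle\psi|\hat{Q}|\psi\rangle$ for the outcome probability functions (an assumption on our part, since the explicit formula is not reproduced above), and the helper names `opf` and `ray` are hypothetical. It checks that the outcome probabilities of a two-outcome qubit measurement sum to one, and that they depend only on the ray, not on the phase of the representative vector.

```python
import cmath
import math

def opf(Q, psi):
    """Return <psi|Q|psi> for a d x d matrix Q (list of rows) and a unit vector psi."""
    d = len(psi)
    val = sum(psi[i].conjugate() * Q[i][j] * psi[j]
              for i in range(d) for j in range(d))
    return val.real

# Hypothetical two-outcome measurement on a qubit: projectors onto |0> and |1>.
Q0 = [[1, 0], [0, 0]]
Q1 = [[0, 0], [0, 1]]

def ray(theta, phi):
    """Unit vector representing a point of PC^2, parametrised by Bloch angles."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

psi = ray(0.7, 1.3)

# The same pure state, represented by a vector with a different global phase.
phase = cmath.exp(1j * 2.1)
psi_rephased = [phase * a for a in psi]

total = opf(Q0, psi) + opf(Q1, psi)                       # normalisation
phase_gap = abs(opf(Q0, psi) - opf(Q0, psi_rephased))     # phase independence
```

Here `total` plays the role of $\sum_i Q_i(\psi)$ and `phase_gap` confirms that the functions are well defined on $\mathrm{P}\mathbb{C}^d$ rather than on $\mathbb{C}^d$.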

Dynamical structure
The pure states of a system $S$ form a set $X$, and the reversible dynamics a group $G$. The action of $G$ on $X$ is given by a group action $\varphi : G \times X \to X$. This gives $X$ the structure of a $G$-space. A dynamical structure is a triple $D = (X, G, \varphi)$, where $X$ is a set, $G$ is a group and $\varphi$ a group action.
In the following we leave $\varphi$ implicit and write $gx$ for $\varphi(g, x)$. An important family of dynamical structures is that of transitive ones. A dynamical structure is transitive when for any two pure states $x, x' \in X$ there exists a transformation $g \in G$ such that $x' = gx$. In other words $X$ is the orbit under $G$ of an arbitrary $x \in X$.
A notion central to the approach used in this work is that of a stabilizer subgroup (also known as isotropy group) of an element $x \in X$, which is just the subgroup of all transformations in $G$ which leave the point $x$ invariant. We write $H_x := \{g \in G : gx = x\}$ for the stabilizer subgroup of a point $x \in X$. For a transitive group action, the stabilizer groups of different points are conjugate, and hence isomorphic, so we write $H$ for the stabilizer group.
Given a group $G$ and a subgroup $H \subseteq G$, we denote by $\varphi_{G,H}$ the action of $G$ on the set of left cosets $G/H$. Given a transitive dynamical structure $D = (X, G, \varphi)$ with stabilizer subgroup $H$ we have the following isomorphism of dynamical structures: $(X, G, \varphi) \cong (G/H, G, \varphi_{G,H})$.
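The isomorphism between a transitive $G$-space and the coset space $G/H$ can be checked explicitly for a small finite example. The sketch below is only an illustration under our own choice of example (the symmetric group $S_3$ acting on three points, which is not taken from the text); it builds the stabiliser of a point, forms the left cosets, and verifies that $gH \mapsto g x_0$ is a well-defined, bijective and equivariant map.

```python
from itertools import permutations

X = (0, 1, 2)
G = list(permutations(X))      # g is a tuple, g[i] is the image of i

def act(g, x):
    """Action of the permutation g on the point x."""
    return g[x]

def compose(g, h):
    """Group product: (g h)(x) = g(h(x))."""
    return tuple(g[h[x]] for x in X)

x0 = 0
H = [g for g in G if act(g, x0) == x0]        # stabiliser subgroup of x0

# Left cosets gH, stored as frozensets of group elements.
cosets = {frozenset(compose(g, h) for h in H) for g in G}

# The map gH -> g x0 is well defined if every element of a coset
# sends x0 to the same point.
coset_to_point = {}
for c in cosets:
    images = {act(g, x0) for g in c}
    assert len(images) == 1
    coset_to_point[c] = images.pop()

is_bijection = sorted(coset_to_point.values()) == list(X)

# Equivariance: acting with k on the coset gH corresponds to acting with k on g x0.
equivariant = all(
    coset_to_point[frozenset(compose(k, g) for g in c)] == act(k, coset_to_point[c])
    for k in G for c in cosets
)
```

In this example $H \cong S_2$ and $G/H$ has three elements, matching the three points of $X$.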
Typically dynamical structures (X, G, ϕ) also have topological (and sometimes differentiable) structure. A topological group G acts on a topological space X when the action ϕ is a continuous function: ϕ : G × X → X (where G × X has the product topology). If H is a subgroup of G then the space G/H of left cosets is a topological space with respect to the quotient topology, which is the finest topology making the quotient map q : G → G/H, q : g → gH continuous.
For transitive dynamical structures $(X, G, \varphi)$ where $X$ is Hausdorff and $G$ is compact the isomorphism $(X, G, \varphi) \cong (G/H, G, \varphi_{G,H})$ also involves the topological structure of each component of the triplet [60, Proposition 1.10]. When $G$ is a Lie group the isomorphism $(X, G, \varphi) \cong (G/H, G, \varphi_{G,H})$ involves the differentiable structure of each component of the triplet [61, Theorem 20.12].
In the following we restrict our attention to compact dynamical structures (which include finite dynamical structures as a special case), implying that the isomorphism $(X, G, \varphi) \cong (G/H, G, \varphi_{G,H})$ includes the topological and differentiable structures of the dynamical structures considered.
For this reason we use the abbreviation $D = (G, H)$ for transitive dynamical structures with stabilizer subgroup $H$. The case of non-compact dynamical structures is discussed in Section 2.5.

Probabilistic structure
A system is determined by its pure states, dynamics and measurements. Given the dynamical structure we need to specify its probabilistic structure, which characterises the measurements which can be performed on the system.

Definition 2 (Outcome probability function (OPF)). An outcome of a measurement on a system with pure states $X$ is given by a function $f : X \to [0, 1]$, where the probability of the associated outcome $f$ occurring is $P(f|x) = f(x)$.
Definition 5 (Probabilistic structure). The probabilistic structure of a system is the set $\mathcal{F}_X$ of all outcome probability functions $f$.
Typically we assume that any set $\{f_1, \ldots, f_n, \ldots\}$ such that $\sum_i f_i = u$ forms a valid measurement, where $u$ is the unit OPF ($u(x) = 1$ for all $x \in X$); however this assumption is not necessary. When this assumption does not hold, one needs to supplement the set $\mathcal{F}_X$ with a specification of which OPFs form a valid measurement. One example of such a specification is the 'finite measurement outcomes' assumption:
Definition 6 (Finite measurement outcome assumption). Only finite sets of OPFs $\{f_1, \ldots, f_n\}$ such that $\sum_i f_i = u$ form valid measurements.
The above assumption is sometimes viewed as part of the definition of measurements in an operational framework, since we can never carry out measurements with infinitely many outcomes. We will be making this assumption in the present work.
Operational considerations impose the following constraints on $\mathcal{F}_X$:
i. $\mathcal{F}_X$ is closed under taking mixtures: for all $f_1, f_2 \in \mathcal{F}_X$ and all $\lambda \in [0, 1]$ we have that $\lambda f_1 + (1 - \lambda) f_2 \in \mathcal{F}_X$.
ii. $\mathcal{F}_X$ is closed under composition with group transformations: for all $f \in \mathcal{F}_X$ and $g \in G$ we have that $f \circ g \in \mathcal{F}_X$.
iii. $\mathcal{F}_X$ is closed under coarse graining of measurement outcomes: for any pair of outcomes $f_1, f_2$ of the same measurement we have that $f_1 + f_2 \in \mathcal{F}_X$.
iv. For every $f \in \mathcal{F}_X$, the complement OPF $f^c = u - f$ is also in $\mathcal{F}_X$.
The first constraint implies that $\mathcal{F}_X$ is a convex set, hence it can be extended to a vector space $\mathbb{R}[\mathcal{F}_X]$. Closedness under composition with group transformations implies that $\mathcal{F}_X$ is a $G$-space. This and the fact that the group action commutes with taking mixtures implies that $\mathbb{R}[\mathcal{F}_X]$ is a linear representation of $G$. Closedness under coarse graining of measurement outcomes implies that every $\mathcal{F}_X$ contains the unit OPF, and the existence of the complement guarantees the existence of the 0 OPF. We introduce the following property, though we will not always require it in the present treatment.
Definition 7 (Separability of pure states). A probabilistic structure $\mathcal{F}_X$ separates pure states when for any two pure states $x \neq x'$ in $X$ there exists an $f \in \mathcal{F}_X$ such that $f(x) \neq f(x')$. Without this requirement, the probabilistic structure $\mathcal{F}_X = \{u\}$, which leads to a trivial system, would for example be valid for every dynamical structure.

Systems, state spaces and associated group representations
The above definitions allow us to formally define a system $\mathcal{S}_X$.
Definition 8 (System). A system $\mathcal{S}_X$ is a triple $\mathcal{S}_X = (X, G, \mathcal{F}_X)$, where $(X, G)$ is a dynamical structure and $\mathcal{F}_X$ is a probabilistic structure.
In the following we briefly outline how the general state space (including mixed states) of a system is derived, both from an operational starting point and directly from the mathematical starting point $\mathcal{S}_X = (X, G, \mathcal{F}_X)$.

Operational derivation of the state space
Operationally, for a single system one has access to a preparation device which is wired up sequentially with transformation and measurement devices. These devices have classical settings (for instance which transformation to apply) and classical readouts (for instance which measurement outcome occurred). In an experiment one collects the statistics for different outcomes given choices of settings. Typically one assumes that statistics are gathered for all possible setting choices, and that the relative frequencies obtained become probabilities as the number of runs tends to infinity. Using these probabilities (which are directly given by the set $\mathcal{F}_X$ in the OPF framework) one derives the convex state space (and effect space) of the system. We refer the reader to [62] for how one can in practice derive a state and effect space from experimental data.

Mathematical derivation of the state space
In this work we will make the assumption of the possibility of state estimation using a finite outcome set (known as 'Possibility of state estimation' in [53]).
Definition 9 (Possibility of state estimation using a finite outcome set). The system $\mathcal{S}_X = (X, G, \mathcal{F}_X)$ is such that the value of a finite number of outcomes $f_1, \ldots, f_n \in \mathcal{F}_X$ on any ensemble $\{(p_i, x_i)\}_i$ determines the value of every other outcome $f \in \mathcal{F}_X$ on that ensemble.
It is shown in Lemma 2 of [53] that this implies that $\mathbb{R}[\mathcal{F}_X]$ is finite dimensional. Equivalently the convex set of mixed states is embeddable in a finite dimensional real vector space. We now briefly outline the derivation of the space of mixed states for a system $\mathcal{S}_X = (X, G, \mathcal{F}_X)$ under the assumption "Possibility of state estimation using a finite outcome set". First, the probability of an outcome $f$ (defined on $X$) occurring for an ensemble $\{(p_i, x_i)\}_i$ is $P(f|\{(p_i, x_i)\}_i) = \sum_i p_i f(x_i)$. This allows us to define equivalent ensembles.

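Definition 10 (Equivalent ensembles). Two ensembles $\{(p_i, x_i)\}_i$ and $\{(p'_j, x'_j)\}_j$ are equivalent if $P(f|\{(p_i, x_i)\}_i) = P(f|\{(p'_j, x'_j)\}_j)$ for all $f \in \mathcal{F}_X$.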
The mixed states are defined as equivalence classes of ensembles under this equivalence relation. For each state $x \in X$ we define the linear functional $\Omega_x : \mathbb{R}[\mathcal{F}_X] \to \mathbb{R}$ by $\Omega_x(f) = f(x)$. The probability of outcome $f$ on ensemble $\{(p_i, x_i)\}_i$ can be written as $P(f|\{(p_i, x_i)\}_i) = \sum_i p_i \Omega_{x_i}(f) = \omega(f)$, where we define the functional associated to ensemble $\{(p_i, x_i)\}_i$ as $\omega = \sum_i p_i \Omega_{x_i}$. Therefore, two ensembles $\{(p_i, x_i)\}_i$ and $\{(p'_j, x'_j)\}_j$ are equivalent if and only if their corresponding functionals are identical: $\sum_i p_i \Omega_{x_i} = \sum_j p'_j \Omega_{x'_j}$. The outcome probabilities $P(f|\{(p_i, x_i)\}_i)$ on the space of ensembles uniquely define linear functionals $\Lambda_f$ on the space of mixed states, such that $\Lambda_f \cdot \omega = \omega(f)$ for all mixed states $\omega$. The group action $\varphi : G \times X \to X$ naturally extends to the space of mixed states (embedded in $\mathbb{R}[\mathcal{F}_X]^*$), acting on pure states as $\Omega_x \mapsto \Omega_{gx}$; hence there exists a homomorphism $\Gamma : G \to \mathrm{GL}(\mathbb{R}[\mathcal{F}_X]^*)$. We call this the group representation associated to the system $\mathcal{S}_X$. This naturally induces a representation $\Gamma^* : G \to \mathrm{GL}(\mathbb{R}[\mathcal{F}_X])$, which is isomorphic to $\Gamma$ since the representations are unitary and real.
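As an illustration of ensembles being identified through their functionals $\omega = \sum_i p_i \Omega_{x_i}$, the following sketch takes a qubit with the usual quantum probabilistic structure and represents each $\Omega_x$ by its values on four outcomes. The choice of system, the four outcomes and all function names are our own assumptions, used only to make the equivalence relation concrete.

```python
import math

S = 1 / math.sqrt(2)

# A few pure states of a qubit, labelled by name (illustrative choice).
KETS = {
    "0": [1, 0],
    "1": [0, 1],
    "+": [S, S],
    "-": [S, -S],
    "+i": [S, 1j * S],
}

def overlap_prob(phi, psi):
    """Outcome probability |<phi|psi>|^2 for the projector onto phi."""
    amp = sum(complex(a).conjugate() * b for a, b in zip(phi, psi))
    return abs(amp) ** 2

# Outcomes assumed sufficient to fix every other outcome on a qubit.
FIDUCIAL = ["0", "1", "+", "+i"]

def omega_pure(x):
    """Vector of fiducial outcome probabilities representing Omega_x."""
    return [overlap_prob(KETS[k], KETS[x]) for k in FIDUCIAL]

def omega(ensemble):
    """Functional omega = sum_i p_i Omega_{x_i} of an ensemble [(p_i, x_i), ...]."""
    vec = [0.0] * len(FIDUCIAL)
    for p, x in ensemble:
        for n, c in enumerate(omega_pure(x)):
            vec[n] += p * c
    return vec

def equivalent(e1, e2, tol=1e-12):
    """Two ensembles are equivalent when their functionals coincide."""
    return all(abs(a - b) < tol for a, b in zip(omega(e1), omega(e2)))

ens_z = [(0.5, "0"), (0.5, "1")]
ens_x = [(0.5, "+"), (0.5, "-")]
ens_pure = [(1.0, "0")]
```

The ensembles `ens_z` and `ens_x` define the same mixed state (the maximally mixed one), whereas `ens_pure` does not.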
We can summarise the above in the following theorem (fully proven in Appendix D), which is a straightforward generalisation of Result 1 of [51] to arbitrary dynamical structures:
Theorem 1 (Result 1 of [51]). For every system $\mathcal{S}_X = (X, G, \mathcal{F}_X)$ obeying 'Possibility of state estimation using a finite outcome set' there exists an embedding of $\mathcal{S}_X$ into a finite dimensional real vector space $V \cong \mathbb{R}[\mathcal{F}_X]^*$ and its dual $V^*$ given by the following maps: $\Omega : X \to V$, $\Gamma : G \to \mathrm{GL}(V)$ and $\Lambda : \mathcal{F}_X \to V^*$, satisfying the following properties:
1. Preservation of dynamical structure: $\Gamma_g \Omega_x = \Omega_{gx}$ and $\Gamma_{g_1} \Gamma_{g_2} = \Gamma_{g_1 g_2}$.
2. Preservation of probabilistic structure: $\Lambda_f \cdot \Omega_x = f(x)$.
3. Uniqueness: The embedding of $\mathcal{S}_X$ into $(V, V^*)$ given by the maps $\Omega, \Gamma, \Lambda$ (satisfying all of the above) is unique up to equivalence.
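Two embeddings of $\mathcal{S}_X$ into $(V, V^*)$ given by the maps $\Omega, \Gamma, \Lambda$ and $\Omega', \Gamma', \Lambda'$ are equivalent if there exists an invertible linear map $L : V \to V$ such that $\Omega'_x = L \Omega_x$, $\Gamma'_g = L \Gamma_g L^{-1}$ and $\Lambda'_f = \Lambda_f L^{-1}$. We call the representation $\Gamma$ of Equation (6) the representation of $G$ associated to the system $\mathcal{S}_X$. $\mathrm{conv}(\Omega_X)$ is the convex hull of the extremal states, which we call the state space. A standard representation of the states $\Omega_X$ is given as a vector of fiducial outcome probabilities.
Remark 1. For a system $\mathcal{S}_X = (X, G, \mathcal{F}_X)$ with maps $\Omega, \Gamma, \Lambda$ the vectors $\Omega_X$ admit a standard representation in terms of a fiducial outcome set (which is non-unique). A fiducial outcome set $\{f_1, \ldots, f_n\} \subseteq \mathcal{F}_X$ is such that every other $f \in \mathcal{F}_X$ can be uniquely expressed as $f(x) = \sum_{i=1}^{n} c_i f_i(x)$ for all $x \in X$, with coefficients $c_i \in \mathbb{R}$. In this representation a state $\Omega_x$ is written as the column vector $\Omega_x = (f_1(x), \ldots, f_n(x))^T$, and $\Lambda_f = (c_1, \ldots, c_n)$ is a dual vector. One can immediately verify that $\Lambda_f \cdot \Omega_x = f(x)$.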
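The fiducial representation of Remark 1 can be made concrete for a qubit. In the sketch below (our own example, assuming the quantum probabilistic structure, with hypothetical helper names) the fiducial outcomes are the projectors onto $|0\rangle$, $|1\rangle$, $|+\rangle$ and $|{+i}\rangle$, and the outcome given by the projector onto $|-\rangle$ has coefficients $(1, 1, -1, 0)$, since $|-\rangle\langle -| = |0\rangle\langle 0| + |1\rangle\langle 1| - |+\rangle\langle +|$. The code checks that $\Lambda_f \cdot \Omega_x = f(x)$ on a sample of pure states.

```python
import cmath
import math

def ray(theta, phi):
    """Unit vector representing a qubit pure state (Bloch angles theta, phi)."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def prob(phi, psi):
    """Outcome probability |<phi|psi>|^2."""
    amp = sum(complex(a).conjugate() * b for a, b in zip(phi, psi))
    return abs(amp) ** 2

S = 1 / math.sqrt(2)
FIDUCIAL_KETS = [[1, 0], [0, 1], [S, S], [S, 1j * S]]   # |0>, |1>, |+>, |+i>
MINUS = [S, -S]                                         # |->

def Omega(psi):
    """State vector: the fiducial outcome probabilities (f_1(x), ..., f_n(x))."""
    return [prob(k, psi) for k in FIDUCIAL_KETS]

# Dual vector of the outcome "projector onto |->" in the fiducial basis.
LAMBDA_MINUS = [1.0, 1.0, -1.0, 0.0]

def pair(lam, om):
    """Pairing Lambda_f . Omega_x between a dual vector and a state vector."""
    return sum(c * w for c, w in zip(lam, om))

samples = [ray(0.3 * k, 0.7 * k) for k in range(10)]
max_gap = max(abs(pair(LAMBDA_MINUS, Omega(psi)) - prob(MINUS, psi))
              for psi in samples)
```

A different fiducial set would give different vectors $\Omega_x$ and $\Lambda_f$, related to these by an invertible linear map, in line with the notion of equivalent embeddings.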
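In general the convex hull of a set of points $P$ will not have that set of points as extremal points, since generically some points in $P$ might lie in the convex hull of other points of $P$. As such it is not immediate that $\mathrm{conv}(\Omega_X)$ has extremal points $\Omega_X$. The following lemma tells us that this is the case. Let us denote by $\partial_e(C)$ the extremal points of some convex set $C$.
Lemma 1. For a transitive system $\mathcal{S}_X$, the extremal points of the state space are exactly the pure states: $\partial_e(\mathrm{conv}(\Omega_X)) = \Omega_X$.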
This lemma is proven in Appendix D.2, and makes use of the transitivity of the action of G on the pure states X. It follows from the fact that Ω X is a subset of a hypersphere in the affine span of Ω X centred on the maximally mixed state. See also [63,Proposition 2.2] for an equivalent statement of the lemma and proof.
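The geometric fact used here, namely that a group orbit under an orthogonal representation lies on a sphere centred on the orbit average, can be checked numerically. The sketch below uses an example of our own choosing ($S_3$ permuting the coordinates of a vector in $\mathbb{R}^3$) and is not the construction of Appendix D.2; the orbit average plays the role of the maximally mixed state.

```python
import math
from itertools import permutations

# Reference vector with distinct entries, so the orbit has six points.
v = (0.7, 0.2, 0.1)

# Orbit of v under all coordinate permutations (an orthogonal action of S_3).
orbit = sorted({tuple(v[i] for i in g) for g in permutations(range(3))})
n = len(orbit)

# Uniform average over the orbit: the analogue of the maximally mixed state.
centre = tuple(sum(p[k] for p in orbit) / n for k in range(3))

# Distance of every orbit point from the centre.
radii = [math.dist(p, centre) for p in orbit]
spread = max(radii) - min(radii)
```

Since all points of the orbit are equidistant from `centre`, none of them can be written as a proper convex combination of the others.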
Definition 11 (Tomographically equivalent probabilistic structures). Two probabilistic structures $\mathcal{F}$ and $\mathcal{F}'$ are tomographically equivalent if they yield the same equivalence classes of ensembles (i.e. mixed states).
We note that two systems $\mathcal{S}_X = (X, G, \mathcal{F}_X)$ and $\mathcal{S}'_X = (X, G, \mathcal{F}'_X)$ with embeddings $\Omega, \Gamma, \Lambda$ and $\Omega', \Gamma', \Lambda'$ are tomographically equivalent if and only if $\Omega_X$ and $\Omega'_X$ are affinely isomorphic (i.e. equivalent as convex sets). For a given system the asymptotic limit consists of the scenario where all preparation procedures are of $n$ copies of the same state and $n$ tends to infinity. In this case all states (including mixed) become perfectly distinguishable (though this does not lift the degeneracy of equivalent ensembles). We denote by $\bar{\mathcal{F}}_X$ the equivalence class of all tomographically equivalent probabilistic structures; hence $(X, \bar{\mathcal{F}}_X)$ can be identified with the state space (convex set) $\mathrm{conv}(\Omega_X)$, which is the same for all systems $(X, \mathcal{F}_X)$ with $\mathcal{F}_X \in \bar{\mathcal{F}}_X$. A representative element is the probabilistic structure corresponding to the (effect) unrestricted system.
Remark 2 (On the link between tomographically equivalent probabilistic structures and restriction of effects). The notion of tomographically equivalent probabilistic structures can be cast in terms of restriction of effects. A state space is effect unrestricted when all linear functionals on $\mathbb{R}[\mathcal{F}_{G/H}]^*$ which map every state into $[0, 1]$ correspond to allowed measurement outcomes. A system is restricted when some of the mathematically allowed functionals do not represent any measurement outcomes of the theory. However when a system has restricted effects, it is always the case that the allowed effects span the dual space $V^*$ of the state space embedded in $V$. In other words both the restricted and unrestricted systems have the same mixed states (the restricted effects are always such that they separate the initial state space). A system with restricted effects has a tomographically equivalent probabilistic structure to the unrestricted system. Two tomographically equivalent probabilistic structures can be obtained by restriction of a common probabilistic structure.

Topological and differentiable features of the probabilistic representation
In the case where the dynamical structure $(X, G)$ has topological/differentiable features, we assume that the maps $\Omega : X \to V$ and $\Gamma : G \to \mathrm{GL}(V)$ are continuous/smooth, implying that the action $\Omega_x \mapsto \Omega_{gx}$ is also continuous/smooth. In the cases where $\mathcal{F}_X$ separates $X$ and $G$ (implying that $\Omega$ and $\Gamma$ are injections) the inverse maps $\Omega^{-1}$ and $\Gamma^{-1}$ are also assumed to be continuous, implying that $\Omega_X$ and $\Gamma_G$ are homeomorphic/diffeomorphic to $X$ and $G$ respectively.
For topological groups a group representation is a continuous homomorphism Γ : G → GL(V ) (smooth for Lie groups), hence the continuity/smoothness of Γ will allow us to make use of the representation theory of topological/Lie groups in the rest of the work. Since R[F X ] is assumed to be finite dimensional continuity of these maps entails that the functions f are continuous on X.

Remark 3.
The assumption that Ω : X → V and Γ : G → GL(V ) are continuous/smooth is justified by the following. First note that continuity of these maps implies that Ω X and Γ G are isomorphic to X and G not just as set/groups, but as topological spaces/topological groups (differentiable manifolds/Lie groups). Consider the case where the dynamical structure has topological/differentiable structure but the maps Ω and Γ are not continuous/smooth, i.e. Ω X and Γ G are not homeomorphic/diffeomorphic to X and G respectively. Then given access only to the operational system, with state space conv(Ω X ) and transformation space conv(Γ G ) and asked to reconstruct the dynamical structure, we would not assign it a dynamical structure with X a topological space acted on continuously by the group G (or differentiable manifold acted on smoothly by a Lie group). Rather we would assign it the set of pure states X without any topological structure. To summarise: the operational perspective begins from some experimental data, then constructs the convex state, transformation and effect spaces and only then can one infer the pure states and reversible transformations of those systems (i.e. dynamical structure). From this perspective the dynamical structure has all the structural properties of the experimentally determined Ω X and Γ G , implying that the maps Ω and Γ must preserve these structures.

Non-compactness and "Possibility of state estimation using a finite outcome set"
Although we have restricted our attention to probabilistic systems with X and G compact, there are well defined probabilistic systems which are non-compact such as infinite dimensional quantum systems. These systems violate "Possibility of state estimation using a finite outcome set" and a natural question to ask is whether "Possibility of state estimation using a finite outcome set" rules out non-compact dynamical structures in general. In the following we no longer assume X and G compact, and briefly explore this question.

Lemma 2.
Under the assumption of "Possibility of state estimation using a finite outcome set" the sets conv (Ω X ) and conv (Γ G ) are compact.
Observe however that compactness of $\mathrm{conv}(\Omega_X)$ and $\mathrm{conv}(\Gamma_G)$ does not imply compactness of $\Omega_X$ and $\Gamma_G$. There exist compact convex subsets of $\mathbb{R}^n$ whose sets of extremal points are not closed (for an example of such a set see [64, proof of Lemma 0.22]). As such one cannot use compactness of a convex set to infer compactness of its set of extremal points. Although there exist compact convex sets in $\mathbb{R}^n$ with a non-compact set of extremal points, it is not known to the authors whether any such sets exist for which the extremal points are transitive under some group action. As such it may be the case that for transitive dynamical structures "Possibility of state estimation using a finite outcome set" imposes that $X$ and $G$ are compact.
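For concreteness, one standard example of a compact convex set whose extremal points do not form a closed set is sketched below. It is a textbook-style construction that we supply as an illustration; we do not claim it is the example given in [64]. The set is the convex hull of a circle in the plane $z = 0$ passing through the origin, together with two points above and below the origin.

```latex
% Illustrative example (our assumption; not necessarily the set used in [64]).
\[
  K = \operatorname{conv}\bigl( C \cup \{ (0,0,1),\, (0,0,-1) \} \bigr) \subset \mathbb{R}^3,
  \qquad
  C = \{ (x,y,0) : (x-1)^2 + y^2 = 1 \}.
\]
% The origin lies on C but is the midpoint of (0,0,1) and (0,0,-1),
% so it is not extremal, although it is a limit of extremal points of C.
\[
  \partial_e(K) = \bigl( C \setminus \{ (0,0,0) \} \bigr) \cup \{ (0,0,1),\, (0,0,-1) \}.
\]
```

The set $K$ is compact, yet $\partial_e(K)$ omits the limit point $(0,0,0)$ and is therefore not closed. Note that no group acts transitively on this particular $\partial_e(K)$ in an obvious way, consistent with the open question stated above.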
In the case of transitive non-compact dynamical structures with a non-compact group one can make use of group representation theory to rule out the existence of probabilistic structures compatible with the assumption of "Possibility of state estimation using a finite outcome set". For many non-compact groups $G$ (such as non-compact simple Lie groups) it is known that there are no non-trivial finite dimensional continuous unitary representations, which rules out probabilistic structures which satisfy "Possibility of state estimation using a finite outcome set" for dynamical structures with those groups. An open question is whether there are any transitive dynamical structures, where $X$ and $G$ are non-compact, which are consistent with "Possibility of state estimation using a finite outcome set".

Classification theorem
Before stating the main theorem of this section we will need to define the notion of a Gelfand pair.
Definition 12 (Gelfand pair). A pair $(G, H)$ with $G$ a group and $H$ a subgroup of $G$ forms a Gelfand pair when for all irreducible representations $(\Gamma, V, \mathbb{C})$ of $G$, the restriction $\Gamma|_H$ has at most one trivial sub-representation. Here $(\Gamma, V, \mathbb{F})$ refers to a representation $\Gamma : G \to \mathrm{GL}(V)$ over a field $\mathbb{F}$.
In other words, for a Gelfand pair (G, H), every irreducible representation (Γ, V, C) of G is such that all the vectors v ∈ V which are invariant under H span a subspace of dimension at most 1.
This definition applies to complex irreducible representations. For irreducible representations over the field $\mathbb{R}$ the restriction $\Gamma|_H$ may contain two trivial sub-representations; however, all $H$-invariant vectors are related by invertible transformations which commute with the group action (this does not contradict Schur's Lemma, which applies to irreducible representations over the complex field). More details and proofs can be found in Appendix C.
A representation (Γ, V, C) of a group G which has a non-zero H-invariant vector (i.e. for which Γ |H contains a trivial sub-representation) is called a spherical representation of (G, H).
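For finite groups the Gelfand pair condition can be tested directly with characters: the dimension of the $H$-invariant subspace of a representation with character $\chi$ is $\frac{1}{|H|}\sum_{h \in H} \chi(h)$. The sketch below applies this to $G = S_3$, whose three irreducible characters are well known; the choice of group and the function names are our own, for illustration only.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))      # S_3 as permutations of {0, 1, 2}

def fixed_points(g):
    """Number of points left fixed by the permutation g."""
    return sum(1 for i, gi in enumerate(g) if i == gi)

def sign(g):
    """Sign of the permutation g, computed from its inversion count."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

# Characters of the three irreducible representations of S_3.
CHARACTERS = {
    "trivial": lambda g: 1,
    "sign": sign,
    "standard": lambda g: fixed_points(g) - 1,
}

def invariant_dims(H):
    """Dimension of the H-invariant subspace of each irrep: (1/|H|) sum_h chi(h)."""
    return {name: Fraction(sum(chi(h) for h in H), len(H))
            for name, chi in CHARACTERS.items()}

def is_gelfand(H):
    """(G, H) is a Gelfand pair when every irrep has at most one H-invariant direction."""
    return all(d <= 1 for d in invariant_dims(H).values())

H_point = [g for g in G if g[0] == 0]     # stabiliser of a point, isomorphic to S_2
H_trivial = [tuple(range(3))]             # trivial subgroup

spherical = [name for name, d in invariant_dims(H_point).items() if d >= 1]
```

With these definitions $(S_3, S_2)$ is a Gelfand pair whose spherical representations are the trivial and the standard one, whereas $(S_3, \{e\})$ is not a Gelfand pair because the standard representation has a two dimensional invariant subspace.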

Theorem 2 (Classification theorem). Let $D = (G, H)$ be a transitive dynamical structure, and let us consider probabilistic structures $\mathcal{F}_{G/H}$ on $D$.
i. Every probabilistic structure $\mathcal{F}_{G/H}$ (up to tomographic equivalence) has an associated representation $\Gamma$ of the form $\Gamma = \bigoplus_j \Gamma_j$ (15), where each term $(\Gamma_j, V_j, \mathbb{R})$ is a real-irreducible representation with at least one trivial subrepresentation when restricted to $H$.
ii. Conversely every representation of the form (15) (where each irreducible representation in the decomposition has at least one trivial subrepresentation when restricted to $H$) is associated to at least one probabilistic structure $\mathcal{F}_{G/H}$.
iii. When $(G, H)$ forms a Gelfand pair the correspondence between representations $(\Gamma, V, \mathbb{R})$ of the form (15) and probabilistic structures $\mathcal{F}_{G/H}$ (up to tomographic equivalence) is one-to-one.
iv. When (G, H) does not form a Gelfand pair then some representations (Γ, V, R) of the form (15) have infinitely-many tomographically inequivalent probabilistic structures F G/H associated to them.
This theorem is proven in Appendix D.4. Parts i. and iii. entail that for a dynamical structure $(G, H)$ which forms a Gelfand pair one can classify all possible probabilistic structures $\mathcal{F}$ (up to equivalence) by finding the irreducible representations $\Gamma$ of $G$ such that $\Gamma|_H$ contains a trivial subrepresentation.
Parts iii. and iv. tell us that for Gelfand pairs all inequivalent probabilistic structures are characterised by different representations of G. Therefore for Gelfand pairs all probabilistic structures are in one-to-one correspondence with representations of the dynamical group, up to restriction of effects. For non-Gelfand pairs there are inequivalent probabilistic structures which are associated to the same representation of G.
The one-to-one correspondence between probabilistic structures and representations for Gelfand pairs is a direct consequence of the existence of an invertible transformation, commuting with the group action, which relates any two $H$-invariant vectors (see Corollary 5 in Appendix C). For real irreducible representations which are also complex irreducible this is just a multiple of the identity (by Schur's lemma); however for real irreducible representations of complex type (i.e. which are complex reducible) the linear space of transformations which commute with the group action is two dimensional. As shown in Lemma 10 there are no real irreducible representations of quaternionic type which have an $H$-invariant vector when $(G, H)$ is a Gelfand pair.
We observe that this theorem does not guarantee that for a given representation $\Gamma$ of the form (15) the associated OPF set $\mathcal{F}$ separates the pure states. For instance the trivial representation $\Gamma : G \to \mathrm{GL}(\mathbb{R})$, $\Gamma(g) = \mathbb{1}_{\mathbb{R}}$ for all $g \in G$, is such that any vector $v \in \mathbb{R}$ is $H$-invariant, and the state space obtained for any choice of non-zero reference vector $v$ is trivial: $\Omega_x = v$ for all $x \in X$.
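The role of the reference vector can be illustrated on a finite example. The sketch below reflects our reading of the construction, in which pure states are embedded as $\Omega_{gH} = \Gamma(g) v$ for an $H$-invariant vector $v$; the group ($S_3$), the representation (permutation matrices on $\mathbb{R}^3$) and the function names are assumptions made for illustration. It checks that the embedding does not depend on the coset representative, and contrasts it with the trivial representation, where all pure states collapse to a single point.

```python
from itertools import permutations

G = list(permutations(range(3)))          # S_3
H = [g for g in G if g[0] == 0]           # stabiliser of the point 0

def compose(g, h):
    """Group product: (g h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))

def gamma(g, vec):
    """Permutation representation on R^3: Gamma(g) e_i = e_{g(i)}."""
    out = [0.0] * 3
    for i, c in enumerate(vec):
        out[g[i]] = c
    return tuple(out)

def gamma_trivial(g, vec):
    """Trivial representation on R: every group element acts as the identity."""
    return vec

# Reference vector, invariant under H.
v = (1.0, 0.0, 0.0)
v_is_invariant = all(gamma(h, v) == v for h in H)

# Omega_{gH} := Gamma(g) v must not depend on the representative g of the coset.
well_defined = all(gamma(compose(g, h), v) == gamma(g, v) for g in G for h in H)

# Embedded pure states: the three vertices of a triangle (a classical trit).
pure_states = sorted({gamma(g, v) for g in G})

# With the trivial representation every pure state is mapped to the same point.
trivial_states = {gamma_trivial(g, (1.0,)) for g in G}
```

In this toy case the permutation representation decomposes into the trivial and the standard representation of $S_3$, both of which contain an $H$-invariant vector, and the resulting state space separates the three pure states, unlike the trivial representation alone.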

Rigidity of dynamical structures
In this section we analyse which probabilistic structures can be continuously deformed. We first study which dynamical structures $(G, H)$ have probabilistic structures which are arbitrarily close. Following this we show that these probabilistic structures can be continuously deformed. In order to do so, we define an operational distance between probabilistic structures in terms of how difficult it is to discriminate them.
Obviously, one can always smoothly deform a probabilistic structure by restricting the set of OPFs; for example, by adding noise to the measurements. However, all these variants have the same set of mixed states, or in other words, the same equivalence classes of ensembles of pure states $\{(p_i, x_i)\}_i$. We call all these probabilistic structures tomographically equivalent because, in estimation processes with multiple measurements, they agree on the set of mixed states. In each tomographically-equivalent class of probabilistic structures there is a privileged element: the unrestricted probabilistic structure. This $\mathcal{F}$ includes all linear maps $\Lambda : V \to \mathbb{R}$ that map pure states to probabilities, $\Lambda : \Omega(X) \to [0, 1]$. In order to avoid considering trivial deformations (i.e. those which leave the space of mixed states unchanged), in this section we only consider unrestricted probabilistic structures.
A probabilistic structure F 0 for which every other probabilistic structure F 1 of the same linear dimension is at a finite bounded distance is called rigid. In other words, once the dimension of the space of mixed states is fixed, there is a finite bound on the minimal error when discriminating between probabilistic structures compatible with that dimension.
Theorem 2 tells us that if a dynamical structure $(G, H)$ is a Gelfand pair then the set of unrestricted probabilistic structures is countable. We prove that each finite-dimensional probabilistic structure of a Gelfand pair $(G, H)$ is rigid. We show that for non-Gelfand pairs there exist probabilistic structures $\mathcal{F}^0$ which are not rigid, and which can be continuously deformed to other probabilistic structures of the same linear dimension.

Distance between inequivalent probabilistic structures
For a given dynamical structure (G, H) (with X ∼ = G/H) there is a natural notion of distance between probabilistic structures F X . The distance between two OPFs f 0 ∈ F 0 X and f 1 ∈ F 1 X is given by: This distance is directly related to the minimal error made when discriminating between f 0 and f 1 . We define the distance between two probabilistic structures F 0 X and F 1 X as: which informs us about the error that we make when certifying that a system behaves according to F 0 X and not F 1 X in the optimal experimental setting f 0 ∈ F 0 X . Note that D is not symmetric and hence it is not a metric distance. We introduce the symmetrised distance: which is a metric distance. The following theorem (proven in Appendix E.1) provides us with a lower bound on the distance between certain pairs of probabilistic structures F 0 and F 1 .

Theorem 3.
Let the dynamical structure (G, H) be a Gelfand pair, and let F 0 and F 1 be two unrestricted probabilistic structures of (G, H). If F 0 has an irreducible representation of dimension d 0 which does not appear in F 1 then

Now we recall that for Gelfand systems, two unrestricted probabilistic structures are equal if and only if they have the same irreps in their decomposition. Hence, the above theorem implies that, for Gelfand systems, each pair of distinct unrestricted probabilistic structures can be discriminated by finite means.

Rigid and non-rigid probabilistic structures
Theorem 3 tells us that, if we fix a dynamical structure consisting of a Gelfand pair (G, H), then the hypothesis that "the observed data is generated by a particular probabilistic structure F 0 of (G,H)", in opposition to "the observed data is not generated by F 0 ", can be tested with finite means. We now look at the property of rigidity of probabilistic structures, i.e. which probabilistic structures are such that every other probabilistic structure of the same linear dimension is at a finitely bounded distance.
ii. If there is a pair of H-invariant vectors in R[F 0 ] that are not related by any invertible transformation which commutes with Γ G then F 0 is non-rigid and for any ε > 0 (ε ≪ 1) there is an inequivalent probabilistic structure

This theorem is proven in Appendix E.2. For Gelfand pairs all irreducible spherical representations Γ G are such that all pairs of H-invariant vectors are related by invertible transformations which commute with Γ G , hence all probabilistic structures for Gelfand pairs are rigid. For non-Gelfand pairs there exist probabilistic structures F 0 which have associated representations such that for all pairs of H-invariant vectors there is no invertible transformation relating them which commutes with Γ G . Hence we have the following corollary:

Corollary 1. If (G, H) is a Gelfand pair, then every unrestricted probabilistic structure F G/H is rigid.

In Lemma 4 we show that non-rigid probabilistic structures can be continuously deformed to other probabilistic structures of the same linear dimension. Deformation of probabilistic structure is defined in Section 4.4, where deformation maps between different probabilistic structures are explicitly characterised.
Before studying the general case we provide an example of a non-rigid probabilistic structure and how to continuously deform it.

Continuous deformation of probabilistic structures: an example
In this section we analyse a dynamical system (G, H) that is not a Gelfand pair. Hence, some of its probabilistic structures can be continuously deformed, giving rise to varying statistical properties. This is an interesting feature of GPTs that has not been explored in the literature.
Definition 13 (The deformable state space). Consider the set of pure states Ω X = {U Ω x0 U † : U ∈ SU(3)} generated by the adjoint action of G = SU(3) on the reference state

$$\Omega_{x_0} = \mathrm{diag}(\alpha_1, \alpha_2, \alpha_3),$$

where the three real coefficients α i are different (α i ≠ α j for all i ≠ j) and add up to one, Σ i α i = 1. This state space has stabiliser subgroup

$$H = S\big(U(1) \times U(1) \times U(1)\big). \qquad (23)$$

We observe that the embedding of the pure states is given by Ω : X → V, where V is the real space of 3 × 3 Hermitian matrices. In this representation outcome probabilities are given by effects (i.e. linear functionals of the states Ω x ), hence they are given by the trace inner product:

$$x \mapsto \mathrm{tr}(A\, \Omega_x), \qquad A \in V.$$

The case with two equal coefficients is equivalent to the familiar three-level quantum system. This illustrates how we cannot deform quantum theory without changing its stabiliser subgroup and hence changing its dynamical structure (G, H). By contrast, the above-defined family of state spaces can be deformed without changing either the stabiliser or the dynamical structure.
The pair (SU(3), S(U(2) × U(1))) is a Gelfand pair [51], whereas (SU(3), S(U(1) × U(1) × U(1))) is not. To show this we need to find a single irreducible representation Γ of G such that the restriction Γ |H to the subgroup H has more than one trivial subrepresentation. Take the adjoint action of G (see Definition 13) acting on the full complex space of complex matrices; this representation decomposes into a trivial representation of G (acting on the subspace spanned by the identity matrix) and the adjoint representation acting on the complementary subspace. This subspace is spanned by the trace 0 complex matrices and carries an irreducible representation of G ≅ SU(3) (the adjoint representation). We observe that all diagonal matrices are invariant under the adjoint action of the subgroup H defined in Equation (23), and as such the space of H-invariant vectors is spanned by the trace 0 diagonal matrices. This is a 2 dimensional subspace of the full space of trace 0 matrices carrying the irreducible representation of G, implying that (SU(3), S(U(1) × U(1) × U(1))) is not a Gelfand pair.
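The invariance claims in this argument can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`conjugate_by`, `is_invariant`), the angles and the particular group elements are illustrative assumptions, not taken from the text. It checks that two linearly independent trace 0 diagonal matrices are fixed by a generic element of S(U(1) × U(1) × U(1)), whereas only one of them is fixed by a chosen element of S(U(2) × U(1)).

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def conjugate_by(u, m):
    """Adjoint action m -> u m u^dagger."""
    return matmul(matmul(u, m), dagger(u))

def is_invariant(u, m, tol=1e-12):
    """True if m is fixed by the adjoint action of u."""
    image = conjugate_by(u, m)
    n = len(m)
    return all(abs(image[i][j] - m[i][j]) < tol
               for i in range(n) for j in range(n))

def diag(entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

# Generic element of S(U(1) x U(1) x U(1)); the angles are arbitrary choices.
a, b = 0.3, 1.1
torus_element = diag([cmath.exp(1j * a), cmath.exp(1j * b),
                      cmath.exp(-1j * (a + b))])

# An element of S(U(2) x U(1)): it swaps the first two basis vectors,
# and the entry -1 restores determinant 1.
block_element = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]

# Two linearly independent trace 0 diagonal matrices.
d1 = diag([1, -1, 0])
d2 = diag([1, 1, -2])
# An off-diagonal trace 0 matrix, for comparison.
e12 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]

torus_fixed = [is_invariant(torus_element, m) for m in (d1, d2, e12)]
block_fixed = [is_invariant(block_element, m) for m in (d1, d2)]
```

Here `torus_fixed` records that both diagonal matrices are fixed while the off-diagonal one is not, and `block_fixed` records that the larger stabiliser removes one of the two invariant directions. This is a spot check on single group elements, so it is consistent with the multiplicity count in the text rather than a proof of it.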
We know that the three-level quantum system has three perfectly distinguishable states. The following theorem tells us that this is not the case for the above-defined state spaces.
Theorem 5. All state spaces introduced in Definition 13 have two perfectly distinguishable states and no more.
Proof. In the following proof we write x instead of Ω x (similarly X instead of Ω X ).
Let us start by assuming the existence of three perfectly distinguishable states x 1 , x 2 , x 3 ∈ V . This implies the existence of a three-outcome measurement A 1 , A 2 , A 3 ∈ V such that tr(A i x j ) = δ ij . Without loss of generality we can take the three states to be pure, x i ∈ X ⊆ V . In the following analysis we use a basis in which A 1 is diagonal,

$$A_1 = \mathrm{diag}(\gamma_1, \gamma_2, \gamma_3).$$

The probability of A 1 with any state x only depends on the diagonal of the state (in this basis).
Therefore, in what follows, we characterise the projection of conv(X) ⊆ V into the diagonal. A general state U x 0 U † has diagonal projection

$$\big(U x_0 U^\dagger\big)_{ii} = \sum_{j=1}^{3} |U_{ij}|^2\, \alpha_j, \qquad i = 1, 2, 3.$$

The unitarity of U implies that $|U_{ij}|^2$ is a doubly stochastic matrix, and Birkhoff's theorem tells us that $|U_{ij}|^2$ is a mixture of the six permutation matrices of three elements. Conversely, the six permutation matrices can be written as $|U_{ij}|^2$. This, together with the convexity of the state space, implies that the projection of conv(X) into the diagonal is the convex set generated by the six extreme points

$$y_\sigma = \big(\alpha_{\sigma(1)}, \alpha_{\sigma(2)}, \alpha_{\sigma(3)}\big), \qquad \sigma \in S_3,$$

where S 3 is the group of permutations of 3 elements. These six points (also denoted y 1 , y 2 , . . . , y 6 ) are depicted in Figure 1.
Let us show that each of the three pairs of opposite lines in Figure 1 are indeed parallel. For example: Condition tr(A 1 x j ) = δ 1j implies that the scalar product (γ 1 , γ 2 , γ 3 ) · y σ takes the value zero for two permutations σ and the value 1 on at least one permutation. However, as shown in Figure 1, the only outcome that tells apart states y 1 from y 4 , y 5 is A 1 , which gives probability one for states y 1 , y 2 and zero for y 4 , y 5 . It is worth mentioning that the vectors y 1 , y 2 , y 4 , y 5 correspond to four pure states with zero off-diagonal components. Hence, the outcome probabilities of these pure states can be calculated by only looking at the diagonal projection (i.e. the figure).
The figure also shows that, no matter how we choose the direction A 2 , the states y 4 , y 5 (or y 1 , y 2 ) cannot be perfectly distinguished. This proves the non-existence of three perfectly distinguishable states in this family of state spaces. Of course, this argument breaks down for the three-level quantum system, when the projection becomes a triangle instead of a hexagon.
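The diagonal projection used in this proof can be illustrated numerically. The sketch below is written in plain Python; the function names, the random seed and the values chosen for α are illustrative assumptions. It builds a unitary by Gram-Schmidt orthonormalisation, checks that the matrix of squared moduli is doubly stochastic, and checks that the diagonal of U x 0 U † is the corresponding mixture of the coefficients α j .

```python
import math
import random

def random_unitary(n, rng):
    """Unitary matrix whose rows come from Gram-Schmidt on random complex vectors."""
    rows = []
    while len(rows) < n:
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for r in rows:
            # Remove the component of v along the already accepted row r.
            overlap = sum(rc.conjugate() * vc for rc, vc in zip(r, v))
            v = [vc - overlap * rc for vc, rc in zip(v, r)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in v))
        rows.append([c / norm for c in v])
    return rows

def squared_moduli(u):
    """Matrix with entries |U_ij|^2."""
    return [[abs(entry) ** 2 for entry in row] for row in u]

def diagonal_of_conjugated_state(u, alpha):
    """Diagonal entries of U diag(alpha) U^dagger, taken from the matrix product."""
    n = len(u)
    return [sum(u[i][j] * alpha[j] * u[i][j].conjugate() for j in range(n)).real
            for i in range(n)]

rng = random.Random(0)            # fixed seed, for reproducibility only
alpha = [0.6, 0.3, 0.1]           # three distinct coefficients summing to one
u = random_unitary(3, rng)
d = squared_moduli(u)

row_sums = [sum(row) for row in d]
col_sums = [sum(d[i][j] for i in range(3)) for j in range(3)]
diagonal = diagonal_of_conjugated_state(u, alpha)
mixture = [sum(d[i][j] * alpha[j] for j in range(3)) for i in range(3)]
```

Because each diagonal entry is a convex combination of the α j , it lies between the smallest and the largest coefficient, which is the numerical counterpart of the projection falling inside the hexagon spanned by the six permuted points.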
The state spaces under consideration (Definition 13) have a remarkable property that is not present in quantum theory. This property is sometimes called "violation of no-simultaneous encoding" [65] and it is very similar to "information causality" [66]. This property allows one to perfectly encode one bit of information (e.g. y 1 , y 2 versus y 4 , y 5 ) and simultaneously imperfectly encode another bit (y 1 , y 5 versus y 2 , y 4 ). Although only one of the two bits can be retrieved, there is a sense in which this system encodes more than one bit of information despite having only two perfectly distinguishable states. Different choices of (α 1 , α 2 , α 3 ) will give different success probabilities when optimally guessing the second bit. This is a statistical feature that distinguishes inequivalent values of (α 1 , α 2 , α 3 ) within the family of state spaces of Definition 13.

Continuous deformation of probabilistic structures: the general case
In the following we only consider systems up to tomographic equivalence, where two systems are tomographically equivalent if and only if they have the same equivalence classes of ensembles (i.e. the same mixed states). For a given set of pure states X each equivalence class of probabilistic structures F X induces a map Ω : X → V (where V ≅ R n ) by Theorem 1. We will define a deformation of a probabilistic structure for X as a map between different images Ω X of Ω-maps for X (since there is a one-to-one correspondence between equivalence classes of probabilistic structures F X and images Ω X of Ω-maps) satisfying certain conditions which we will formalise with the aid of some further definitions. We only consider equivalence classes of probabilistic structures which separate X.
For a given dynamical structure (X, G) let us call F the space of all images Ω X (equivalently the space of equivalence classes of probabilistic structures F X ). For a given d ∈ R + 0 let us call F d the space of all Ω X whose linear span is isomorphic to R d (equivalently the space of equivalence classes of probabilistic structures F X such that R[F X ] ≅ R d ). We observe that F d might be the empty set (if there are no representations of G of the form of Equation (15) acting on R d ). For a Gelfand dynamical structure all F d are countable sets, whilst for non-Gelfand pairs there exist F d which are uncountable.
The symmetrised distance defined in Equation (18) turns F d into a metric space. This metric induces a topology on F d .
We are interested in continuously deforming probabilistic structures, namely how to continuously vary one set of mixed states into another (i.e. one probabilistic structure into another) whilst keeping the linear dimension fixed. First we define a general transformation between sets of embedded pure states Ω X . Definition 14 (Trivial Ω-transformation map). For two systems S 0 X = X,F 0 X and S 1 X = X,F 1 X with associated maps Ω 0 : X → V 0 and Ω 1 : These trivial Ω-transformation maps allow us to map between any two probabilistic structures in F. A deformation of probabilistic structure is a specific map between probabilistic structures such that the linear dimension of the space of mixed states is unchanged.
Definition 15 (Deformation map). For two systems S 0 X = X,F 0 X and S 1 X = X,F 1 X with associated maps Ω 0 : X → V 0 and Ω 1 :
Proof. Since F 0 X and F 1 X separate X, Ω 0 X and Ω 1 X are homeomorphic. We argue by contradiction and assume that M 0→1 is linear when extended to span R (Ω 0 X ). Let us call M 0→1 : span R (Ω 0 X ) → span R (Ω 1 X ) this linear extension of M 0→1 . We observe that M 0→1 has a trivial kernel since no element of Ω 0 X is in the kernel, hence no element of span R (Ω 0 X ) is either. Moreover the image of M 0→1 is span R (Ω 1 X ): any element in span R (Ω 1 X ) is a linear combination Σ i α i Ω 1 xi of elements in Ω 1 X , and hence by linearity of M 0→1 , Σ i α i Ω 1 xi is the image under M 0→1 of the linear combination Σ i α i Ω 0 xi in the domain span R (Ω 0 X ). Hence M 0→1 is an invertible linear transformation which takes conv(Ω 0 X ) to conv(Ω 1 X ), and conv(Ω 0 X ) ≅ conv(Ω 1 X ). This contradicts the assumption that Ω 0 and Ω 1 are induced by tomographically inequivalent probabilistic structures.
A non-trivial deformation map M 0→1 cannot be extended to the spaces of mixed states conv Ω 0 X and conv Ω 1 X , which are not isomorphic as convex sets. For such systems one can use the map M 0→1 to deform the probabilistic structure F 0 X to F 1 X , changing the space of mixed states from conv Ω 0 X to conv Ω 1 X whilst preserving the linear dimension. We observe that if we consider the general trivial Ω-transformation maps (i.e. the linear dimension of the two state spaces is no longer required to be the same) then some of these maps do linearly extend to span R (Ω 0 X ) when Ω 0 X = Ω 1 X . An example of such a map is the projection from a system S 0 X with an associated representation Γ 0 which is reducible to a system S 1 X with an associated representation Γ 1 which is a sub-representation of Γ 0 , i.e. Γ 0 ≅ Γ 1 ⊕ Γ 2 for some representation Γ 2 of G.
We observe that the connected path γ(t) defines a family of deformation maps M 0→t : Ω 0 X → Ω t X for every t ∈ [0, 1].
In other words Ω 0 X can be continuously deformed to Ω 1 X if there exists a connected path between the two passing through a continuum of different Ω t X (each spanning a linear space of the same dimension). If there is no such path between the two then Ω 0 X cannot be continuously deformed to Ω 1 X .
In the following theorem we show the relation between the rigidity of a probabilistic structure and the possibility of continuously deforming a probabilistic structure.

Lemma 4. Any non-rigid probabilistic structure F 0 G/H can be continuously deformed into another (non-rigid) probabilistic structure F 1 . Any rigid probabilistic structure F 0 G/H cannot be continuously deformed to another probabilistic structure.
This lemma is proven in Appendix E.3. Together with Corollary 1 this lemma implies the following corollary.

Corollary 2. If (G, H) is a Gelfand pair, then no probabilistic structure can be continuously deformed. If (G, H) is a non-Gelfand pair then it has probabilistic structures which can be continuously deformed.
We now provide an explicit construction of a continuous deformation of a probabilistic structure whose associated representation has inequivalent H-invariant vectors; for an arbitrary non-Gelfand pair (G, H).

Example 1 (Continuous deformation). Consider a non-Gelfand pair (G, H), and an irreducible
Hence there is a map m(g) = Γ g RΓ g −1 between the two points v 0 gH ∈ Ω 0 X and v 1 gH ∈ Ω 1 X , for all points gH ∈ X ∼ = G/H. The deformation map is then: Since this map is g-dependent (i.e. its action depends on each extremal point v 0 gH ) it is not linear and cannot be extended to the mixed states.
Let us define Ω t X to be the state space generated by the Then the deformation map from Ω 0 X to Ω t X is given by: The family of deformation maps M 0→t (for t ∈ [0, 1]) defines a connected path γ(t) = M 0→t Ω 0 X = Ω t X in F d (see the proof of Lemma 4 in Appendix E.3). Hence the probabilistic structure associated to Ω 0 X can be continuously deformed to the probabilistic structure associated to Ω 1 X .

Remark 4.
The possibility of continuously deforming a probabilistic structure without altering its dynamical structure (and without restricting effects) is a very peculiar feature that is not found in any of the known GPTs (such as boxworld and quantum theory over the field of reals, complex or quaternions) to the best of our knowledge. Moreover we posit that this is a typical feature of GPT systems, in that, most dynamical structures (G, H) are not Gelfand pairs. If a probabilistic structure can be continuously deformed then the probabilities can be fine-tuned to suitably describe the observed statistics, and hence, make the theory more difficult to falsify. Hence, the fact that the probabilistic structure of a theory cannot be smoothly deformed makes the falsifiability of the theory more straightforward. We believe that this is a desirable property of a theory. If we consider a dynamical structure (G, H) being a Gelfand pair then we can be sure that any of its probabilistic structures will be straightforwardly falsifiable. Finally, it is important to mention that a dynamical structure (G, H) cannot be continuously deformed due to the group and sub-group structures of G and H. That is, adding a single element to G or H will generate lots of new elements via products and inverses. And hence, the probabilistic structure is the only part of a theory that, a priori, could be continuously deformed.

Gelfand pairs and two point homogeneity
In Theorem 2 we have singled out dynamical structures corresponding to Gelfand pairs as being of interest, namely for the convenient property that their probabilistic structures can be classified via the associated group representations. This implies that there are countably many of them, and that they cannot be continuously deformed. This rigidity is a highly desirable property for a fundamental theory of physics, because it does not allow for ad hoc parameter adjustment, and is thereby easier to falsify. Apart from this, one may also ask whether there are other informational/physical motivations for considering Gelfand pairs. One such reason may be the following.
Definition 17 (Two-point homogeneous action [67]). A group G acts two-point homogeneously on a metric space (X, dist) if for every two pairs of points $(x_1, x_2)$ and $(x_1', x_2')$ in X with $\mathrm{dist}(x_1, x_2) = \mathrm{dist}(x_1', x_2')$ there is an element g ∈ G such that $g x_1 = x_1'$ and $g x_2 = x_2'$.

Two-point homogeneity implies transitivity, since for any points $x_1$ and $x_2$ we have $\mathrm{dist}(x_1, x_1) = \mathrm{dist}(x_2, x_2)$ and hence there exists an element g such that $g x_1 = x_2$. The following is a very remarkable result.
Lemma 5 (Prop 2.2 [68]). If G acts two-point homogeneously on a metric space X and H is the stabilizer of a point, then (G, H) is a Gelfand pair.
The requirement of two-point homogeneity restricts us to dynamical structures corresponding to Gelfand pairs. We observe that this condition needs an additional metric structure to be imposed on the dynamical structure. A natural metric on GPT state spaces is the following.
is bounded, dist(x, x′) ≤ 1, it satisfies the metric axioms: and it is G-invariant

Therefore we conclude that two-point homogeneous state spaces correspond to Gelfand pairs. It is remarkable that the purely dynamical property of two-point homogeneity implies that all probabilistic structures are rigid.
However we note that not all Gelfand pairs (G, H) give rise to a homogeneous space X ∼ = G/H which is two-point homogeneous. Indeed the classification of all the compact and connected two point homogeneous symmetric spaces was given in [67]. These are listed in Table 1.
The full classification of all finite dimensional probabilistic structures for the compact connected two point homogeneous spaces G/H (where all pairs (G, H) corresponding to such spaces are given in Table 1) directly follows from the classification of all irreducible spherical representations. Equivalently these are the irreducible subspaces of the function space C(G/H, C) (continuous functions from G/H to C) under the action of G, where a specific basis for an irreducible subspace is given by spherical harmonics. This is a generalisation of the well known spherical harmonics for L 2 (S 2 ), where the irreducible representation labelled by l has a basis Y lm (θ, φ) spanning a 2l + 1 dimensional subspace.
The (G, H) spherical irreducible representations for these pairs are characterised by a condition on the highest weights given by the Cartan-Helgason Theorem [69,70] (see [71, Theorem 11.4.10]). Explicit characterisations of these (G, H) spherical irreducible representations (either in terms of the highest weights or other methods) for the pairs in Table 1 can be found in the literature.

Grassmannian systems
In this section we introduce a family of non-classical systems which generalise the dynamical structure of quantum systems, and make use of Theorem 2 to provide a full classification of these systems.
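The pure states of finite dimensional quantum systems are given by PC d . This is the set of all one dimensional subspaces of C d . We now consider systems with pure states given by the set of all k-dimensional subspaces W ⊆ C d . This set is known as a Grassmann manifold Gr(k, C d ):

$$\mathrm{Gr}(k, \mathbb{C}^d) := \{ W \subseteq \mathbb{C}^d : \dim W = k \}.$$

Hence PC d ≅ Gr(1, C d ). Since SU(d) acts transitively on Gr(k, C d ) it can also be expressed as follows (re-parametrising k = m and d = m + n):

$$\mathrm{Gr}(m, \mathbb{C}^{m+n}) \cong SU(m+n) / S\big(U(m) \times U(n)\big).$$

Here the embedding of S(U(m) × U(n)) into SU(m + n) is the direct sum embedding:

$$(A, B) \mapsto A \oplus B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}, \qquad A \in U(m),\ B \in U(n),\ \det A \, \det B = 1.$$

Similarly one can define Grassmann manifolds over R and H, generalising the dynamical structures of quantum theory over R and H. These are:

$$\mathrm{Gr}(m, \mathbb{R}^{m+n}) \cong SO(m+n) / S\big(O(m) \times O(n)\big), \qquad \mathrm{Gr}(m, \mathbb{H}^{m+n}) \cong Sp(m+n) / \big(Sp(m) \times Sp(n)\big).$$

In the next section we will make use of Theorem 2 to classify all possible probabilistic structures for each dynamical structure which is a complex Grassmann manifold.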
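As a quick consistency check on the quotient description of the complex Grassmann manifold, one can count real dimensions. The sketch below is in plain Python with illustrative function names; it uses the standard facts that SU(N) has real dimension N² − 1 and that S(U(m) × U(n)) has real dimension m² + n² − 1, and it recovers 2mn for the quotient, which reduces to the familiar 2(d − 1) for the projective space PC d when m = 1.

```python
def dim_su(n):
    """Real dimension of SU(n)."""
    return n * n - 1

def dim_s_u_times_u(m, n):
    """Real dimension of S(U(m) x U(n)).

    dim U(m) + dim U(n) = m^2 + n^2, minus 1 for the determinant condition.
    """
    return m * m + n * n - 1

def dim_complex_grassmannian(m, n):
    """Real dimension of the quotient SU(m+n) / S(U(m) x U(n))."""
    return dim_su(m + n) - dim_s_u_times_u(m, n)

# Quantum systems: m = 1, n = d - 1, for d = 2, 3, 4, 5.
quantum_dims = [dim_complex_grassmannian(1, d - 1) for d in range(2, 6)]

# Smallest Grassmann system that is not a projective space: Gr(2, C^4).
gr_2_4 = dim_complex_grassmannian(2, 2)
```

For d = 2 this gives a two dimensional manifold of pure states, matching the Bloch sphere of the qubit.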

Full classification of all probabilistic structures for complex Grassmann manifolds
Theorem 2 states that for a dynamical structure (G, H) corresponding to a Gelfand pair, every probabilistic structure is in one-to-one correspondence with a spherical representation of (G, H).
Hence the first step in classifying probabilistic structures for the Grassmann dynamical structure Gr(m, C m+n ) ∼ = SU(m+n)/S(U(m)×U(n)) is to determine whether (SU(m + n), S(U(m) × U(n))) form a Gelfand pair.
The first part of the lemma is found in [72, Corollary 3] and the second part is proven in Appendix F. This lemma entails (using Theorem 2) that all probabilistic structures F X where X ≅ SU(m + n)/S(U(m) × U(n)) are in one-to-one correspondence with the spherical representations of (SU(m + n), S(U(m) × U(n))). Irreducible spherical representations are typically defined over C, and in general the irreducible representations of a group G over C are not in one-to-one correspondence with those over R. Part 2 of the lemma allows us to classify the real irreducible spherical representations of (SU(m + n), S(U(m) × U(n))) by studying the irreducible spherical representations over C.
The restriction of representations of SU(m + n) to S(U(m) × U(n)) has been studied in [72]. We summarise the result below. Representations of SU(m + n) are labelled by a partition λ of an integer k in m + n − 1 parts (often represented as a Young diagram). One can construct the associated irreducible representation by applying the Schur functor  (2b 1 , b 1 + b 2 , ..., b 1 + b m , b 1 − b m , ..., b 1 − b 2 , 0) .
When n ≥ m + 1: We have added a redundant 0 entry; with it λ has length m + n.
Quartic quantum theory over R, C and H
Quartic quantum theory over C, introduced in [47], is a theory which contains some of the systems classified above. In this theory systems S Quart k,C (k ∈ Z, k ≥ 2) have pure states given by the Grassmann manifold $\mathrm{Gr}(k, \mathbb{C}^{k^2})$ and a probabilistic structure F Quart k,C given by the adjoint representation. For example the state space for the system k = 2 can be generated by taking reference state: applying the SU(4) dynamical group in the adjoint representation: and taking the convex hull of the Gr(2, C 4 ) manifold embedded in V = Herm C 4 (the real linear space of Hermitian matrices on C 4 ). One problematic feature of quartic quantum theory is that it does not have well defined composition [25,47], and as such is just a collection of systems rather than a full theory.
We can similarly introduce two theories (without composition): real quartic quantum theory and quaternionic quartic quantum theory where systems are given by S Quart and Sp(k 2 ) respectively. In both cases the state space for the system associated to k = 2 can be generated by taking the reference state ρ above, acting with the adjoint representation of the dynamical group, and taking the convex hull.

The OPF framework presented in this work is similar to the 'Group theoretical model' of [10]. The novel aspects of this work include Theorem 2 which, building on the framework, establishes a correspondence between probabilistic structures and group representations. We find specific conditions on transitive dynamical structures which make this correspondence one-to-one (namely that the dynamical group and stabiliser subgroup form a Gelfand pair). Moreover Mielnik studies examples with the same pure states as quantum theory, but different dynamical groups. We study (and classify) systems which have different pure states and dynamics.

Classification of all alternatives to the measurement postulates of quantum system
Theorem 2 is a generalisation of the classification theorem of [51], where the dynamical structure is no longer constrained to be that of quantum systems. We also find the necessary and sufficient conditions for which dynamical structures have probabilistic structures which are in one-to-one correspondence with group representations.

Figure 2 caption (partial): 'Rigid' and 'Non rigid' are notions defined in this paper. '2 point hom.' stands for two point homogeneous. For a field F, Gr F is the family of systems with pure states given by the Grassmann manifold Gr(F d , F k ) for all 2 < d < ∞, k < d. PF d is the family of systems with pure states given by projective space over F d for all 1 < d < ∞, hence PF d := Gr(F d , F 1 ). QT F is quantum theory over F whilst Qu F is quartic quantum theory over F. 'EJAs' labels special Euclidean Jordan Algebras (EJA) and 'EJAe' the exceptional EJA. V d is the d-sphere in the standard embedding in R d+1 whilst S d is the family of systems with pure states given by S d (hence embeddings of S d in R k where k is not necessarily equal to d + 1). This map does not capture all the relations, namely there are 'coincidences' like the qubit being both in QT C and V d .

Mapping the space of GPT systems
In the GPT formalism a theory is considered to be a set of systems together with some composition rules. Quantum theory, for example, is the set of systems QT C (one system for each finite dimension d ≥ 2) together with the standard tensor product composition rule and partial trace. We note that QT C alone is not a theory, just a set of systems.
In this work we also consider sets of systems which are not expected to form theories; these are sets of systems which share a common dynamical structure. For example the set of systems with shared dynamical structure (PC 2 , SU(2)) forms a sub-family of systems, and the set of all systems with dynamical structure (PC d , SU(d)) (d > 2) forms a family of systems.
In this work we have introduced new families of systems (see Sections 5 and 6) which generalise previously known systems. In Figure 2 we map out the space of transitive systems with compact pure states including the new families of systems introduced in this work.
The advantages of the methods introduced in this work are two-fold: firstly we can generate examples of non-classical systems and secondly we can systematically classify non-classical systems, thus providing us with a fuller picture of non-classical systems lying beyond quantum theory.

The search for alternative theories and the issue of composition
The tools presented in this work allow us to systematically search for non-classical systems. However it is not certain that these systems compose in a non-trivial manner (existence of entangled states and measurements). For example it is shown in [53] that the only full theory with systems having the same dynamical structure as quantum theory is quantum theory itself. In [48] it is shown (under certain additional assumptions such as local tomography) that the only systems corresponding to d-balls which compose non-trivially are for d = 3. However, removing the requirement of local tomography allows for d-balls to compose non-trivially in at least some cases, including those of real and quaternionic bits [35] (where a category in which arbitrary d-balls combine via a monoidal product is also described, though the acceptability of these composites is less clear). Out of the family of systems classified in Section 6 it is known that one of them (quartic quantum theory) does not compose under the assumption of local tomography and the requirement that the theory remain quartic [25,47]. The question remains open as to whether any of the systems with pure states given by Grassmann manifolds compose non-trivially, with or without the assumption of local tomography.

Conclusion
In this work we have introduced the OPF framework which is used to characterise systems in GPTs. By separating the dynamical and probabilistic components of systems this framework provides new insight into non-classical systems. It allows us to consider families of systems which share a common dynamical structure. We introduce the notion of a rigid dynamical structure and show that for such structures one can classify all probabilistic structures using representations of the dynamical group. A key feature of rigid dynamical structures is that they do not admit continuous deformation of probabilistic structure.
Moreover we introduced multiple new families of non-classical systems, such as the complex Grassmann systems. Many of these families contain known non-classical systems, as well as providing infinitely many examples of non-classical systems which were not known. As well as exploring the space of non-classical systems by finding new examples, we mapped out this space in a more systematic manner by introducing families of systems which share a dynamical structure.
The present work has limited itself to single systems. In general it is not a given that these systems can be made to compose in a non-trivial way (i.e. with entangled states). Future work will involve extending the OPF framework to include composition (along the lines of [52,53]) and determining whether any of the new families of systems presented in this work can compose in a non-trivial manner. Finding such examples would provide new non-classical theories whose informational properties could be characterised and contrasted to quantum theory. Alternatively a proof that none of the Grassmann systems can compose would lend credence to the proposition that there are few fully compositional general probabilistic theories.

S X: System with pure states X
Sym(X): Symmetric group on set X
Diff(X): Group of diffeomorphisms on manifold X
Ω x: Image of Ω map for an element x
Mixed state representation of an ensemble
F X: Set of OPFs (outcome probability functions) for system with pure states X
Extension of an OPF to ensembles/mixed states
Complexification of a real representation W
V R: Restriction to reals of a complex representation V
C(X, F): Continuous functions X → F for a topological space X and a field F
Complex conjugate of a vector v

B Background group theory and group representation theory
We briefly outline a few concepts from group representation theory which will be needed for the proofs. This implies that Hom G (V, V ) = CI V for irreducible V . Schur's Lemma also has important consequences for reducible representations. For instance consider the case where dim(Hom G (V, W )) = 1 where V is irreducible and W is reducible. Then Schur's Lemma entails that W must contain the irreducible representation V exactly once, and that Hom G (V, W ) = CI V .

B.2 Left regular and C(G) representations
Definition 20 (Left regular representation (finite group)). The left regular representation of a finite group G is given by: where C[G] is a complex linear space spanned with orthonormal basis {|g } g∈G .
Here the action of G on C[G] is just permutation of the basis vectors.
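The permutation action can be made concrete with a minimal sketch. It assumes, purely for illustration, that $G$ is the cyclic group $Z_n$ (integers under addition mod $n$) and that the convention is $\rho(g)|h\rangle = |gh\rangle$; the helper names `left_regular_matrix`, `matmul`, `is_homomorphism` and `is_permutation_matrix` are hypothetical and not taken from the paper.

```python
# Sketch: left regular representation of the cyclic group Z_n (illustrative choice).
# Group elements are the integers 0..n-1 with addition mod n.

def left_regular_matrix(g, n):
    """Matrix of rho(g) on C[Z_n], assuming rho(g)|h> = |g + h mod n>."""
    # Entry [row][col] is 1 exactly when the basis vector |col> is sent to |row>.
    return [[1 if row == (g + col) % n else 0 for col in range(n)]
            for row in range(n)]


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def is_homomorphism(n):
    """Check rho(g) rho(h) == rho(g + h) for all g, h in Z_n."""
    return all(
        matmul(left_regular_matrix(g, n), left_regular_matrix(h, n))
        == left_regular_matrix((g + h) % n, n)
        for g in range(n) for h in range(n)
    )


def is_permutation_matrix(m):
    """Every row and every column holds a single 1 and zeros elsewhere."""
    expected = [0] * (len(m) - 1) + [1]
    rows_ok = all(sorted(row) == expected for row in m)
    cols_ok = all(sorted(col) == expected for col in zip(*m))
    return rows_ok and cols_ok
```

Each $\rho(g)$ built this way is a permutation matrix, and the matrices multiply as the group elements do, which is all the definition requires for a finite group.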
Definition 21 (Left regular representation (compact group)). The left regular representation of a compact topological group $G$ is given by:
$$(\rho(g) f)(g') = f(g^{-1} g'),$$
where $f \in C(G, \mathbb{C})$.
$\rho$ is a continuous homomorphism.
In general the restriction of an irreducible representation of $G$ to a subgroup $H$ will give a reducible representation of $H$.

Example 2.
Consider the fundamental representation of SO(3) on R 3 and restrict to the subgroup SO(2) with matrices: Similarly we have that Hom G (W, . By Theorem 7 we obtain:

C Representations over R
A representation $\Gamma$ is a homomorphism $G \to GL(V)$ for some vector space $V$. Typically $V$ is assumed to be a vector space over $\mathbb{C}$, and many tools in representation theory apply to this case. Schur's lemma, for instance, holds for representations over the complex field, but not always for those over $\mathbb{R}$. For a Gelfand pair $(G, H)$, the property that the restriction of an irreducible representation to $H$ contains the trivial representation with multiplicity 0 or 1 holds for representations over $\mathbb{C}$, but not necessarily for those over $\mathbb{R}$.
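In this section we explore some of the subtleties involved in dealing with representations over $\mathbb{R}$ and prove some lemmas which will be needed for Theorem 2. First we present an example to introduce some of the relevant concepts.

Example 3 (Fundamental representation of SO(2)). Consider the representation of SO(2) over $\mathbb{R}^2$:
$$\Gamma(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
This representation is irreducible over $\mathbb{R}^2$. However consider this representation acting on $\mathbb{C}^2$ (obtained from $\mathbb{R}^2$ by allowing complex linear combinations of the basis elements). Then there exist matrices $S$ and $S^{-1}$, for instance
$$S = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}, \qquad S^{-1} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix},$$
such that $\Gamma'(\theta) = S^{-1}\Gamma(\theta)S$, with
$$\Gamma'(\theta) = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}.$$
So the irreducible representation over $\mathbb{R}^2$ is reducible over $\mathbb{C}^2$. Consider once again the irreducible representation $\Gamma$ over $\mathbb{R}$. Then this commutes with all matrices proportional to the identity, but also with matrices proportional to $J$, where $J$ is:
$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
Moreover one can show that only matrices which are linear combinations of $J$ and $I$ commute with the whole group.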
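The claims of this example can be checked numerically with a short sketch. It assumes the standard rotation matrix for $\Gamma(\theta)$, one particular (illustrative) choice of diagonalising matrix $S$ built from the eigenvectors $(1, \mp i)/\sqrt{2}$, and $J$ equal to the rotation by $\pi/2$; these explicit forms and all helper names are assumptions of the sketch rather than the paper's own definitions.

```python
import cmath
import math


def gamma(theta):
    """Fundamental representation of SO(2) on R^2 (standard rotation matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices with real or complex entries."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


R = 1 / math.sqrt(2)
# Assumed change of basis: the columns of S are eigenvectors of every gamma(theta).
S = [[R, R], [-1j * R, 1j * R]]
S_INV = [[R, 1j * R], [R, -1j * R]]
# J generates the rotations: gamma(theta) = cos(theta) I + sin(theta) J.
J = [[0, -1], [1, 0]]


def diagonalised(theta):
    """S^{-1} gamma(theta) S, expected to equal diag(e^{i theta}, e^{-i theta})."""
    return matmul(S_INV, matmul(gamma(theta), S))


def commutes_with_gamma(m, theta):
    """True when m commutes with gamma(theta)."""
    return close(matmul(m, gamma(theta)), matmul(gamma(theta), m))
```

Over $\mathbb{C}$ the two diagonal entries are the two one-dimensional sub-representations, whereas over $\mathbb{R}$ no such change of basis is available; a matrix outside the span of $I$ and $J$, such as $\mathrm{diag}(1, -1)$, fails the commutation check.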

C.1 Definitions
Definition 25 (Real, complex and quaternionic structure). Consider an irreducible representation $(\rho, V)$ with $V$ a complex vector space. Then $V$ has a real structure if there exists an equivariant anti-linear map $j : V \to V$ such that $j^2 = I$; $V$ has a quaternionic structure if there exists an equivariant anti-linear map $j : V \to V$ such that $j^2 = -I$. Otherwise $V$ has a complex structure.
Lemma 9 (Descent map). Given a representation $(\Gamma, V, \mathbb{C})$ of a group $G$ on a complex space $V$ equipped with a real structure $j$, the projection $V \to V^j$ whose kernel is the $j = -1$ eigenspace of $V$ is a descent map. The set $V^j := \{v \in V : j(v) = v\}$ carries a real representation of $G$.
Observe that this set is closed under real linear combinations, and not complex linear combinations. As such it has the structure of a real vector space.
Typically $j$ is the conjugation map: $j(v) = \bar{v}$. Since $j(gv) = \bar{g}\bar{v}$, we see that it is equivariant when the representations $g$ and $\bar{g}$ are isomorphic.
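A small sketch may help to see why the fixed-point set of $j$ is a real but not a complex vector space. It assumes $V = \mathbb{C}^n$ with vectors stored as Python lists of complex numbers and takes $j$ to be entrywise complex conjugation; the function names are illustrative only.

```python
def j(v):
    """Conjugation map on C^n: an anti-linear map with j(j(v)) = v."""
    return [z.conjugate() for z in v]


def scale(lam, v):
    """Scalar multiplication lam * v."""
    return [lam * z for z in v]


def add(u, v):
    """Vector addition u + v."""
    return [a + b for a, b in zip(u, v)]


def is_fixed(v):
    """True when v lies in the fixed-point set {v : j(v) = v}."""
    return j(v) == v
```

For example, with real vectors `u` and `w`, the real combination `add(scale(2.5, u), scale(-1.0, w))` stays fixed under `j`, whereas `scale(1j, u)` is sent to its negative and therefore leaves the fixed-point set.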

Definition 27 (Restriction of scalars). A complex vector space $V$ is isomorphic (as a real vector space) to the real vector space $V_{\mathbb{R}}$.
Consider $V$ a complex vector space and $W$ a real vector space. Then
$$(W_{\mathbb{C}})_{\mathbb{R}} \cong W \oplus W, \qquad (V_{\mathbb{R}})_{\mathbb{C}} \cong V \oplus \bar{V},$$
where $\bar{V}$ is the complex conjugate vector space of $V$. $\bar{V}$ has the same elements and addition rule as $V$, but scalar multiplication is given by $\lambda \cdot v = \bar{\lambda} v$.
Theorem 8. [73, Theorem 3.37, p.41] For an irreducible representation $(\rho, W, \mathbb{R})$ the action of $G$ extends naturally to $W_{\mathbb{C}}$. It has one of the following decompositions:
1. $W_{\mathbb{C}} \cong V$ for a complex irreducible representation $V$ if and only if $V$ has a real structure.
2. $W_{\mathbb{C}} \cong V \oplus \bar{V}$ for a complex irreducible representation $V$ if and only if $V$ has a complex structure.
3. $W_{\mathbb{C}} \cong V \oplus V$ for a complex irreducible representation $V$ if and only if $V$ has a quaternionic structure.
Every real irreducible representation $(\rho, W, \mathbb{R})$ can be obtained from irreducible complex representations of one of the above three forms, by taking the descent map $(W_{\mathbb{C}})^j$.

C.2 Lemmas
In the following section G is assumed finite or compact. 1. An irreducible representation Proof.
C for the map $j(v') = \bar{v}'$, we have that under the descent map the subspace $V'^H$ is mapped to the (real) one-dimensional subspace $V^H$ of $V$, spanned by $v^H \otimes_{\mathbb{R}} \gamma$ with $\gamma \in \mathbb{R}$. Any vector $v'$ such that $j(v') = v'$ which is not in $V'^H$ will be mapped to a vector in $V$ which is not $H$-invariant (since $j$ is equivariant). As such the subspace of all $H$-invariant vectors in $V$ is just $V^H$ and is one-dimensional. The case where $V'^H$ is of dimension 0 is immediate.
2. By Lemma 10, if $\Gamma$ is irreducible over $\mathbb{R}$ but reducible over $\mathbb{C}$ then it has a complex structure and is of the form $\rho \oplus \rho^*$ where $\rho$ and $\rho^*$ are irreducible. Since $(G, H)$ is a Gelfand pair, either both $\rho$ and $\rho^*$ contain a one-dimensional $H$-invariant complex subspace or neither does. Let us consider the case where they both do.
We use a matrix representation where the entries of $\rho^*$ are the complex conjugates of those of $\rho$. Let $v_i$ be a basis for $V$, then form a basis for $V \oplus \bar{V}$. By also considering the elements: we obtain a real basis for $V \oplus \bar{V}$, i.e. any $w \in V \oplus \bar{V}$ is a real linear combination of the above vectors.
be an $H$-invariant vector, then so is: All $H$-invariant vectors are of the form: since the $H$-invariant subspaces in $V$ and $\bar{V}$ are one-dimensional.
Now we consider the change of basis given by matrices $S$ and $S^{-1}$: and apply $S\Gamma S^{-1}$ to obtain a representation which has real-valued entries. The action of $S$ on the basis is: Considering the real basis elements we also obtain: Under the real structure $j(v) = \bar{v}$ the basis vectors $w^+_i$ and $i w^-_i$ are $+1$ eigenvectors of $j$, and form a basis for $V^j \cong \mathbb{R}^n$. The other basis vectors are $-1$ eigenvectors and form a basis for $i\mathbb{R}$. Since $\Gamma = \bar{\Gamma}$ we have that $j(v) = v \implies j(gv) = gv$ and so the subspace $V^j$ is closed under $G$.

The image of the $H$-invariant vectors under $S$ is:
These are invariant under $j$ for $\beta = \alpha$. Any such $j$-invariant vector is a real linear combination of $w^1_H = w_H(\alpha = 1, \beta = 1)$ and $w^2_H = w_H(\alpha = i, \beta = i)$. As such the $H$-invariant subspace in $V^j$ is two-dimensional with basis vectors written out explicitly as:

Proof. In the case where $\Gamma$ is complex irreducible this is immediate from Schur's lemma. In the case where $\Gamma$ is complex reducible, its complexification is reducible: By Schur's lemma all matrices which commute with the whole group are of the form: Using the transformation $S$ of the previous proof one obtains:

Corollary 6.
For $(\Gamma, W)$ an irreducible representation over $\mathbb{R}^n$, the following holds:
1. If $W_{\mathbb{C}}$ is irreducible then $\mathrm{Hom}_G(\mathbb{R}^n, \mathbb{R}^n) = \mathbb{R}$. The only equivariant homomorphisms are scalar multiples of the identity.
2. If $W_{\mathbb{C}}$ is reducible into irreducible representations with complex structure then $\mathrm{Hom}_G(\mathbb{R}^n, \mathbb{R}^n) \cong \mathbb{R}^2$.
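The second case can be illustrated with a minimal numerical sketch. It assumes the example $G = Z_3$ acting on $\mathbb{R}^2$ by rotations through multiples of $2\pi/3$, a real irreducible representation whose complexification splits into two inequivalent conjugate characters. Averaging $R_g M R_g^T$ over the group projects an arbitrary matrix $M$ onto the space of equivariant maps; the group, the averaging construction and the function names are choices made for this sketch and do not come from the paper.

```python
import math


def rotation(theta):
    """Rotation of R^2 by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 real matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def commutant_projection(m, n=3):
    """Average R_g m R_g^T over Z_n; the result commutes with every R_g."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n):
        r = rotation(2 * math.pi * k / n)
        term = matmul(r, matmul(m, transpose(r)))
        total = [[total[i][j] + term[i][j] / n for j in range(2)]
                 for i in range(2)]
    return total


def commutant_coefficients(m):
    """Coefficients (a, b) of the projection written as a*I + b*J, J = [[0, -1], [1, 0]]."""
    p = commutant_projection(m)
    return p[0][0], p[1][0]
```

Every projected matrix has the form $aI + bJ$ with two free real parameters, which is the two-dimensional space of equivariant maps asserted in the corollary for this example.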
where $V_j$ are the irreducible representations of real type, and $U_j$ are irreducible representations of complex type. There are no degeneracies.
Hence every irreducible representation $W_i$ is sent to $(W_i)_{\mathbb{C}}$ in $\mathbb{C}[G/H]$. By Theorem 8 these are of the form $V$ for $V$ irreducible of real type, $V \oplus \bar{V}$ for $V$ of complex type and $V \oplus V$ for $V$ of quaternionic type, where we know by Lemma 10 that this latter case does not occur.
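By assumption $\Omega_x \mapsto \Omega_{gx}$ is a continuous group action when $G$ is a continuous group. Moreover it extends to $\mathrm{conv}(\Omega_X)$ as follows:
$$\Gamma_g \Big( \sum_i p_i \Omega_{x_i} \Big) = \sum_i p_i \Omega_{g x_i}.$$
This uniquely extends to $\mathrm{span}(\Omega_X) \cong \mathbb{R}[\mathcal{F}_X]^*$ and hence $\Gamma : G \to GL(\mathbb{R}[\mathcal{F}_X]^*)$ is a group representation.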

D.2 Proof of Lemma 1
Proof. Let $(\Omega, \Gamma, \Lambda)$ be an embedding of the system. The set $\Omega_X$ can be generated by applying $\Gamma_g$ to a reference vector $\Omega_{x_0}$ for all $g \in G$. Since $G$ is compact the matrices $\Gamma_g$ can be made orthogonal. Hence every $\Omega_x = \Gamma_g \Omega_{x_0}$ for some $g \in G$ is such that $\Omega_x^T \Omega_x = \Omega_{x_0}^T \Omega_{x_0}$.
All $\Omega_x$ are such that $u \cdot \Omega_x = 1$ with $u$ the unit effect, therefore $\Omega_X$ lies in the affine span of $\Omega_X$, which is a $d - 1$ dimensional hyperplane. This is the hyperplane composed of all $v \in V$ such that $u \cdot v = 1$. Every $\Omega_x$ can be written as:
$$\Omega_x = \begin{pmatrix} 1 \\ \tilde{\Omega}_x \end{pmatrix}$$
in a basis where the first element corresponds to the subspace spanned by the unit effect. Observe that $\Gamma_G$ acts trivially on the subspace spanned by the unit effect, and hence decomposes as $\Gamma = 1 \oplus \tilde{\Gamma}$ where $1$ is the trivial representation. Observe that the maximally mixed state $\omega = \int_{g \in G} \Gamma_g \Omega_x \, dg$ for an arbitrary $\Omega_x$ is invariant under $G$. Hence it lies fully in the subspace spanned by the unit effect. Therefore $\omega$ can be written as:
$$\omega = \begin{pmatrix} 1 \\ \tilde{\omega} \end{pmatrix}$$
where $\tilde{\omega}$ is the zero vector. Restricting our attention to the affine span of $\Omega$, i.e. ignoring the unit effect subspace, we observe that the matrices $\tilde{\Gamma}$ are orthogonal, hence $\tilde{\Gamma}_g \tilde{\Omega}_{x_0}$ lies on a $d - 1$ sphere centred on $\tilde{\omega}$.
Since Ω X lies on a hypersphere, this implies that no point in Ω X lies inside conv(Ω X ), proving the lemma.
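The geometric content of this argument can be illustrated with a small sketch. It assumes, for illustration only, that $G = SO(2)$ acts on $\mathbb{R}^3$ as $1 \oplus$ rotation, with the first coordinate playing the role of the unit-effect direction, and it replaces the Haar integral by a uniform finite sum over sampled angles; the reference vector and all function names are hypothetical.

```python
import math


def gamma(theta):
    """Orthogonal representation 1 (+) rotation on R^3; coordinate 0 is the unit-effect direction."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def orbit(reference, samples=360):
    """Pure states Gamma_g applied to the reference vector, for uniformly sampled group elements."""
    return [apply(gamma(2 * math.pi * k / samples), reference)
            for k in range(samples)]


def mean(points):
    """Uniform average of the sampled orbit, standing in for the Haar integral."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]


def distance(p, q):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


reference_state = [1.0, 0.6, 0.8]  # hypothetical Omega_{x_0}, normalised so that u . Omega = 1
pure_states = orbit(reference_state)
maximally_mixed = mean(pure_states)
radii = [distance(p, maximally_mixed) for p in pure_states]
```

In this sketch the averaged state has support only on the unit-effect coordinate and every orbit point sits at the same distance from it, so no orbit point can be written as a convex mixture of the others.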

D.3 Proof of Lemma 2
Proof. In the representation in terms of fiducial OPFs the vectors in $\mathrm{conv}(\Omega_X)$ are bounded. Moreover the set is closed, since any limit of physically realisable extremal preparations is indistinguishable from an extremal physical preparation. Hence $\mathrm{conv}(\Omega_X)$ is a closed and bounded subset of $\mathbb{R}[\mathcal{F}_X]^*$ (isomorphic to $\mathbb{R}^n$ by the assumption of "Possibility of state estimation using a finite outcome set"), and by the Heine-Borel theorem it is compact in the induced (subspace) topology.
Since $\mathrm{conv}(\Omega_X)$ is bounded, the absolute values of the matrix entries of $\mathrm{conv}(\Gamma_G)$ are also bounded. Moreover we assume that from a physical point of view the space of transformations $\mathrm{conv}(\Gamma_G) \subset \mathbb{R}^{n^2}$ must be topologically closed (in the vector space topology of the space $\mathbb{R}^{n^2}$ of linear transformations $\mathbb{R}^n \to \mathbb{R}^n$), since any mathematical transformation which can be arbitrarily well approximated by physical transformations is indistinguishable from a physical transformation. This implies that the set $\mathrm{conv}(\Gamma_G) \subset \mathbb{R}^{n^2}$ is bounded and closed, and hence compact by the Heine-Borel theorem.
i. In general $\Gamma$ may be reducible:
$$\Gamma = \bigoplus_i \Gamma_i.$$
We can also decompose states:
$$\Omega_x = \bigoplus_i \Omega^i_x.$$
We have the following equality:
$$\Gamma_g \Omega_x = \Omega_{gx}.$$
Since $x$ is stabilised by $H_x$ (where $H_x$ is the stabiliser of $x$: $hx = x$ for all $h \in H_x$) we have
$$\Gamma_h \Omega_x = \Omega_x \quad \text{for all } h \in H_x.$$
This implies
$$\Gamma_i(h)\, \Omega^i_x = \Omega^i_x \quad \text{for all } i.$$
This implies that each $\Gamma_i|_H$ has at least one trivial sub-representation.
ii. Let us consider a representation of $G$ acting on a real vector space $V$:
$$\Gamma = \bigoplus_i \Gamma_i, \qquad V = \bigoplus_i V_i,$$
such that each $\Gamma_i$ has at least one $H$-invariant subspace. Take a reference vector $v \in V$ which has support only in the $H$-invariant subspaces, and has support in each subspace $V_i$. By applying $\Gamma_G$ to $v$ we obtain $\Omega_{G/H} \subset V$, where we observe that $\mathrm{conv}(\Omega_{G/H})$ has $\Omega_{G/H}$ as extremal points, since $\Gamma_G$ can be expressed in orthogonal matrices and hence $\Omega_{G/H} \subset S^n$ (a hyper-sphere in the affine span of the normalised states, centred on the maximally mixed state) as proven in Lemma 1. By taking the convex set of all linear functionals which give values in $[0, 1]$ we obtain $\Lambda_{\mathcal{F}}$, a probabilistic structure.
iii. Since $(G, H)$ is a Gelfand pair, all real irreducible representations $V_i$ are such that any pair of $H$-invariant vectors are related by an invertible linear transformation $L_i$. For a representation $\Gamma = \bigoplus_i \Gamma_i$, take two $H$-invariant vectors $v$ and $v'$ which have support in every irreducible subspace. These are related by a transformation $L$ with $Lv = v'$ which commutes with the group action. We call their orbits under $\Gamma_G$ $\Omega_X$ and $\Omega'_X$. Since $L$ commutes with the group action, $L\Omega_X = \Omega'_X$. Let us consider the unrestricted effect spaces for both: $\Lambda_{\mathcal{F}}$ and $\Lambda'_{\mathcal{F}}$. We have $\Lambda'_{\mathcal{F}}(\Omega'_X) = \Lambda'_{\mathcal{F}}(L\Omega_X)$. The set of all effects on $L\Omega_X$ is just $\Lambda_{\mathcal{F}} L^{-1}$, hence $\Lambda'_{\mathcal{F}} = \Lambda_{\mathcal{F}} L^{-1}$.
iv. Let us take the case where $(G, H)$ is not a Gelfand pair. There exist (complex) irreducible representations $W$ such that $W|_H$ contains more than one trivial sub-representation.
One can obtain a real irreducible representation from $W$ by one of the three following methods:
(a) Restriction to $\mathbb{R}$ of $W$ if $W$ is of real type.
(b) Restriction to $\mathbb{R}$ of $W \oplus \bar{W}$ if $W$ is of complex type.
(c) Restriction to $\mathbb{R}$ of $W \oplus W$ if $W$ is of quaternionic type.
In each case the real irreducible representation $V$ obtained is such that it has $H$-invariant vectors which are not related by a transformation which commutes with the group representation. Let us fix a basis and consider the matrices $\Gamma_g$ acting on $V$. Let us pick two $H$-invariant vectors $v$ and $v'$ such that there is no transformation $L$ which commutes with the group action such that $Lv = v'$.
We now show that $v$ and $v'$ can be used (along with $\Gamma_G$) to define two tomographically inequivalent probabilistic structures, associated with the same carrier space $V$.
Let us call $x_0$ an element in $X$ stabilised by $H$ and define $\Omega_{x_0} = v$ and $\Omega'_{x_0} = v'$. Furthermore let us define $\Omega_{gx_0} = \Gamma_g \Omega_{x_0}$ for every $g \in G$ (and similarly for $\Omega'_{gx_0}$). Observe that for any $g_1, g_2$ such that $g_1 x_0 = g_2 x_0$ we have $\Omega_{g_1 x_0} = \Omega_{g_2 x_0}$ (and similarly $\Omega'_{g_1 x_0} = \Omega'_{g_2 x_0}$). This defines the maps $\Omega : X \to V$ and $\Omega' : X \to V$. We call $\Omega_X$ and $\Omega'_X$ the orbits of $\Omega_{x_0}$ and $\Omega'_{x_0}$ under $\Gamma_G$.
We call $\Lambda_{\mathcal{F}} \subset V^*$ the set of all linear functionals giving values in $[0, 1]$ on $\mathrm{conv}(\Omega_X)$ and $\Lambda'_{\mathcal{F}} \subset V^*$ the set of all linear functionals giving values in $[0, 1]$ on $\mathrm{conv}(\Omega'_X)$. Necessarily $\mathrm{span}(\Lambda_{\mathcal{F}}) = \mathrm{span}(\Lambda'_{\mathcal{F}}) = V^*$. $\Lambda_{\mathcal{F}}$ and $\Lambda'_{\mathcal{F}}$ define two OPF sets $\mathcal{F}_X$ and $\mathcal{F}'_X$ respectively.
We have constructed two systems $(X, G, \mathcal{F}_X)$ and $(X, G, \mathcal{F}'_X)$ with associated maps $(\Omega, \Gamma, \Lambda)$ and $(\Omega', \Gamma, \Lambda')$ respectively, associated to the same representation $\Gamma$. We now show that these two systems are tomographically inequivalent, i.e. the mixed states $\mathrm{conv}(\Omega_X)$ and $\mathrm{conv}(\Omega'_X)$ are not isomorphic. In other words there is no linear invertible transformation mapping $\mathrm{conv}(\Omega_X)$ to $\mathrm{conv}(\Omega'_X)$.
Consider an invertible linear transformation $L$ such that $L\Omega_X = \Omega'_X$ (if such a transformation exists, then it is well defined on the convex hulls and maps $\mathrm{conv}(\Omega_X)$ to $\mathrm{conv}(\Omega'_X)$). This implies that $L\Omega_x = \Omega'_{gx}$ for some fixed $g \in G$ and for all pure states $x \in X$. Let us redefine $L \to \Gamma_{g^{-1}} L$ such that the equality $L\Omega_x = \Omega'_x$ holds for all $x \in X$.
Then $L\Gamma_g \Omega_x = L\Omega_{gx} = \Omega'_{gx} = \Gamma_g \Omega'_x = \Gamma_g L \Omega_x$ for all $g \in G$ and $x \in X$, and since $\Omega_X$ spans $V$ this shows that $L$ commutes with the group action. Hence there is a transformation $L$ which commutes with the group action such that $L\Omega_{x_0} = \Omega'_{x_0}$. This is in contradiction with the assumption that the two reference states were not related by such a transformation.
E Deformation of probabilistic structure

E.1 Proof of Theorem 3
Proof. Let $\Gamma^i : G \to GL(V^i)$ be the representation of $G$ associated to $\mathcal{F}^i$, let $\Omega^i : X \to V^i$ be the representation of pure states, and let $\Lambda^i : \mathcal{F}^i \to (V^i)^*$ be the representation of OPFs, for $i = 0, 1$. We can decompose the group action into (real) irreducible representations as $\Gamma^i = \bigoplus_j \Gamma^i_j$, where $j = 0, 1, \ldots$ Recall that there must be one (and only one) trivial irrep, which we label by $j = 0$. Also, we can decompose the representation of pure states $\Omega^i = \bigoplus_j \Omega^i_j$ with $\Omega^i_j : X \to V^i_j$, and the representation of OPFs $\Lambda^i = \bigoplus_j \Lambda^i_j$ with $\Lambda^i_j : \mathcal{F}^i \to (V^i_j)^*$. Within this proof we follow the convention that for all $i, j, x$, where $\langle \cdot, \cdot \rangle_j$ is a group-invariant scalar product. This scalar product provides an isomorphism $(V^i_j)^* \cong V^i_j$. Using this decomposition we can write any $f^i \in \mathcal{F}^i$ as where we define where $d^i_j = \dim V^i_j$. Now, suppose that $\mathcal{F}^0$ and $\mathcal{F}^1$ are unrestricted and tomographically inequivalent probabilistic structures of $(G, X)$. Then, without loss of generality, we assume that irrep $\Gamma^0_1$ is inequivalent to irrep $\Gamma^1_j$ for all $j$. Also, let us choose $f^0 \in \mathcal{F}^0$ so that its corresponding vector $\Lambda^0(f^0)$ only has support on the subspaces $j = 0, 1$, that is