Contextuality in composite systems: the role of entanglement in the Kochen-Specker theorem

The Kochen-Specker (KS) theorem reveals the nonclassicality of single quantum systems. In contrast, Bell's theorem and entanglement concern the nonclassicality of composite quantum systems. Accordingly, unlike incompatibility, entanglement and Bell non-locality are not necessary to demonstrate KS-contextuality. However, here we find that for multiqubit systems, entanglement and non-locality are both essential to proofs of the Kochen-Specker theorem. Firstly, we show that unentangled measurements (a strict superset of local measurements) can never yield a logical (state-independent) proof of the KS theorem for multiqubit systems. In particular, unentangled but nonlocal measurements, whose eigenstates exhibit "nonlocality without entanglement", are insufficient for such proofs. This also implies that proving Gleason's theorem on a multiqubit system necessarily requires entangled projections, as shown by Wallach [Contemp Math, 305: 291-298 (2002)]. Secondly, we show that a multiqubit state admits a statistical (state-dependent) proof of the KS theorem if and only if it can violate a Bell inequality with projective measurements. We also establish the relationship between entanglement and the theorems of Kochen-Specker and Gleason more generally in multiqudit systems by constructing new examples of KS sets. Finally, we discuss how our results shed new light on the role of multiqubit contextuality as a resource within the paradigm of quantum computation with state injection.


Introduction
Quantum theory's 'departure from classical lines of thought' [1] is today a driving force behind the promise of quantum technologies. Quantum theory abandons assumptions implicit in classical physical theories, such as the assumptions that physics must be fundamentally deterministic or that all observables are jointly measurable. This allows for an array of typically quantum phenomena such as entanglement, uncertainty relations, Bell nonlocality, contextuality, etc. We generically refer to the possibility of these previously forbidden properties as the nonclassicality of quantum theory. Such nonclassical properties of quantum theory are key to the many advantages quantum information processing and quantum computation hold over their classical counterparts. However, the exact relationship between different notions of nonclassicality and such advantages is often unclear and remains an active area of research [2,3,4,5,6,7,8,9].
The case of multiqubit systems is of particular importance given their ubiquity throughout quantum technologies, particularly noisy intermediate-scale quantum (NISQ) technologies [10,11]. However, the nonclassicality of qubit systems remains an anomalous case. Individually, for example, a qubit cannot display Kochen-Specker (KS) contextuality [12] while, collectively, multiqubit systems (which exhibit KS-contextuality) derail the neat narrative of such contextuality powering quantum computational advantage [5,13]. Furthermore, entanglement is often considered a key indicator of nonclassicality in these systems but the sense in which it relates to fundamental notions of nonclassicality witnessed by the theorems of Bell [14,15], Kochen-Specker [12], and Gleason [16] needs clarification. Since the question of nonclassicality is essentially a foundational one [17], we approach the study of multiqubit systems through this lens.
Entanglement is an intrinsically compositional property and is, therefore, only relevant to the study of nonclassicality in composite (i.e. multipartite) systems. Schrödinger claimed the entanglement of quantum states to be 'the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought' [1]. Bell's theorem [14,15] exemplifies this point by revealing the nonclassicality of correlations arising from local measurements on composite quantum systems in entangled states, the simplest case being a two-qubit system. The Kochen-Specker theorem [12], on the other hand, can reveal the nonclassicality of correlations between measurements implemented on an indivisible quantum system, the simplest case being a qutrit. For quantum systems of dimension at least three, the theorem states that there cannot exist an underlying ontological model, known as a KS-noncontextual ontological model, that reproduces the predictions of quantum theory. In such a model the outcomes of projective measurements are fully determined by the ontic state of the system. Furthermore, the outcomes are independent of context (footnote 1) and respect the functional relationships between commuting measurements (footnote 2). The key insight of Kochen and Specker was that one can witness the impossibility of such models with a finite set of rank-one projections (a KS set) on a three-dimensional Hilbert space. A KS set thus constitutes a logical proof of the KS theorem [18], exemplified by the original result of Kochen and Specker [12]. It is also possible to demonstrate the inadequacy (rather than impossibility) of KS-noncontextual ontological models in a (weaker) statistical sense. Such statistical proofs of the KS theorem [19] are exemplified by the proof due to Klyachko et al. [20].
Footnote 1: The context of a projective measurement here refers to other projective measurements with which it is jointly performed.
Prior to the Bell and Kochen-Specker theorems, Gleason's theorem [16] demonstrated that, for any quantum system of dimension at least three, the unique way to assign probabilities to the outcomes of projective measurements is via the Born rule. In particular, Gleason's theorem excludes any deterministic probability rule given by a {0, 1}-valued assignment of probabilities to all the self-adjoint projections on the system's Hilbert space. This exclusion thus implies the KS theorem, but, unlike the proof of Kochen and Specker [12], it requires an uncountably infinite KS set.
A single qubit does not support any of the three theorems (Bell, KS, or Gleason) and is, by that token, rather "classical" (footnote 3). A single qutrit can support the Gleason and KS theorems but not Bell's theorem. Hence, the smallest quantum system on which one can meaningfully study the interplay of Gleason, Bell, and KS theorems is a two-qubit system. The nonclassicality witnessed by Bell's theorem in qubit pairs (and more generally) clearly depends on entanglement, since Bell inequality violations can only be observed when the quantum systems used are described by an entangled state. Is entanglement, however, also necessary for the theorems of KS and Gleason in a two-qubit system, and more generally, in multiqubit systems? Since both theorems appeal to the structure of quantum measurements, we need to go beyond states and consider the role of entanglement in measurements as well.
The question of entanglement and Gleason's theorem is already resolved. A result by Wallach [27] showed that the set of unentangled multiqudit projections yields a proof of Gleason's theorem if and only if each qudit has a Hilbert space dimension three or more. In particular, the set of unentangled multiqubit projections cannot yield Gleason's theorem.
This work will largely consider rank-one projective measurements that are unentangled, i.e. measurements in bases comprising only product vectors. Unlike quantum states, measurements can be "nonlocal" without being entangled, that is, there exist measurements which cannot be implemented via local operations and classical communication (LOCC) but nevertheless only involve projections onto unentangled subspaces (footnote 4). For example, one cannot perform a measurement in the unentangled three-qubit basis
$\{ |000\rangle, |{+}10\rangle, |0{+}1\rangle, |10{+}\rangle, \ldots \}$ (1)
via LOCC. It could in principle be that "nonlocality" of unentangled measurements is sufficient to recover a logical proof of the KS theorem [12,18]. However, in the first main result of this work, Theorem 3, we show that this is not the case: the unentangled rays of a multiqubit system are KS-colourable and, thus, no unentangled form of nonclassicality suffices for a logical proof of KS-contextuality in this setting.
Footnote 4: The eigenstates for such a measurement exhibit a phenomenon called 'nonlocality without entanglement', i.e. they form a set of mutually orthogonal states that cannot be distinguished perfectly via LOCC [28].
Figure 1: The Peres-Mermin square [29] consisting of two-qubit Pauli matrices together with the contextuality scenario defined by the orthogonality relations between Peres's 24 rays [30]. Each row or column of the Peres-Mermin square is associated to a two-qubit orthonormal basis in which all the operators in that row or column are diagonal. In the contextuality scenario, the six dashed hyperedges denote the six orthonormal bases corresponding to the rows and columns of the Peres-Mermin square. In particular, the basis that diagonalises the third column, {XX, YY, ZZ}, is the Bell basis.
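As a concrete check of the three-qubit example of Eq. (1), the following minimal sketch builds the four product vectors listed there and confirms that they are mutually orthogonal. The four additional vectors used to complete the set to a full basis are our own assumption (one natural completion), not something taken from the text; `product_ket`, `gram`, and the label strings are illustrative names.

```python
import numpy as np
from functools import reduce

s = 1 / np.sqrt(2)
# Single-qubit kets labelled by the characters used in Eq. (1).
KET = {
    "0": np.array([1.0, 0.0]),
    "1": np.array([0.0, 1.0]),
    "+": np.array([s, s]),
    "-": np.array([s, -s]),
}

def product_ket(label):
    """Tensor product of single-qubit kets, e.g. '+10' -> |+>|1>|0>."""
    return reduce(np.kron, [KET[ch] for ch in label])

def gram(labels):
    """Gram matrix of the (real) product vectors named by `labels`."""
    vecs = np.array([product_ket(label) for label in labels])
    return vecs @ vecs.T

# The four vectors that appear explicitly in Eq. (1).
listed = ["000", "+10", "0+1", "10+"]
# ASSUMPTION: a completion to eight vectors chosen by us for illustration.
assumed_completion = ["111", "-10", "0-1", "10-"]

listed_ok = np.allclose(gram(listed), np.eye(4))
full_ok = np.allclose(gram(listed + assumed_completion), np.eye(8))
print(listed_ok, full_ok)
```

Every vector here is a product vector by construction, so orthonormality of the eight-element set is all that is needed for it to define an unentangled rank-one measurement; the sketch says nothing about LOCC implementability, which is the content of Ref. [28].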
The well-known proof of the KS theorem via the Peres-Mermin square [31,30,29] at first appears to not involve entanglement (see Fig. 1). However, the joint measurement of the Pauli observables {XX, YY, ZZ}, for example, necessitates a measurement in the Bell basis. In this sense, our result shows that this entanglement is not accidental but, rather, unavoidable; any logical (hence, state-independent) [18] proof of the KS theorem for multiqubit systems necessarily requires entangled measurements.
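The claim about the column {XX, YY, ZZ} can be checked directly. The sketch below (variable names are ours) verifies that the three observables commute pairwise, that each Bell state is a joint eigenvector, that every Bell state has Schmidt rank two, and that the product of the three observables is minus the identity.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

XX, YY, ZZ = np.kron(X, X), np.kron(Y, Y), np.kron(Z, Z)

def commute(a, b):
    return np.allclose(a @ b, b @ a)

s = 1 / np.sqrt(2)
bell_basis = np.array([
    [s, 0, 0, s],    # |Phi+>
    [s, 0, 0, -s],   # |Phi->
    [0, s, s, 0],    # |Psi+>
    [0, s, -s, 0],   # |Psi->
], dtype=complex)

def is_eigenvector(op, v):
    """True if op @ v is proportional to the unit vector v."""
    w = op @ v
    return np.allclose(w, np.vdot(v, w) * v)

def schmidt_rank(v):
    """Rank of the 2x2 coefficient matrix of a two-qubit vector."""
    return int(np.linalg.matrix_rank(v.reshape(2, 2)))

all_commute = commute(XX, YY) and commute(XX, ZZ) and commute(YY, ZZ)
bell_diagonalises = all(
    is_eigenvector(op, v) for op in (XX, YY, ZZ) for v in bell_basis
)
bell_ranks = [schmidt_rank(v) for v in bell_basis]
column_product = XX @ YY @ ZZ
print(all_commute, bell_diagonalises, bell_ranks)
```

Because the three observables have non-degenerate joint eigenspaces, the Bell basis is (up to phases) the only basis in which they are simultaneously diagonal, which is why the joint measurement cannot avoid entangled rays.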
We also construct a KS-noncontextual ontological model for the fragment of multiqubit quantum theory containing unentangled measurements and product states. The model can be viewed as a generalization of the single-qubit model of Kochen and Specker [12], but the proof of its validity relies on our result, Lemma 2, to show that the ontic states are indeed valid KS-noncontextual ontic states. The model admits a simple extension to the case of separable states which renders it preparation contextual [21], but still KS-noncontextual.
The existence of this model implies that for a proof of the KS theorem on a multiqubit system, one requires either (i) entanglement in the measurements, in which case one can provide a logical (and state-independent) proof, or (ii) entanglement in the state, in which case the violation of Bell inequalities, for example, provides a state-dependent proof without any entangled measurements [32]. In the second main result of this paper, we demonstrate that such Bell inequality violations are the only way to prove the KS theorem in this setting. We show that an entangled state enables a (finite) statistical proof of the KS theorem with unentangled measurements if and only if it also violates a Bell inequality with local projective measurements.
Thus, we must conclude that, just as in the case of Bell's and Gleason's theorems, the nonclassicality of multiqubit systems is underpinned by entanglement in the case of the Kochen-Specker theorem. This discovery is in surprising contrast to the usual intuition that takes the KS theorem as witnessing nonclassicality that is independent of entanglement because it applies, for example, to a single qutrit [12]. Hence, in the simplest case where all three theorems apply, i.e. a two-qubit system, entanglement is necessary for all of them.
Exploring the relationship between the KS theorem and entanglement in multiqudit systems further, we provide two new constructions of KS sets on multiqudit systems that allow us to obtain an overall picture of this relationship (Fig. 2). As displayed in Fig. 2, our results recover the result of Wallach [27] in the multiqubit case as a corollary, namely, that entangled projections are necessary to obtain Gleason's theorem.
Finally, we discuss implications of our results for the role of contextuality in multiqubit quantum computation with state injection. In particular, we highlight how the choice of measurements in the schemes of Ref. [13] appears natural in view of our results.
Figure 2: The existence of logical proofs of the KS theorem and proofs of Gleason's theorem without entanglement on a Hilbert space $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$, where 1 ≤ j ≤ n. The implication arrows show which results follow from each other. The addition of +direct signifies the result also holds for the subset of unentangled measurements given by direct product bases, see Sec. 7. The results in bold and purple are introduced in the present work.
The structure of the paper is as follows. In Sec. 2, we provide preliminary notions that we will need in the rest of the paper. In Sec. 3, we prove our first main result, i.e. the necessity of entanglement for multiqubit KS sets. Sec. 4 presents a KS-noncontextual ontological model for product states and unentangled measurements of multiqubit systems. Sec. 5 establishes our second main result: an entangled state yields a statistical proof of the KS theorem if and only if it yields a proof of Bell's theorem. In Sec. 6, we construct a two-qubit KS set without any fully entangled bases. Sec. 7 proves the existence of unentangled KS sets on any multiqudit system that contains at least one qudit of Hilbert space dimension three or more. Sec. 8 discusses implications of our results for the role of contextuality in multiqubit schemes for quantum computation with state injection (QCSI). Sec. 9 concludes with a discussion of our results.

Preliminaries
Given a separable Hilbert space, $\mathcal{H}$, a self-adjoint projection on $\mathcal{H}$ is a linear operator $\Pi$ satisfying $\Pi^2 = \Pi^\dagger = \Pi$, where $\Pi^\dagger$ denotes the adjoint of $\Pi$. We will denote the set of self-adjoint projections on $\mathcal{H}$ by $P(\mathcal{H})$. In this work, we will consider projective measurements on quantum systems, which we describe by sets $\{\Pi_1, \Pi_2, \ldots\}$ of self-adjoint, mutually orthogonal projections $\Pi_j \in P(\mathcal{H})$ on a separable Hilbert space, $\mathcal{H}$, that sum to the identity operator.
Rank-one projections are generally sufficient for our purposes so we equivalently consider rays in the projective Hilbert space $R(\mathcal{H})$. The projective Hilbert space $R(\mathcal{H})$ is given by the set of equivalence classes, or rays, of non-zero vectors in $\mathcal{H}$, under the equivalence relation $\psi \sim \chi$ if and only if $\psi = \alpha\chi$ for some non-zero $\alpha \in \mathbb{C}$. We represent each ray with one of its unit vectors, which we denote by a "ket", such as $|\psi\rangle$. For brevity, we will often refer to this vector as the ray it represents.
There is a bijection between the sets of rays and rank-one projections on a Hilbert space. A rank-one projective measurement is therefore uniquely defined by a complete set of rays, i.e. a set of d mutually orthogonal rays, where d is the (possibly infinite) dimension of the Hilbert space. A complete set is represented by an orthonormal basis $\{|\psi_1\rangle, |\psi_2\rangle, \ldots\}$ of $\mathcal{H}$, and thus we say we are performing a measurement in a basis and often refer to a complete set of rays as simply a basis.
The KS theorem states the impossibility of an outcome-deterministic and measurement noncontextual ontological model [21] (briefly, a KS-noncontextual ontological model) for quantum theory. We will refer to this fact as the KS-contextuality of quantum theory. The original result due to Kochen and Specker [12] provides a logical proof of the KS theorem [18] and we state it below in a formulation most relevant to the present work:
Theorem 1 (Kochen-Specker [12]). Let $\mathcal{H}$ be a separable Hilbert space of dimension at least three. There does not exist a map $c : R(\mathcal{H}) \to \{0, 1\}$ such that
$\sum_j c(|\psi_j\rangle) = 1$ (2)
for every complete set of rays $\{|\psi_1\rangle, |\psi_2\rangle, \ldots\}$ in $R(\mathcal{H})$.
This formulation of the result implies the traditional statement of the Kochen-Specker theorem in terms of valuation functions on self-adjoint operators (see Appendix A). The KS theorem also admits statistical proofs, whereby it is shown that the statistics produced by a collection of measurements performed on a fixed quantum state cannot be derived from a KS-noncontextual ontological model [20,19] (footnote 5). We will discuss these proofs of the KS theorem, and how they differ from logical proofs, more extensively in Sec. 5.
A set of rays in a Hilbert space can be represented by a hypergraph in which there is a vertex for each ray and each complete set of rays constitutes a hyperedge. Such a hypergraph is an example of a contextuality scenario [32]. The map c in Theorem 1 then defines a special type of 2-colouring of this hypergraph, which we will call a KS-colouring (footnote 6). A set of rays for which there does not exist a KS-colouring is known as a KS set.
A KS-colouring of a set of rays is more than simply a mathematical tool: KS-colourings define the ontic states in a KS-noncontextual ontological model. The ontic state determines the outcome of any measurement with certainty, and this choice of one deterministic outcome from each measurement is exactly a KS-colouring. The existence of a KS set in any Hilbert space of dimension greater than two then proves the KS theorem (via Theorem 1). The fact that all the rays in a Hilbert space, $\mathcal{H}$, of dimension three or more form a KS set (i.e., a proof of the KS theorem) follows from Gleason's theorem. Kochen and Specker [12], however, gave an explicit construction of a finite KS set in a three-dimensional Hilbert space, arguably providing a much simpler proof of the KS theorem than relying on Gleason's theorem (footnote 7).
Footnote 5: The notion of contextuality was extended beyond the Kochen-Specker notion to a generalised notion of contextuality in Ref. [21]. KS-noncontextuality, within this generalised framework, is recovered as a conjunction of two assumptions on ontological models of any operational theory: measurement noncontextuality and outcome determinism for projective measurements. As we will not use much of the machinery of generalised contextuality in this paper (and since we are focusing on quantum theory rather than operational theories in general), we refer the interested reader to Refs. [21,18,23,19,33,34] for discussions of generalised contextuality and its connection with KS-contextuality.
Footnote 6: Here "0" and "1" stand in for two possible "colours" that could be assigned to vertices according to the rule specified in Theorem 1.
Theorem 2 (Gleason [16]). Let $\mathcal{H}$ be a separable Hilbert space of dimension at least three. Any map $f : P(\mathcal{H}) \to [0, 1]$ with $f(\mathbb{1}) = 1$ satisfying
$f(\Pi_1) + f(\Pi_2) + \cdots = f(\Pi_1 + \Pi_2 + \cdots)$ (3)
for any set of mutually orthogonal projections $\{\Pi_1, \Pi_2, \ldots\}$ admits an expression
$f(\Pi) = \mathrm{Tr}(\Pi\rho)$ (4)
for some density operator $\rho$ on $\mathcal{H}$.
The maps f in Gleason's theorem are known as frame functions. Any KS-colouring, c, on the rays (and equivalently the rank-one projections) of a Hilbert space $\mathcal{H}$ would extend to a {0, 1}-valued frame function on $P(\mathcal{H})$ via $c(\Pi_1 + \Pi_2 + \cdots) \equiv c(\Pi_1) + c(\Pi_2) + \cdots$ for mutually orthogonal sets of rank-one projections $\{\Pi_1, \Pi_2, \ldots\}$. In dimensions greater than two, however, Gleason's theorem shows that such a frame function does not exist, hence the set of all rays is not KS-colourable.
Neither the KS theorem nor Gleason's theorem holds for a single two-level system, i.e. a qubit. The theorems do, however, hold for systems of multiple qubits. We will examine whether the onset of the applicability of these theorems is due to the presence of entanglement in the rays of the measurements or if it is due to a weaker notion of nonlocality (without entanglement) [28]. Specifically, if we describe a projective measurement on a composite Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$ by a sequence of projections $\Pi_1, \Pi_2, \ldots$, we say the measurement is unentangled if the support of each $\Pi_k$ admits a basis of product vectors, i.e. vectors $\psi = \psi_1 \otimes \cdots \otimes \psi_n \in \mathcal{H}$, where $\psi_j \in \mathcal{H}_j$ for all j. Rank-one unentangled measurements are sufficient for our argument. Each such measurement can be described by a complete set of product rays, that is, rays consisting of product vectors.
Footnote 7: Hrushovski and Pitowsky [35], however, showed that Gleason's theorem, when combined with the compactness theorem of first-order logic, implies the existence of a finite KS set. Hence, Gleason's theorem implies not only the KS theorem but something stronger, i.e., the existence of finite KS sets.
In the case of Gleason's theorem, Wallach [27] showed that for a multiqudit system in which each subsystem has dimension at least three, Gleason's theorem can be proved using only rank-one unentangled projections. Specifically, frame functions on rank-one unentangled measurements can always be described by the Born rule, as in Eq. (4). However, if even one subsystem has dimension two, the result fails to hold, i.e. there exist frame functions on unentangled projections which cannot be described as in Eq. (4).
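To address the case of the KS theorem, we will be interested in rays in $\mathbb{C}^2$, or qubit rays, and rays in $(\mathbb{C}^2)^{\otimes n}$, or n-qubit rays. Given a pair of orthogonal unit vectors $\{|0\rangle, |1\rangle\}$ in $\mathbb{C}^2$, a generic qubit ray can be expressed as
$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi}\sin\frac{\theta}{2}\,|1\rangle$ (5)
for some $0 \leq \theta \leq \pi$ and $0 \leq \phi < 2\pi$. The parameters $\theta$ and $\phi$ define a point on the Bloch sphere (see Fig. 3). The pairs of orthogonal rays in $\mathbb{C}^2$ are given exactly by the pairs of antipodal points of the Bloch sphere. The product rays in $(\mathbb{C}^2)^{\otimes n}$ are then rays admitting an expression $|\psi\rangle = |\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle$ where $|\psi_j\rangle$ are qubit rays for all $1 \leq j \leq n$.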
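The two facts used repeatedly below, that antipodal Bloch-sphere points give orthogonal qubit rays and that overlaps of product rays factorise qubit by qubit, can be illustrated numerically. This is a minimal sketch assuming the standard Bloch parametrisation $\cos(\theta/2)|0\rangle + e^{i\phi}\sin(\theta/2)|1\rangle$; the function names are ours.

```python
import numpy as np

def qubit_ket(theta, phi):
    """Unit vector representing the qubit ray at Bloch angles (theta, phi)."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def antipode(theta, phi):
    """Bloch angles of the antipodal point."""
    return np.pi - theta, (phi + np.pi) % (2 * np.pi)

def product_ket(angles):
    """Tensor product of qubit kets, one per (theta, phi) pair."""
    ket = np.array([1.0 + 0j])
    for theta, phi in angles:
        ket = np.kron(ket, qubit_ket(theta, phi))
    return ket

rng = np.random.default_rng(0)

def random_angles():
    return rng.uniform(0, np.pi), rng.uniform(0, 2 * np.pi)

# Antipodal points correspond to orthogonal rays.
theta, phi = random_angles()
overlap = np.vdot(qubit_ket(theta, phi), qubit_ket(*antipode(theta, phi)))

# Overlap of two three-qubit product rays equals the product of
# the single-qubit overlaps.
a = [random_angles() for _ in range(3)]
b = [random_angles() for _ in range(3)]
lhs = np.vdot(product_ket(a), product_ket(b))
rhs = np.prod([np.vdot(qubit_ket(*x), qubit_ket(*y)) for x, y in zip(a, b)])
print(abs(overlap), abs(lhs - rhs))
```

The factorisation is what makes orthogonality of product rays a purely local question: two product rays are orthogonal exactly when at least one pair of corresponding qubit rays is antipodal.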

Kochen-Specker theorem for multiqubit systems
In this section, we will show that any logical proof of the KS theorem on a multiqubit system requires entangled measurements, i.e.

Theorem 3. Any multiqubit Kochen-Specker set necessarily contains entangled rays.
We consider a KS-colouring $c_n$ on product rays of $(\mathbb{C}^2)^{\otimes n}$ defined in terms of a KS-colouring $c_1$ on the rays of $\mathbb{C}^2$. Specifically, $c_n(|\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle) = \prod_{j=1}^{n} c_1(|\psi_j\rangle)$. We choose $c_1$ to be a particularly easy to visualise KS-colouring but the argument follows for any choice. To specify our KS-colouring, we first define a north qubit ray:
$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi}\sin\frac{\theta}{2}\,|1\rangle$, (6)
where $0 \leq \theta < \pi/2$ and $0 \leq \phi < 2\pi$, or $\theta = \pi/2$ and $\pi < \phi \leq 2\pi$. We denote the set of north rays, depicted on the Bloch sphere in Fig. 3, by $N$.
Note that, for any north qubit ray $|\psi\rangle$, the ray $|\psi^\perp\rangle$ satisfying $\langle\psi|\psi^\perp\rangle = 0$ is not a north ray, and vice versa, i.e. every pair of orthogonal qubit rays contains exactly one north ray. This definition naturally extends to the case of n-qubit rays: An all-north n-qubit ray is a product ray $|\psi\rangle = |\psi_1\rangle \cdots |\psi_n\rangle$ such that $|\psi_j\rangle \in N$ for all $j \in \{1, \ldots, n\}$. We denote the set of all-north n-qubit rays by $N_n$.
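The north-ray rule can be phrased in terms of Bloch vectors, which avoids floating-point comparisons with $\pi/2$. In the sketch below (our own reformulation, with illustrative names) a ray with Bloch direction (x, y, z) is north if z > 0, or z = 0 and y < 0, or z = y = 0 and x > 0. This corresponds to the open upper hemisphere together with the half of the equator $\pi < \phi \leq 2\pi$, reading $\phi = 2\pi$ as the point $\phi = 0$. Only the direction of the vector matters, so exact integer vectors are used.

```python
from itertools import product

def is_north(v):
    """North test for a non-zero Bloch direction v = (x, y, z)."""
    x, y, z = v
    if z != 0:
        return z > 0      # open upper hemisphere
    if y != 0:
        return y < 0      # equator with pi < phi < 2*pi
    return x > 0          # equator point phi = 2*pi (same point as phi = 0)

def antipode(v):
    """Bloch direction of the orthogonal qubit ray."""
    return tuple(-c for c in v)

# Exhaustive check on a small integer grid of directions,
# which includes points on the equator and on the x-axis.
grid = [v for v in product(range(-2, 3), repeat=3) if v != (0, 0, 0)]
exactly_one_north = all(is_north(v) != is_north(antipode(v)) for v in grid)
print(exactly_one_north)
```

The lexicographic sign test guarantees that exactly one of v and -v passes for every non-zero v, which is the property that every pair of orthogonal qubit rays contains exactly one north ray.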
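We need the following two lemmas in order to prove our first main result:
Lemma 1. Any complete set of two-qubit product rays contains exactly one all-north ray.
Proof. A generic two-qubit product basis takes the form
$\{ |\psi_1\rangle|\psi_2\rangle,\ |\psi_1\rangle|\psi_2^\perp\rangle,\ |\psi_1^\perp\rangle|\psi_3\rangle,\ |\psi_1^\perp\rangle|\psi_3^\perp\rangle \}$ (7)
for $|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle \in \mathbb{C}^2$, noting that the systems may be swapped. Exactly one of $|\psi_1\rangle$ and $|\psi_1^\perp\rangle$ is a north ray, and the two rays in Eq. (7) containing that north ray have orthogonal second factors, exactly one of which is north. Hence exactly one of the rays in Eq. (7) is all-north.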
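Lemma 1 can be spot-checked by sampling. The sketch below assumes the generic form of a two-qubit product basis, namely a first-qubit ray and its orthogonal partner, each paired with its own orthogonal pair on the second qubit, and represents qubit rays by integer Bloch directions; the helper names and the Bloch-vector form of the north test are ours.

```python
import random

def is_north(v):
    """North test for a non-zero Bloch direction v = (x, y, z)."""
    x, y, z = v
    if z != 0:
        return z > 0
    if y != 0:
        return y < 0
    return x > 0

def neg(v):
    """Bloch direction of the orthogonal qubit ray."""
    return tuple(-c for c in v)

def two_qubit_product_basis(p1, p2, p3):
    """Generic product basis built from three qubit rays."""
    return [(p1, p2), (p1, neg(p2)), (neg(p1), p3), (neg(p1), neg(p3))]

def count_all_north(basis):
    """Number of product rays in `basis` whose factors are all north."""
    return sum(all(is_north(v) for v in ray) for ray in basis)

random.seed(1)

def random_direction():
    """Random non-zero integer direction (equatorial cases included)."""
    while True:
        v = tuple(random.randint(-3, 3) for _ in range(3))
        if v != (0, 0, 0):
            return v

counts = [
    count_all_north(
        two_qubit_product_basis(
            random_direction(), random_direction(), random_direction()
        )
    )
    for _ in range(1000)
]
print(set(counts))
```

A random search of this kind is of course no substitute for the proof; it only illustrates that the count does not depend on how the three rays are chosen.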
The proof of the following lemma is based on that of Prop. 1 in [36].

Lemma 2.
Any complete set of n-qubit product rays contains exactly one all-north ray.
Proof. We will prove this statement by induction on the number of qubits.
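Firstly, an n-qubit complete set can contain at most one all-north ray since no two all-north rays are orthogonal. Now, assume the result holds for m qubits and consider a complete set of (m + 1)-qubit product rays
$P = \{ |\Psi_j\rangle|\psi_j\rangle : j = 1, \ldots, 2^{m+1} \}$, (8)
where $|\Psi_j\rangle \in (\mathbb{C}^2)^{\otimes m}$ and $|\psi_j\rangle \in \mathbb{C}^2$ for $j \in \{1, \ldots, 2^{m+1}\}$. Let B denote the set of distinct rays of the final qubit, i.e.
$B = \{ |\psi\rangle \in \mathbb{C}^2 : |\Psi\rangle|\psi\rangle \in P \text{ for some } |\Psi\rangle \in (\mathbb{C}^2)^{\otimes m} \}$, (9)
noting that we may have $|\psi_j\rangle = |\psi_k\rangle$ for $j \neq k$, and let E be a maximal set of non-orthogonal rays from B, i.e. each ray in $B \setminus E$ is orthogonal to some ray in E. Denote by J the subset of $\{1, \ldots, 2^{m+1}\}$ such that $|\psi_j\rangle \in E$ for all $j \in J$. For a qubit ray $|\psi\rangle$, let $\mu(|\psi\rangle)$ denote the number of indices j for which $|\psi_j\rangle = |\psi\rangle$, and write
$\mu(E) = \sum_{|\psi\rangle \in E} \mu(|\psi\rangle), \qquad \mu^\perp(E) = \sum_{|\psi\rangle \in E} \mu(|\psi^\perp\rangle)$, (10)
so that $\mu(E) = |J|$. We may assume (without loss of generality) that the set E is chosen such that
$\mu(|\psi\rangle) \geq \mu(|\psi^\perp\rangle)$ (11)
for all $|\psi\rangle \in E$, since if $\mu(|\psi\rangle) < \mu(|\psi^\perp\rangle)$ then $|\psi\rangle$ could be replaced by $|\psi^\perp\rangle$ in E without violating the requirements on the set E. The maximality of E implies that for every j either $|\psi_j\rangle \in E$ or $|\psi_j^\perp\rangle \in E$, hence
$\mu(E) + \mu^\perp(E) = 2^{m+1}$, (12)
and $\mu(E) \geq \mu^\perp(E)$ by the assumption in Eq. (11).
If $|\psi_j\rangle, |\psi_k\rangle \in E$ for $j \neq k$ then $\langle\Psi_j|\Psi_k\rangle = 0$, since $\langle\psi_j|\psi_k\rangle \neq 0$. Hence the set
$\{ |\Psi_j\rangle : j \in J \}$ (13)
comprises $\mu(E)$ mutually orthogonal rays of $(\mathbb{C}^2)^{\otimes m}$, so that
$\mu(E) \leq 2^m$ (14)
and, by Eq. (12), $\mu(E) \leq \mu^\perp(E)$. Therefore, we find
$\mu(E) = \mu^\perp(E) = 2^m$. (15)
Furthermore, it then follows from Eq. (11) that
$\mu(|\psi\rangle) = \mu(|\psi^\perp\rangle)$ (16)
for all $|\psi\rangle \in E$ and hence for all $|\psi\rangle \in B$ (since, by the maximality of E, if $|\psi\rangle \in B$ is not contained in E then $|\psi^\perp\rangle \in E$). The definition of $\mu$ is independent of E, and we find that assumption (11) holds with equality for any choice of E. We may now choose E to consist entirely of north rays: given any maximal set E of non-orthogonal rays from B we have by Eq. (16) that $|\psi^\perp\rangle$ is in B for any $|\psi\rangle \in E$ and therefore all $|\psi\rangle \in E$ that are not north rays can be exchanged with the north rays $|\psi^\perp\rangle$ to produce the all-north set E.
We also find from Eq. (15) that $\mu(E) = 2^m = |J|$ so that the set of Eq. (13) is a complete set of product rays in $(\mathbb{C}^2)^{\otimes m}$. By our assumption (for the proof by induction), this m-qubit complete set contains exactly one all-north ray, say $|\Psi_k\rangle$. The ray $|\Psi_k\rangle|\psi_k\rangle \in P$ is then an all-north ray since $k \in J$ implies that $|\psi_k\rangle \in E$, which is an all-north set of rays. Since P can contain at most one all-north ray, it contains exactly one all-north ray.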
By Lemma 1, since the statement of Lemma 2 holds for n = 2, it holds for all n ∈ N by induction.
We can now prove Theorem 3.
Proof. We will show the contrapositive: the contextuality scenario generated by all n-qubit product rays is KS-colourable. Define $c_n : \Sigma((\mathbb{C}^2)^{\otimes n}) \to \{0, 1\}$ as follows:
$c_n(|\psi\rangle) = 1$ if $|\psi\rangle \in N_n$, and $c_n(|\psi\rangle) = 0$ otherwise. (17)
By Lemma 2, every complete set of n-qubit product rays contains exactly one all-north ray, hence $c_n$ defines a KS-colouring on the contextuality scenario generated by all n-qubit product rays.
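The colouring used in this proof is simple enough to evaluate by hand or by machine. The sketch below implements $c_n$ as the product of single-qubit colours and checks that the colours sum to one on two three-qubit product bases: the computational basis, and the nonlocal basis of Eq. (1) with the last four vectors supplied by us as an assumed completion (they are not given in the text). Qubit rays are encoded by label characters mapped to Bloch directions; all names are illustrative.

```python
from itertools import product
from math import prod

# Bloch directions of the four single-qubit rays used in the labels.
BLOCH = {"0": (0, 0, 1), "1": (0, 0, -1), "+": (1, 0, 0), "-": (-1, 0, 0)}

def c1(v):
    """Single-qubit colouring: 1 on north rays, 0 otherwise."""
    x, y, z = v
    if z != 0:
        return int(z > 0)
    if y != 0:
        return int(y < 0)
    return int(x > 0)

def cn(label):
    """n-qubit colouring of a product ray: product of single-qubit colours."""
    return prod(c1(BLOCH[ch]) for ch in label)

computational = ["".join(bits) for bits in product("01", repeat=3)]
# First four labels as in Eq. (1); last four are an ASSUMED completion.
nonlocal_basis = ["000", "+10", "0+1", "10+", "111", "-10", "0-1", "10-"]

colour_sums = [sum(cn(label) for label in basis)
               for basis in (computational, nonlocal_basis)]
print(colour_sums)
```

In both bases the unique ray coloured 1 is |000>, since |0> and |+> are north while |1> and |-> are not.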

A Kochen-Specker noncontextual ontological model for unentangled n-qubit systems
In the previous section we demonstrated the existence of KS-colourings of the unentangled rays of n-qubit systems. Such colourings define ontic states in a KS-noncontextual ontological model. However, the existence of such ontic states alone is insufficient to conclude that some quantum statistics can be reproduced by such a model. In this section we will show how these ontic states can, indeed, yield a KS-noncontextual ontological model for the fragment of quantum theory consisting of n-qubit product states and n-qubit unentangled measurements. We do so via an extension of the single qubit model of Kochen and Specker [12] using the formulation of Leifer [37].
Before we proceed further, we note that if we were to restrict our analysis to measurements that can be performed locally on each qubit (either only measurements in direct product bases, see Eq. (42) or Ref. [47], or, more generally, those that can be implemented via LOCC), construction of a KS-noncontextual model would not require the results from the previous section.
However, we are considering the strictly larger class of all unentangled measurements (see Sec. 2) including those that cannot be implemented via LOCC, for example, a measurement in the basis of Eq. (1). Thus, our KS-noncontextual ontological model requires KS-colourings of all unentangled multiqubit measurement bases and not only those implementable via LOCC. With these clarifications out of the way, we can now proceed to detail our construction.
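Given a unit vector $|\psi\rangle$ in Hilbert space we will denote by $[\psi]$ the rank-one projection onto the subspace spanned by $|\psi\rangle$. The ontic state space for the model is given by $\Lambda = S^2_1 \times \cdots \times S^2_n$, where $S^2_j$ is a two-dimensional sphere representing the ontic state space of the jth qubit (j = 1, 2, . . . , n) and $\times$ is the Cartesian product. We represent each n-qubit ontic state $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n) \in \Lambda$ by a single vector $\vec{\lambda}$ in a real vector space $\mathbb{R}^{3n}$:
$\vec{\lambda} = (\vec{\lambda}_1, \ldots, \vec{\lambda}_n), \qquad \vec{\lambda}_j = (\sin\theta_j\cos\varphi_j,\ \sin\theta_j\sin\varphi_j,\ \cos\theta_j)$, (18)
where $0 \leq \theta_j \leq \pi$ and $-\pi < \varphi_j \leq \pi$ are the coordinates of $\lambda$ in the jth copy of $S^2$. Similarly, we may describe an unentangled rank-one projection $[\psi] = [\psi_1 \otimes \cdots \otimes \psi_n]$ by a vector
$\vec{\psi} = (\vec{\psi}_1, \ldots, \vec{\psi}_n), \qquad \vec{\psi}_j = (\sin\theta^\psi_j\cos\varphi^\psi_j,\ \sin\theta^\psi_j\sin\varphi^\psi_j,\ \cos\theta^\psi_j)$, (19)
where $0 \leq \theta^\psi_j \leq \pi$ and $-\pi < \varphi^\psi_j \leq \pi$ are the spherical coordinates of $\psi_j$ on the jth sphere $S^2$.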
The following inequalities will be useful in proving that our ontological model reproduces quantum theory: for all unentangled n-qubit projections $[\psi]$ and $\lambda \in \Lambda$, where $H_{0,1} : \mathbb{R} \to \{0, 1\}$ are two conventions for the Heaviside step function, namely $H_0(x) = 1$ for $x > 0$ and $H_0(x) = 0$ otherwise, while $H_1(x) = 1$ for $x \geq 0$ and $H_1(x) = 0$ otherwise.
We now define the epistemic states of the model. Given that a product state $[\chi] = [\chi_1 \otimes \chi_2 \otimes \cdots \otimes \chi_n]$ (which can be described by a vector as in Eq. (19) with parameters $0 \leq \theta^\chi_j \leq \pi$ and $-\pi < \varphi^\chi_j \leq \pi$) is prepared, the probability measure over the ontic states of the system is given by Thus the probability of observing outcome $[\psi]$ of a measurement M performed on a system prepared in quantum state $[\chi]$ is given by Since the terms of the integrand are all nonnegative, it follows from Eq.
For both of these bounds ($y \in \{0, 1\}$) we find We can now evaluate each term in this product for both cases y = 0 and 1, following the method of Ref. [37] (where the y = 0 case is considered), to find for both y = 0, 1. The argument goes as follows. Firstly, we choose our coordinates $\theta_j$ and $\varphi_j$ such that $\vec{\chi}_j = (1, 0, 0)$ and $\vec{\psi}_j = (\cos\phi, \sin\phi, 0)$ for some $-\pi < \phi \leq \pi$. Then since $\vec{\lambda}_j = (\sin\theta_j\cos\varphi_j, \sin\theta_j\sin\varphi_j, \cos\theta_j)$, we find The integrand of Eq. (32) is non-zero when $\vec{\chi}_j \cdot \vec{\lambda}_j$ is positive and when $\vec{\psi}_j \cdot \vec{\lambda}_j$ is positive (non-negative) for the case y = 0 (y = 1). These conditions are achieved for y = 0 when $-\pi/2 < \varphi_j < \pi/2$ and $-\pi/2 + \phi < \varphi_j < \pi/2 + \phi$, and similarly for y = 1 but when the latter inequalities are no longer strict, i.e. $-\pi/2 + \phi \leq \varphi_j \leq \pi/2 + \phi$. Since the integrals over a closed or open interval are equal, in both cases y = 0 or 1, if $\phi$ is non-negative we find We find the same value if $\phi$ is negative. Finally, we have shown The fact that this extension is well-defined follows from Lemma 2. Thus the model reproduces the predictions of quantum theory for n-qubit product states and unentangled measurements.
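The single-qubit factor in this argument can be checked numerically. The sketch below assumes the standard Kochen-Specker form for one qubit, which is our assumption about the factor being integrated: an epistemic density $\frac{1}{\pi} H(\vec{\chi}\cdot\vec{\lambda})\,\vec{\chi}\cdot\vec{\lambda}$ on the sphere and a Heaviside response $H_y(\vec{\psi}\cdot\vec{\lambda})$. With $\vec{\chi} = (1,0,0)$ and $\vec{\psi} = (\cos\phi, \sin\phi, 0)$ the integral should equal $(1 + \cos\phi)/2$, the Born-rule overlap, for either Heaviside convention. The function name and grid sizes are illustrative.

```python
import numpy as np

def ks_overlap(angle, n_theta=400, n_phi=800, strict=True):
    """Midpoint-rule estimate of the single-qubit model probability.

    ASSUMED model: density (1/pi) * max(chi.lam, 0) and response
    H_0 (strict=True) or H_1 (strict=False) applied to psi.lam.
    """
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    phi = -np.pi + (np.arange(n_phi) + 0.5) * 2 * np.pi / n_phi
    T, P = np.meshgrid(theta, phi, indexing="ij")
    lam = np.stack([np.sin(T) * np.cos(P), np.sin(T) * np.sin(P), np.cos(T)])

    chi = np.array([1.0, 0.0, 0.0])
    psi = np.array([np.cos(angle), np.sin(angle), 0.0])
    chi_dot = np.tensordot(chi, lam, axes=1)
    psi_dot = np.tensordot(psi, lam, axes=1)

    density = np.where(chi_dot > 0, chi_dot, 0.0) / np.pi
    response = (psi_dot > 0) if strict else (psi_dot >= 0)
    area = np.sin(T) * (np.pi / n_theta) * (2 * np.pi / n_phi)
    return float(np.sum(density * response * area))

angles = [0.0, np.pi / 3, 2.0, -1.0]
estimates = [ks_overlap(a) for a in angles]
targets = [(1 + np.cos(a)) / 2 for a in angles]
print(np.round(estimates, 3), np.round(targets, 3))
```

The two conventions differ only on a set of measure zero (the great circle $\vec{\psi}\cdot\vec{\lambda} = 0$), which is why the closed and open intervals give the same integral.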
Note that for the case n = 1, this ontological model is equivalent to the Kochen-Specker model for projective measurements and pure states of a single qubit [12,37]. Our generalization, however, is nontrivial because it crucially relies on Lemma 2 to define the n-qubit response function of Eq. (20). For a single qubit, on the other hand, Lemma 2 is trivial because every qubit state appears in exactly one basis which already implies that each basis contains exactly one north state.
The model can be extended to include mixtures of product states, e.g. $\sum_{i=1}^{N} q_i [\chi_i]$, for product states $|\chi_i\rangle$ by simply defining the probability measure for such a mixture as the same mixture of probability measures, i.e. $\sum_{i=1}^{N} q_i \mu_{\chi_i}$. However, note that such an extension is preparation contextual, in the sense that there is no fixed probability measure for a given separable density operator since different decompositions of the same density operator as a mixture of product states result in different probability measures. Indeed, there exists no preparation noncontextual extension to mixed states, as follows from the impossibility of a preparation noncontextual ontological model for single qubit mixed states [21].
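A one-qubit example makes the preparation contextuality of this extension explicit. The sketch assumes the standard Kochen-Specker single-qubit density $\frac{1}{\pi} H(\vec{\chi}\cdot\vec{\lambda})\,\vec{\chi}\cdot\vec{\lambda}$ (our assumption for the form of $\mu_\chi$) and compares two decompositions of the maximally mixed state: an equal mixture of the Z eigenstates and an equal mixture of the X eigenstates. Names are illustrative.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def ks_density(chi, lam):
    """ASSUMED single-qubit epistemic density at ontic state lam."""
    return max(float(np.dot(chi, lam)), 0.0) / np.pi

def mixture_density(ensemble, lam):
    """Density of a mixture: the same mixture of the pure-state densities."""
    return sum(q * ks_density(np.array(chi), lam) for q, chi in ensemble)

def density_operator(ensemble):
    """Density operator sum_i q_i (1 + chi_i . sigma) / 2 of an ensemble."""
    rho = np.zeros((2, 2), dtype=complex)
    for q, chi in ensemble:
        rho += q * (np.eye(2) + sum(c * s for c, s in zip(chi, PAULI))) / 2
    return rho

z_ensemble = [(0.5, (0, 0, 1)), (0.5, (0, 0, -1))]
x_ensemble = [(0.5, (1, 0, 0)), (0.5, (-1, 0, 0))]

same_state = np.allclose(density_operator(z_ensemble),
                         density_operator(x_ensemble))
north_pole = np.array([0.0, 0.0, 1.0])
mu_z = mixture_density(z_ensemble, north_pole)
mu_x = mixture_density(x_ensemble, north_pole)
print(same_state, mu_z, mu_x)
```

Both ensembles give the density operator 1/2, yet at the north pole the first mixture has density $1/(2\pi)$ and the second has density zero, so no single probability measure can be attached to the density operator alone.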
An immediate consequence of the existence of this KS-noncontextual ontological model is that it allows us to obtain a tight statement on the relationship between entanglement and KS-contextuality in multiqubit systems. Namely, entanglement is necessary for any proof of the KS theorem, whether logical [12,18] or statistical [20,19]. Theorem 3 showed that multiqubit entangled measurements are necessary for logical proofs of the KS theorem. Our construction of the KS-noncontextual ontological model implies that even for statistical proofs of the KS theorem on multiqubit systems, one requires entanglement, either in the state or in the measurements. A well-known instance of such a statistical proof is the violation of Bell inequalities by implementing local (hence, unentangled) measurements on an entangled multiqubit state. One can also construct statistical proofs of the KS theorem (where there exist KS-colourings of the measurements involved) in which the quantum state is a product state and there is entanglement in the measurements. For example, given a multiqubit entangled state that violates a Bell inequality, one can apply a global unitary to the state, making it a product state, and then apply the same unitary to all the measurements involved in the violation. This transformation preserves the probabilities, the orthogonality relations, and hence the statistical proof, whilst inevitably making some of the measurements entangled.

The KS-noncontextual ontological model constructed above applies to unentangled measurements and product pure states of multiqubit systems. The model straightforwardly extends to separable states, albeit in a preparation contextual way. It follows that to obtain a multiqubit statistical proof of the Kochen-Specker theorem from unentangled measurements one must employ an entangled multiqubit state.$^{8}$ Is entanglement of the multiqubit state, however, sufficient to yield such a statistical proof? In this section, we find the answer to be negative, i.e. there must exist a KS-noncontextual ontological model that includes, besides separable states and unentangled projective measurements, certain entangled states. Specifically, we find that the multiqubit states that can demonstrate KS-contextuality with unentangled measurements (in a finite contextuality scenario) are exactly those that can violate a Bell inequality with local projective measurements.
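The remark that a global unitary can move the entanglement from the state into the measurements can be checked numerically. The following is a minimal sketch under stated assumptions: the state $|\Phi^+\rangle$, the sixteen product rays $|ab\rangle$ with $a,b\in\{0,1,+,-\}$ and the unitary `W` are illustrative choices, not taken from the text.

```python
import itertools
import numpy as np

def is_entangled(vec, tol=1e-9):
    """A two-qubit pure state is entangled iff the determinant of its
    2x2 coefficient matrix is nonzero (Schmidt rank 2)."""
    return abs(np.linalg.det(vec.reshape(2, 2))) > tol

s = 1 / np.sqrt(2)
local = {
    "0": np.array([1, 0], dtype=complex),
    "1": np.array([0, 1], dtype=complex),
    "+": np.array([s, s], dtype=complex),
    "-": np.array([s, -s], dtype=complex),
}

# Illustrative entangled state and the 16 product rays |ab>.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) * s
rays = [np.kron(local[a], local[b]) for a, b in itertools.product(local, repeat=2)]

# Global unitary W = (CNOT (H x I))^dagger, which maps |Phi+> to |00>.
H = np.array([[s, s], [s, -s]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
W = (CNOT @ np.kron(H, np.eye(2))).conj().T

# Apply the same unitary to the state and to every measurement ray.
new_state = W @ phi_plus
new_rays = [W @ r for r in rays]

probs_before = np.array([abs(np.vdot(r, phi_plus)) ** 2 for r in rays])
probs_after = np.array([abs(np.vdot(r, new_state)) ** 2 for r in new_rays])
gram_before = np.array([[np.vdot(r, t) for t in rays] for r in rays])
gram_after = np.array([[np.vdot(r, t) for t in new_rays] for r in new_rays])
n_entangled_rays_after = sum(is_entangled(r) for r in new_rays)
```

The probabilities and the Gram matrix (hence all orthogonality relations) are unchanged, the state becomes a product state, and some of the transformed rays become entangled.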
The fact that there exist entangled multiqubit states yielding a statistical proof of the KS theorem with only unentangled measurements follows from the existence of entangled multiqubit states that violate Bell inequalities with only local projective measurements. This implication can be seen from both the more traditional observable-based perspective [39], as well as the event-based perspective [32,40].
In the observable-based perspective we consider the local observables of each party in a Bell experiment. The set of local observables corresponding to one choice of setting for each party commute pairwise and, thus, form a context. We may then consider the KS-noncontextual ontological models respecting all such contexts. The mathematical constraints defining these models in this situation are equivalent to the constraints imposed by Bell's assumption of local causality in the original Bell scenario.$^{9}$ Hence, a quantum violation of a Bell inequality using projective measurements also provides a set of quantum statistics deriving from local, and hence unentangled, measurements that are incompatible with a KS-noncontextual ontological model.

$^{8}$ Note the contrast between preparation contextuality and KS-contextuality for multiqubit systems: the former requires no entanglement, but for the latter entanglement is necessary. It is known that any preparation noncontextual ontological model of a fragment of quantum theory is also necessarily KS-noncontextual, but not conversely [38,23]. We can thus conclude that in any ontological model of multiqubit systems, KS-contextuality implies both entanglement and preparation contextuality.
To prove our result the event-based perspective will be more convenient. In order to see the connection between Bell scenarios and KS-contextuality in this perspective, one needs to go beyond the idea of KS-colourings of contextuality scenarios to that of probabilistic models on contextuality scenarios, following the framework of Ref. [32].
In the following subsection, we define probabilistic models on contextuality scenarios and related notions, then show how a proof of Bell's theorem yields a statistical proof of the KS theorem (a known connection). In the next subsection, we then show that the converse relationship also holds for multiqubit systems and unentangled measurements. Thus, we arrive at the second main contribution of this work: a multiqubit entangled state can yield a statistical proof of the KS theorem with unentangled measurements if and only if it violates a Bell inequality with local projective measurements.

Bell implies KS
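A probabilistic model is a probability assignment to the vertices of a contextuality scenario (i.e. a hypergraph) such that the probabilities assigned in each hyperedge sum to one. Explicitly, given a contextuality scenario $H$ with vertices $V(H)$ and hyperedges $E(H)$, a probabilistic model is a map $p : V(H) \to [0,1]$ such that $\sum_{v\in e} p(v) = 1$ for all $e \in E(H)$. A probabilistic model is deterministic if $p(v)\in\{0,1\}$ for all $v\in V(H)$, i.e. if it is a KS-colouring, and classical if it is a convex combination of deterministic models. A convex combination of two probabilistic models $p$ and $p'$ is the probabilistic model $q(v) = \omega p(v) + (1-\omega)p'(v)$ for some $\omega \in [0,1]$ and all $v \in V(H)$.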
A probabilistic model $p$ is quantum if and only if for some separable Hilbert space $\mathcal{H}$ there exists (i) a projection $\Pi_v$ for every $v \in V(H)$ such that $\sum_{v\in e}\Pi_v = I_{\mathcal{H}}$ for all $e \in E(H)$, and (ii) a density operator $\rho$ on $\mathcal{H}$, such that $p(v) = \mathrm{Tr}(\Pi_v\rho)$.
A quantum model that is not classical is said to provide a statistical proof of the KS theorem [20,19]. Such nonclassicality can be witnessed by the violation of a Bell-KS inequality that bounds the polytope of classical models, e.g. the KCBS inequality [20]. On the other hand, if a contextuality scenario admits no classical models (hence no KS-colourings) but it admits a quantum model, we have a logical proof of the KS theorem [12,18], which demonstrates a stronger form of KS-contextuality: not only does there exist a quantum model outside the set of classical models, the set of classical models is, in fact, empty, i.e. no KS inequalities exist and every probabilistic model (hence every quantum model) on the contextuality scenario fails to be classical.
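As an illustration of a statistical proof, the KCBS violation on a single qutrit can be reproduced in a few lines. This is a sketch under assumptions: it uses the orthogonality-graph form of the inequality, $\sum_{k} p(v_k) \le 2$ for five rays with cyclically adjacent rays orthogonal, rather than the full hypergraph with the completing rays, and the particular vectors and state are the usual pentagram choice, not taken from the text.

```python
import itertools
import numpy as np

# Five real unit vectors in R^3 with v_k orthogonal to v_{k+1} (indices mod 5).
cos_sq = np.cos(np.pi / 5) / (1 + np.cos(np.pi / 5))  # cos^2 of the polar angle
theta = np.arccos(np.sqrt(cos_sq))
vectors = [
    np.array([
        np.sin(theta) * np.cos(4 * np.pi * k / 5),
        np.sin(theta) * np.sin(4 * np.pi * k / 5),
        np.cos(theta),
    ])
    for k in range(5)
]
psi = np.array([0.0, 0.0, 1.0])  # state along the symmetry axis

adjacent_overlaps = [float(np.dot(vectors[k], vectors[(k + 1) % 5])) for k in range(5)]

# Quantum value of the KCBS expression: sum of Born probabilities.
quantum_value = float(sum(np.dot(psi, v) ** 2 for v in vectors))

# Classical bound: best 0/1 assignment in which two orthogonal
# (adjacent) rays are never both assigned the value one.
classical_bound = max(
    sum(bits)
    for bits in itertools.product((0, 1), repeat=5)
    if all(not (bits[k] and bits[(k + 1) % 5]) for k in range(5))
)
```

The quantum value equals $\sqrt{5}$, which exceeds the classical bound of 2, so the corresponding quantum model is not classical.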
A Bell scenario is an experiment in which $n$ parties each perform a measurement $x_r$ and observe an output $a_r$, for $1 \le r \le n$, on some system such that the parties' individual experiments are space-like separated. We call a measurement on the entire system performed by all $n$ parties a global measurement. Each global measurement is specified by a list of the local measurement settings $\mathbf{x} = x_1 x_2 \ldots x_n$ and has a set of possible outcomes $\mathbf{a}|\mathbf{x}$, where $\mathbf{a} = a_1 a_2 \ldots a_n$. We consider the case in which each local measurement has two possible outcomes, i.e. $a_r \in \{0,1\}$ for all $1 \le r \le n$. This is the natural setting for a multiqubit Bell experiment where the parties implement local projective measurements.
A behaviour in a Bell scenario is a probability distribution, $p(\mathbf{a}|\mathbf{x})$, on the measurement outcomes. A local behaviour is a behaviour that can be decomposed into a convex combination of local deterministic behaviours, i.e. of behaviours $p(\mathbf{a}|\mathbf{x}) = p_1(a_1|x_1)\cdots p_n(a_n|x_n)$, where $p_r(a_r|x_r) \in \{0,1\}$ denotes the probability of party $r$ observing $a_r$ given they performed measurement $x_r$, for all $1 \le r \le n$.
A behaviour that can be realised by each party performing a projective measurement on some subsystem of a quantum system is called a projective quantum behaviour. For a projective quantum behaviour, $p(\mathbf{a}|\mathbf{x})$, there exists a separable Hilbert space, $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$, a projective measurement $\{\Pi^{r,x_r}_{a_r}\}_{a_r}$ on $\mathcal{H}_r$ for each measurement setting $x_r$ of the $r$-th party for all $1 \le r \le n$, and a density operator $\rho$ on $\mathcal{H}$ such that $p(\mathbf{a}|\mathbf{x}) = \mathrm{Tr}\big(\bigotimes_{r=1}^{n} \Pi^{r,x_r}_{a_r}\,\rho\big)$.

A Bell scenario can be mapped to a contextuality scenario which has (i) a vertex for every possible global measurement outcome $\mathbf{a}|\mathbf{x}$ for all $\mathbf{a}$ and $\mathbf{x}$, (ii) a hyperedge consisting of all the possible outcomes $\mathbf{a}|\mathbf{x}$ of a fixed global measurement $\mathbf{x}$, i.e. each set $\{(\mathbf{a}|\mathbf{x}) \mid a_r \in \{0,1\} \text{ for all } 1 \le r \le n\}$ is a hyperedge, and (iii) a hyperedge deriving from each no-signalling condition. The no-signalling hyperedges consist of sets of possible outcomes of (hypothetical) adaptive measurements in which one party performs a measurement first and, depending on their outcome, a second party selects a measurement setting, and so on [32]. For example, the hyperedge $\{00|00, 01|00, 10|01, 11|01\}$ features in the hypergraph of the two-party Bell scenario with binary inputs and outputs. This hyperedge can be thought of as the outcomes of an adaptive measurement protocol in which the first party always chooses measurement setting 0, and the second party chooses the measurement setting 0 if the first party obtains outcome 0 and 1 if they obtain 1.$^{10}$

There is a bijection between local behaviours in the Bell scenario and classical models in the corresponding contextuality scenario. Furthermore, each quantum behaviour deriving from projective measurements in the Bell scenario defines a quantum model in the contextuality scenario.
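The mapping from a Bell scenario to a contextuality scenario can be made concrete for two parties with binary inputs and outputs. The sketch below is an illustration under assumptions: the helper names (`vertex`, `deterministic_models`) are hypothetical, and the adaptive hyperedges are generated by letting one party measure first and the other party's setting depend on the first outcome, as described above.

```python
import itertools

SETTINGS = (0, 1)
OUTCOMES = (0, 1)

def vertex(a1, a2, x1, x2):
    """Label of the event 'outcomes a1 a2 given settings x1 x2'."""
    return f"{a1}{a2}|{x1}{x2}"

vertices = [
    vertex(a1, a2, x1, x2)
    for x1, x2 in itertools.product(SETTINGS, repeat=2)
    for a1, a2 in itertools.product(OUTCOMES, repeat=2)
]

# Adaptive hyperedges: one party measures first with a fixed setting and the
# other party's setting is a function (rule) of the first party's outcome.
# Constant rules reproduce the ordinary global-measurement hyperedges.
edges = set()
for first_setting in SETTINGS:
    for rule in itertools.product(SETTINGS, repeat=2):
        edges.add(frozenset(
            vertex(a1, a2, first_setting, rule[a1])
            for a1 in OUTCOMES for a2 in OUTCOMES))
        edges.add(frozenset(
            vertex(a1, a2, rule[a2], first_setting)
            for a1 in OUTCOMES for a2 in OUTCOMES))

def deterministic_models(vertices, edges):
    """Brute-force all 0/1 assignments with exactly one 1 per hyperedge."""
    models = set()
    for bits in itertools.product((0, 1), repeat=len(vertices)):
        colouring = dict(zip(vertices, bits))
        if all(sum(colouring[v] for v in e) == 1 for e in edges):
            models.add(frozenset(v for v in vertices if colouring[v]))
    return models

# Local deterministic behaviours: party 1 answers A[x1], party 2 answers B[x2].
local_deterministic = {
    frozenset(vertex(A[x1], B[x2], x1, x2)
              for x1, x2 in itertools.product(SETTINGS, repeat=2))
    for A in itertools.product(OUTCOMES, repeat=2)
    for B in itertools.product(OUTCOMES, repeat=2)
}

models = deterministic_models(vertices, edges)
```

In this small case the deterministic models of the hypergraph coincide with the sixteen local deterministic behaviours, which is the two-party instance of the bijection stated above.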
A collection of multiqubit projective measurements and a density operator that violate a Bell inequality then generate a quantum model on the corresponding contextuality scenario that is not a classical model and thus yields a statistical proof of the KS theorem. For a detailed account, see Ref. [32].
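A standard instance is the CHSH scenario. The following sketch computes a projective quantum behaviour $p(\mathbf{a}|\mathbf{x})$ from local projectors and compares its CHSH value with the maximum over local deterministic behaviours. The state and observables are the usual textbook choices and are assumptions here, not taken from the text.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def projector(observable, outcome):
    """Projector onto the (-1)**outcome eigenspace of a +/-1-valued observable."""
    return (I2 + (-1) ** outcome * observable) / 2

alice = [Z, X]
bob = [(Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)]
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = np.outer(phi_plus, phi_plus)

events = list(itertools.product((0, 1), repeat=4))  # (a1, a2, x1, x2)

# Projective quantum behaviour p(a|x) = Tr[(Pi_a1^x1 (x) Pi_a2^x2) rho].
behaviour = {
    (a1, a2, x1, x2): float(np.trace(
        np.kron(projector(alice[x1], a1), projector(bob[x2], a2)) @ rho))
    for a1, a2, x1, x2 in events
}

def chsh(p):
    """CHSH expression E00 + E01 + E10 - E11 evaluated on a behaviour p."""
    value = 0.0
    for x1, x2 in itertools.product((0, 1), repeat=2):
        corr = sum((-1) ** (a1 + a2) * p[(a1, a2, x1, x2)]
                   for a1, a2 in itertools.product((0, 1), repeat=2))
        value += -corr if (x1, x2) == (1, 1) else corr
    return value

def local_deterministic(A, B):
    """Behaviour in which party 1 outputs A[x1] and party 2 outputs B[x2]."""
    return {(a1, a2, x1, x2): float(a1 == A[x1] and a2 == B[x2])
            for a1, a2, x1, x2 in events}

local_bound = max(
    chsh(local_deterministic(A, B))
    for A in itertools.product((0, 1), repeat=2)
    for B in itertools.product((0, 1), repeat=2)
)
quantum_value = chsh(behaviour)
```

Since the quantum value $2\sqrt{2}$ exceeds the local bound of 2, the behaviour is not a convex combination of local deterministic behaviours, and the induced quantum model on the contextuality scenario is not classical.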

KS implies Bell
We have seen that states that can violate Bell inequalities also yield statistical proofs of the KS theorem with unentangled measurements. We now show that the converse relationship also holds. Namely, for any collection of product multiqubit rays and an entangled state which yield a statistical proof of the KS theorem, the entangled state necessarily violates a Bell inequality.
The proof will proceed along the following lines. Consider a hypergraph $H$ with a nonclassical quantum model given by some multiqubit product rays, $S$, and a density operator $\rho$. We will expand the set of multiqubit rays $S$ to a set of rays $\bar{S}$ that correspond to all the outcomes of a quantum measurement strategy in some Bell scenario, $B(S)$. For example, if $S$ consisted of $\{|00\rangle, |{+}1\rangle, |0{+}\rangle\}$ it would be expanded to $\bar{S} = \{|ab\rangle \mid a, b \in \{0, 1, +, -\}\}$, which contains all the rays corresponding to the outcomes of measurements in a two-party Bell scenario where each party has two possible measurement settings given by $\{|0\rangle, |1\rangle\}$ and $\{|{+}\rangle, |{-}\rangle\}$. We can then extend the hypergraph, $H$, to the hypergraph $G \supseteq H$ generated by this extended set of rays and their orthogonality relations. We can also extend the quantum model on $H$ given by $S$ and $\rho$ to a quantum model on $G$ given by $\bar{S}$ and $\rho$; this quantum model on $G$ continues to be non-classical. Let $H'$ be the hypergraph corresponding to the Bell scenario $B(S)$. The hypergraph $G$ will contain all the vertices and hyperedges of the hypergraph $H'$ but possibly with additional hyperedges deriving from nonlocal bases in $\bar{S}$, such as the basis in Eq. (1). The set $\bar{S}$ and state $\rho$ also give a quantum model on the Bell hypergraph $H'$, since every hyperedge of $H'$ is contained in $G$. Finally, we will show, following [45], that the sets of classical models on $G$ and $H'$ are identical despite the difference in hyperedges. Thus, the non-classical model on $G$ is also non-classical on $H'$, meaning $\rho$ yields a non-local, projective quantum behaviour in $B(S)$. This leads us to the following theorem:

Theorem 4. Any multiqubit density operator, ρ, that can yield a statistical proof of the KS theorem with a finite set of unentangled projective measurements can violate a Bell inequality with local projective measurements.
A full proof of Theorem 4 is given in Appendix B.
It follows that it must be possible to extend our KS-noncontextual ontological model to include, besides separable states, multiqubit entangled states (e.g. Werner states) that cannot yield Bell violations when subjected to local projective measurements. We leave the construction of such a multiqubit KS-noncontextual ontological model as an open problem for future work.

Do we need fully entangled bases?
Our initial question concerning the necessity of entanglement in any multiqubit logical proof of the KS theorem [12] was motivated by the presence of entanglement in the Peres-Mermin magic square (Fig. 1). Another curious feature of the Peres-Mermin argument is the appearance of bases that are not merely entangled but, in fact, fully entangled, i.e. these bases contain no product projections. Is there, then, an even stronger requirement on the entanglement needed for a multiqubit KS theorem, i.e. just as the presence of entangled projections in a KS set is generic and not particular to the Peres-Mermin case, must a multiqubit KS set always contain fully entangled bases? Here we rule out this possibility by explicitly constructing a two-qubit KS set that does not require fully entangled bases, i.e. every basis contains at least one product ray.
Our construction makes use of Peres' 33-ray proof of the KS theorem in $\mathbb{C}^3$ which, when represented as a contextuality scenario, actually requires 57 rays (since the proof makes use of the orthogonality of certain pairs of the 33 rays for which the final orthogonal ray is missing). We construct a KS set by embedding two copies of the 57 Peres rays in two different three-dimensional subspaces of $\mathbb{C}^2 \otimes \mathbb{C}^2$.
Denote by $|\phi_j\rangle$, for integers $1 \le j \le 57$, the rays in the Peres 57-ray contextuality scenario. The rays can be carved up into 40 distinct orthonormal bases.
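We will now embed Peres' three-dimensional rays in the three-dimensional subspace of the two-qubit Hilbert space orthogonal to the ray $|00\rangle$. Given a ray $|\phi_j\rangle$, we denote its image under this embedding by $|\phi^{00}_j\rangle$, and for every basis $\{|\phi_a\rangle, |\phi_b\rangle, |\phi_c\rangle\}$ of Peres rays we add the basis $\{|00\rangle, |\phi^{00}_a\rangle, |\phi^{00}_b\rangle, |\phi^{00}_c\rangle\}$ to our hypergraph. We now perform the analogous operation in the subspace orthogonal to $|01\rangle$, but we first transform Peres' rays by a unitary matrix $U$ […], where $\tau_\pm = \left(2 - \sqrt{2} \pm \sqrt{6}\right)/2$, and denote the new set of rays $|U\phi_j\rangle$. Note that the orthogonality relations between the rays are preserved under this transformation. Explicitly, we add the basis $\{|01\rangle, |U\phi^{01}_a\rangle, |U\phi^{01}_b\rangle, |U\phi^{01}_c\rangle\}$ to our hypergraph for every basis $\{|\phi_a\rangle, |\phi_b\rangle, |\phi_c\rangle\}$, where $|U\phi^{01}_j\rangle$ analogously denotes the embedding of $|U\phi_j\rangle$ in the subspace orthogonal to $|01\rangle$.

The unitary $U$ has been chosen such that $\langle\phi^{00}_j|U\phi^{01}_k\rangle \neq 0$ for all entangled $|\phi^{00}_j\rangle$ and $|U\phi^{01}_k\rangle$. Since each of the two embedded copies spans only a three-dimensional subspace, a basis of four mutually orthogonal entangled rays would have to contain rays from both copies, which this condition forbids. It follows that the entangled rays of the hypergraph cannot be formed into a basis, i.e. there are no fully entangled bases in our construction. This fact can be verified by consulting the Mathematica notebook available at Ref. [46].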
The contextuality scenario can now be seen to be KS-uncolourable as follows. Any KS-colouring, $c$, of our hypergraph should assign zero to at least one of the rays $|00\rangle$ and $|01\rangle$, since they appear together in a hyperedge: the hyperedge consisting of $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$, which is one of the bases given by Eq. (38). If $|00\rangle$ is assigned zero by the colouring, $c$, then the assignments to the rays $|\phi^{00}_j\rangle$ would constitute a KS-colouring of the Peres hypergraph, which does not exist. Similarly, if $|01\rangle$ is assigned zero in the colouring, $c$, then the assignments to the rays $|U\phi^{01}_j\rangle$ would also constitute a KS-colouring of the Peres hypergraph. Hence such a colouring, $c$, cannot exist.
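The notions of KS-uncolourability and of a fully entangled basis can be illustrated on a much smaller two-qubit KS set. The sketch below is an assumption-laden example: it uses the well-known 18-vector, 9-basis set in $\mathbb{C}^4$ usually attributed to Cabello, Estebaranz and García-Alcaine, not the 57-ray embedding constructed here, and it identifies $\mathbb{C}^4$ with $\mathbb{C}^2\otimes\mathbb{C}^2$ in the computational basis.

```python
# Unnormalised rays of the 18-vector set, grouped into its 9 orthogonal bases.
BASES = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]

rays = sorted({v for basis in BASES for v in basis})
index = {v: i for i, v in enumerate(rays)}
edge_masks = [sum(1 << index[v] for v in basis) for basis in BASES]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ks_colourings(n_rays, masks):
    """All 0/1 assignments (as bitmasks) with exactly one 1 in every basis."""
    return [m for m in range(1 << n_rays)
            if all(bin(m & e).count("1") == 1 for e in masks)]

def is_entangled(v):
    """Two-qubit ray (a, b, c, d) is a product ray iff a*d - b*c == 0."""
    return v[0] * v[3] - v[1] * v[2] != 0

bases_are_orthogonal = all(
    dot(u, v) == 0 for basis in BASES for u in basis for v in basis if u != v)
colourings = ks_colourings(len(rays), edge_masks)
entangled_rays = [v for v in rays if is_entangled(v)]
fully_entangled_bases = [b for b in BASES if all(is_entangled(v) for v in b)]
```

The brute-force search finds no KS-colouring. Under this identification the set contains six entangled rays and one fully entangled basis, in contrast with the construction above, in which every basis contains at least one product ray.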

Unentangled Kochen-Specker sets
Between the case of multiqubit systems (wherein each subsystem has dimension two) and the case of multiqudit systems wherein each subsystem has dimension at least three, we have the possibility of multiqudit systems consisting of both qubits and higher-dimensional qudits. For these systems, Wallach showed that unentangled projections are still insufficient to yield Gleason's theorem [27]. Is it, however, possible to obtain the KS theorem with unentangled projections in this case, unlike the multiqubit case? In this section, we show that this is indeed possible. In fact, the presence of just one qutrit in an otherwise multiqubit system is enough to allow constructions of KS sets with unentangled projections. Our argument, not surprisingly, relies on the fact that a single qutrit admits KS sets on its own [12].

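Theorem 5. There exists a KS set consisting entirely of product rays in any separable Hilbert space $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$ where $\dim(\mathcal{H}_j) \ge 3$ for some $1 \le j \le n$.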
Here we give an idea of the proof with an example and give the general proof in Appendix C. We will show that if there were no KS set consisting of product rays in $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$, i.e. if there existed a KS-colouring $c$ on the product rays in $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$, then we would be able to use $c$ to define a KS-colouring $c'$ on any set of bases of $\mathcal{H}_j$, contradicting the KS theorem. For example, assume $c$ is a KS-colouring of the product rays in […]

Note that the proof of Theorem 5 in Appendix C actually shows the stronger result that the hypergraph of product rays of $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$ with only the hyperedges deriving from direct product bases is also KS-uncolourable. A direct product basis [47] is a basis formed by taking the elementwise tensor product of a basis for each subsystem $\mathcal{H}_j$, i.e. a basis $\{|\psi^1_1\rangle|\psi^2_1\rangle\cdots|\psi^n_1\rangle, |\psi^1_2\rangle|\psi^2_1\rangle\cdots|\psi^n_1\rangle, \ldots, |\psi^1_{d_1}\rangle|\psi^2_{d_2}\rangle\cdots|\psi^n_{d_n}\rangle\}$, where $\{|\psi^j_k\rangle\}_{k=1}^{d_j}$ is an orthonormal basis of $\mathcal{H}_j$ and $d_j = \dim(\mathcal{H}_j)$.

The contrast between the impossibility of Gleason's theorem and the possibility of the KS theorem with unentangled measurements in the case of composite systems that contain both qubits and higher-dimensional qudits may seem surprising at first glance. To gain some intuition for this contrast, consider two important facts: first, very simply, Gleason's theorem implies the KS theorem, but not conversely. Second, in more depth, the existence of a KS theorem but the lack of a corresponding Gleason's theorem reflects the fact that the former is a no-go theorem (ruling out certain types of probabilistic assignments) while the latter is a "go theorem" (specifying the allowed probabilistic assignments).
Consider, for example, a qubit-qutrit system. A single qubit does not permit a proof of Gleason's theorem because the structure of projective measurements on a qubit allows for many more probabilistic assignments than are dictated by the Born rule. Composing the qubit with a qutrit and considering unentangled measurements on the pair does not rule out these non-quantum assignments on the qubit, so Gleason's theorem continues to fail. On the other hand, a single qutrit admits a proof of the KS theorem and this no-go statement goes through even if we compose it with a qubit.
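The failure of Gleason's theorem for a single qubit can be seen with a simple non-Born assignment. The sketch below is illustrative only: the function $f(\vec{n}) = (1 + n_z^3)/2$ on Bloch vectors is a hypothetical example, chosen because it satisfies $f(\vec{n}) + f(-\vec{n}) = 1$ on every qubit basis without being of the Born form $(1 + \vec{r}\cdot\vec{n})/2$.

```python
import math
import random

def f(n):
    """Hypothetical probability assigned to the projector with Bloch vector n."""
    return (1 + n[2] ** 3) / 2

def random_unit_vector(rng):
    while True:
        v = [rng.gauss(0, 1) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]

rng = random.Random(0)
samples = [random_unit_vector(rng) for _ in range(1000)]

# Every qubit basis is an antipodal pair {n, -n}: the assignment is normalised.
normalised = all(abs(f(n) + f([-c for c in n]) - 1) < 1e-12 for n in samples)
in_range = all(0 <= f(n) <= 1 for n in samples)

# If f came from the Born rule, f(n) = (1 + r.n)/2 for a fixed Bloch vector r,
# which can be read off from the three coordinate axes.
axes = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
r = [2 * f(e) - 1 for e in axes]

test_direction = [math.sqrt(3) / 2, 0.0, 0.5]
born_prediction = (1 + sum(ri * ni for ri, ni in zip(r, test_direction))) / 2
assignment_value = f(test_direction)
```

The assignment is a valid probabilistic model on all qubit bases, yet it disagrees with the only Born-rule candidate at the test direction (0.5625 versus 0.75).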

Implications for the role of contextuality in quantum computation
The role of contextuality within the paradigm of quantum computation with state injection (QCSI) has been a subject of active research in recent years [5,13]. 11 This approach to quantum computation relies on lifting stabiliser quantum circuits, which cannot implement universal quantum computation, to universality via the injection of non-stabiliser states called magic states. It has been shown that, in the case of odd-prime dimensional qudit (or, quopit [13]) circuits, contextuality is a necessary resource for quantum computation, i.e. the magic states needed for universality must exhibit contextuality with respect to stabiliser measurements [5]. Thus, if a state admits a KS-noncontextual ontological model with the stabiliser measurements, it cannot promote the circuit to universality. The converse claim, that (KS-)contextuality is sufficient for universal quantum computation, is conjectured but unproven [5].
The multiqubit case (i.e. the even-prime dimensional case) has, however, been a hurdle in interpreting contextuality of magic states as a resource for quantum computation [5]. The multiqubit stabiliser subtheory is classically efficiently simulable [49,50] despite the presence of (state-independent) KS-contextuality in its observables (e.g. see Fig. 1). The fact that any state, stabiliser or not, exhibits contextuality with respect to such observables means that there is nothing special about the contextuality of magic states that renders contextuality a necessary resource for universal quantum computation. While magic states are still a necessary resource, their contextuality does not single them out (unlike the quopit case) since all states can exhibit contextuality. 12 To overcome this hurdle, restricted QCSI schemes have been proposed [13] which restore the status of contextuality as a resource for quantum computation.
In Ref. [13], such a restricted QCSI scheme $M_O$ is required to satisfy:
(C1) Resource character. There exists a quantum state that does not exhibit contextuality with respect to measurements available in $M_O$.
(C2) Tomographic completeness. For any state $\rho$, the expectation value of any Pauli observable can be inferred via the allowed operations of the scheme.
The requirement (C1) means that the measurements in the scheme cannot exhibit state-independent contextuality. It follows from Theorem 3 of the present manuscript that a sufficient condition for satisfying requirement (C1) is that every measurement in the scheme be unentangled. Specifically, the scheme $M_O$ prescribes the sets of observables in the scheme that are jointly measurable. If none of these joint measurements requires a measurement in an entangled basis then (C1) is satisfied.
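Whether a joint measurement requires an entangled basis can be tested numerically for two-qubit Pauli contexts. The sketch below is an assumption-based illustration: the two contexts are taken from the Peres-Mermin square rather than from any scheme of Ref. [13], and the helper `joint_eigenbasis` assumes two commuting $\pm 1$-valued observables whose joint spectrum is non-degenerate.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def joint_eigenbasis(A, B):
    """Joint eigenbasis of commuting +/-1-valued observables A and B.
    A + 2B has the distinct eigenvalues +/-1 +/-2 when the joint spectrum
    is non-degenerate, so its eigenvectors are the joint eigenvectors."""
    _, vectors = np.linalg.eigh(A + 2 * B)
    return [vectors[:, k] for k in range(vectors.shape[1])]

def is_entangled(vec, tol=1e-9):
    """Two-qubit pure state is entangled iff its coefficient matrix has det != 0."""
    return abs(np.linalg.det(vec.reshape(2, 2))) > tol

# Context {X(x)X, Z(x)Z}: joint eigenbasis is the Bell basis.
bell_context = joint_eigenbasis(np.kron(X, X), np.kron(Z, Z))
# Context {X(x)I, I(x)X}: joint eigenbasis is a product basis.
product_context = joint_eigenbasis(np.kron(X, I2), np.kron(I2, X))

bell_context_entangled = [is_entangled(v) for v in bell_context]
product_context_entangled = [is_entangled(v) for v in product_context]
```

In this example the first context would violate the sufficient condition above, while the second satisfies it.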
The schemes proposed in Ref. [13] satisfy exactly this condition; they contain no entangled measurements. Thus, all the contextuality in these schemes derives from entanglement of the injected state. Furthermore, it follows from Theorem 4 that in order for the injected state to promote such a scheme to universality it must be capable of violating a Bell inequality with some local projective measurements.

Conclusions
In this work we have demonstrated the necessity of entanglement in proofs of the KS theorem for systems of multiple qubits, i.e., KS-contextuality necessitates not only incompatibility but also entanglement in multiqubit systems.
On the one hand, we showed KS sets for multiqubit systems must contain entangled rays/projections. It follows that any logical proof of the KS theorem for multiqubit systems relies upon entanglement in the measurements, just like in the case of the Peres-Mermin square (Fig. 1). However, unlike the Peres-Mermin square, multiqubit proofs of KS-contextuality do not, in general, require measurements in fully entangled bases.
On the other hand, the KS-noncontextual ontological model defined in Sec. 4 allows us to go further and also make statements about statistical (and state-dependent) proofs of the KS theorem, i.e. those proofs that are in the style of Klyachko et al. [20]. Specifically, this model can be easily extended to include unentangled projective measurements and separable states of a multiqubit system. Thus, any statistical proof of the KS theorem for such a system must employ entanglement, either in the state or in the measurements (or both). For example, proofs of Bell's theorem give rise to statistical proofs of the KS theorem in which the measurements are unentangled but the state is necessarily entangled [51,33].
Our results also allow us to make the following comparisons with other forms of nonclassicality.
Firstly, it follows from our results that the nonclassicality present in unentangled measurements in the form of 'nonlocality without entanglement' [28] is insufficient to witness the (KS-)contextuality of multiqubit systems.
Secondly, Bell's theorem follows from unentangled measurements and requires an entangled state (although entanglement is not sufficient [52]). Similarly, it follows from the model in Sec. 4 that statistical proofs of the KS theorem employing unentangled measurements also require entangled states for multiqubit systems. Moreover, in Theorem 4, we have shown that the entangled states that violate a Bell inequality with local projective measurements are exactly those which can yield a finite statistical proof of the KS theorem, where by 'finite' we mean that the proof uses a finite set of measurements.
Finally, our results clarify the connection between Gleason's theorem and the KS theorem with respect to entanglement. While both theorems hold in dimensions greater than two and fail to hold in dimension two, their behaviour with respect to entanglement differs in multiqudit systems. In any multiqudit system that contains both a subsystem of dimension two and another of dimension at least three, the two theorems diverge. Unentangled measurements are sufficient for a proof of the KS theorem but not Gleason's theorem for such multiqudit systems (Fig. 2).
We have also discussed the implications of our results for the program of understanding the role of contextuality in quantum computation. We find that the assumptions underlying some previously proposed QCSI schemes with qubits [13] become more intuitive in view of our results.
The questions we have raised and addressed in this paper have implications for both fundamental and applied aspects of quantum theory. Further development of the applied aspects, particularly with respect to quantum computation, will be taken up in future work.

[…] 754510. VJW acknowledges support from the Government of Spain (FIS2020-TRANQI […]

A Traditional KS
A traditional proof of the KS theorem comprises a set of self-adjoint operators on a Hilbert space that lead to a contradiction when one attempts to find a valuation (see Def. 3 below) on all such self-adjoint operators [12,53]. A subset of these operators that commute pairwise is known as a context, since all the observables in the subset are jointly measurable. In finite dimensions, and in particular for multiqubit systems, the operators in a context have a simultaneous orthonormal eigenbasis, i.e. an orthonormal basis in which every vector is an eigenvector of every operator in the context. In this appendix we will describe the relationship between the existence of valuations and KS-colourings. Thus, we demonstrate how Theorem 3 implies that for multiqubit systems a set of operators leading to a traditional proof of the KS theorem must contain a context for which the simultaneous orthogonal eigenbasis contains entangled vectors.
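Let $L_{sa}(\mathcal{H})$ denote the self-adjoint operators of a separable Hilbert space $\mathcal{H}$.
Definition 3. A valuation is a map $v : L_{sa}(\mathcal{H}) \to \mathbb{R}$ satisfying (SPEC) $v(A) \in \sigma(A)$ for all $A \in L_{sa}(\mathcal{H})$, where $\sigma(A)$ denotes the spectrum of $A$, and (FUNC) $v(g(A)) = g(v(A))$ for all Borel functions $g : \mathbb{R} \to \mathbb{R}$.
The requirement (FUNC) implies, for example, that for commuting operators $A$ and $B$ we have $v(AB) = v(A)v(B)$ and $v(aA + bB) = av(A) + bv(B)$ for $a, b \in \mathbb{R}$. Therefore, if there did exist a valuation, $v$, on $L_{sa}(\mathcal{H})$ it would assign values zero or one to each rank-one projection and satisfy $v(\Pi_1) + v(\Pi_2) + \ldots = 1$ for sets of orthogonal projections $\{\Pi_1, \Pi_2, \ldots\}$ summing to the identity. In other words, the valuation would define a KS-colouring when restricted to the rank-one projections of the Hilbert space. The non-existence of such a colouring, stated in Theorem 1, therefore implies the non-existence of a valuation.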
In the case of multiqubit product rays we have found that KS-colourings do, however, exist. We may use these colourings to define valuations on certain subsets of self-adjoint operators.
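One natural way to turn a colouring into a valuation is to set $v(A) = \sum_j \lambda_j c(\Pi_j)$ on operators diagonal in the coloured basis. The sketch below checks (SPEC) and instances of (FUNC) in a single two-qubit product context. The chosen basis, colouring and operators are illustrative assumptions.

```python
import numpy as np

I2 = np.eye(2)
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

# Product basis |00>, |01>, |10>, |11> and a colouring selecting |01>.
basis = [np.eye(4)[:, j] for j in range(4)]
colouring = [0, 1, 0, 0]

def valuation(A):
    """v(A) = sum_j lambda_j c(Pi_j), assuming A is diagonal in `basis`."""
    eigenvalues = [float(b @ A @ b) for b in basis]
    return sum(lam * c for lam, c in zip(eigenvalues, colouring))

ZI = np.kron(Z, I2)
IZ = np.kron(I2, Z)
A = ZI + 2 * IZ  # eigenvalues 3, -1, 1, -3 on the product basis

spec_holds = any(abs(valuation(A) - lam) < 1e-12 for lam in np.linalg.eigvalsh(A))
func_square = abs(valuation(A @ A) - valuation(A) ** 2) < 1e-12
func_product = abs(valuation(ZI @ IZ) - valuation(ZI) * valuation(IZ)) < 1e-12
func_linear = abs(valuation(2 * ZI + 3 * IZ)
                  - (2 * valuation(ZI) + 3 * valuation(IZ))) < 1e-12
```

The valuation simply reads off the eigenvalue of each operator on the ray that the colouring selects, which is why both principles hold within the context.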
We define a traditional proof of the KS theorem as a set $\mathcal{A}$ of self-adjoint operators such that there does not exist a map $v$ on $\mathcal{A}$ satisfying (SPEC) and (FUNC), where (FUNC) is limited to Borel functions $g$ such that $g(A) \in \mathcal{A}$.$^{13}$ We find that such a set $\mathcal{A}$ must contain contexts for which the simultaneous orthogonal eigenbasis contains entangled vectors in multiqubit systems.
Lemma 3. Consider a set of self-adjoint operators $\mathcal{A}$ on $(\mathbb{C}^2)^{\otimes n}$ such that the simultaneous eigenbases for each context in $\mathcal{A}$ consist entirely of product vectors. Then $\mathcal{A}$ does not form a traditional proof of the KS theorem.
Proof. By Theorem 3, there exists a KS-colouring, $c$, on the hypergraph $H$ generated by the simultaneous eigenbases of the contexts in $\mathcal{A}$. As in the treatment of Gleason's theorem in Sec. 2, this colouring can be equivalently defined on the rank-one projections corresponding to each ray and extended to higher rank projections via $c(\Pi_1 + \Pi_2 + \cdots) \equiv c(\Pi_1) + c(\Pi_2) + \cdots$ for mutually orthogonal sets of projections $\{\Pi_1, \Pi_2, \ldots\}$. Given $A \in \mathcal{A}$, let $A = \sum_j \lambda_j \Pi_j$ be its spectral decomposition, where $\Pi_j$ are orthogonal projections and $\lambda_j$ are the eigenvalues of $A$. We will show the map $v(A) = \sum_j \lambda_j c(\Pi_j)$ satisfies (SPEC) and (FUNC) for all $A \in \mathcal{A}$.
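For a given operator $A \in \mathcal{A}$, the colouring $c$ takes value one on exactly one of the projections $\Pi_j$, say $\Pi_k$, and zero on the rest. It follows that $v$ satisfies the (SPEC) principle, since $v(A) = \lambda_k \in \sigma(A)$. For any Borel function $g$ such that $g(A) \in \mathcal{A}$ we have $g(A) = \sum_j g(\lambda_j)\Pi_j$ and hence $v(g(A)) = \sum_j g(\lambda_j) c(\Pi_j) = g(\lambda_k)$ for some $\lambda_k \in \sigma(A)$. And further, $g(v(A)) = g(\lambda_k)$ for the same $k$. Therefore, $v$ also satisfies the (FUNC) principle.

B Proof of Theorem 4
Theorem 4. Any multiqubit density operator, $\rho$, that can yield a statistical proof of the KS theorem with a finite set of unentangled projective measurements can violate a Bell inequality with local projective measurements.
Proof. Let $S = \{|\psi_j\rangle = |\psi^1_j\rangle \otimes \cdots \otimes |\psi^n_j\rangle \mid 1 \le j \le m\}$ be a set of $m$ product rays in $(\mathbb{C}^2)^{\otimes n}$, and let $\rho$ be a density operator on $(\mathbb{C}^2)^{\otimes n}$ that yields a statistical proof of the KS theorem with the hypergraph, $H$, generated by $S$ and all the bases contained in $S$. Explicitly, the hypergraph $H$ has $m$ vertices, $v_j$ for $1 \le j \le m$, such that $\{v_j \mid j \in J\} \in E(H)$ if and only if $\{|\psi_j\rangle \mid j \in J\}$ is an orthonormal basis, where $J \subset \{1, \ldots, m\}$ is some indexing set. The non-classical quantum model on $H$ is given by $p^S_\rho(v_j) = \langle\psi_j|\rho|\psi_j\rangle$. We will show that $\rho$ necessarily violates a Bell inequality.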
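Consider the set $S_r$ of distinct local rays of the $r$-th system that are pairwise nonorthogonal. Explicitly, $S_r = \{|\psi^r_j\rangle \mid j \in J_r\}$, where $j \in J_r \subseteq \{1, \ldots, m\}$ if and only if $|\psi^r_j\rangle \neq |\psi^r_k\rangle$ and $\langle\psi^r_j|\psi^r_k\rangle \neq 0$ for all $k < j$. Now we will relabel the states in $S_r$ as follows: $S_r = \{|0^r_k\rangle \mid 1 \le k \le |S_r|\}$. Let $B(S)$ be the $n$-party Bell scenario in which the $r$-th party has $|S_r|$ binary measurement settings. Consider the quantum strategy in this scenario in which the $r$-th party's $k$-th measurement is given by $\{|0^r_k\rangle\langle 0^r_k|, |1^r_k\rangle\langle 1^r_k|\}$, (50) where $|1^r_k\rangle \in \mathbb{C}^2$ is the ray orthogonal to $|0^r_k\rangle$. For $a_r \in \{0,1\}$, we denote the $n$-qubit ray $\bigotimes_{r=1}^{n} |(a_r)^r_{x_r}\rangle$ by $|(\mathbf{a}|\mathbf{x})\rangle$, in correspondence with its outcome in the Bell scenario. Note that the global measurements, $\{|(\mathbf{a}|\mathbf{x})\rangle\}_{\mathbf{a}}$, in this Bell experiment contain all the rays in $S$. Let $\bar{S}$ denote the extended set (compared to $S$) of rays $\{|(\mathbf{a}|\mathbf{x})\rangle\}_{\mathbf{a},\mathbf{x}}$.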
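The construction of the local settings and of the extended set of rays can be sketched as follows, using the example $S = \{|00\rangle, |{+}1\rangle, |0{+}\rangle\}$. The function names (`local_settings`, `orthogonal_ray`, `extend`) are hypothetical helpers introduced for illustration.

```python
import itertools
import numpy as np

s = 1 / np.sqrt(2)
KET = {
    "0": np.array([1.0, 0.0]),
    "1": np.array([0.0, 1.0]),
    "+": np.array([s, s]),
    "-": np.array([s, -s]),
}

def local_settings(local_rays, tol=1e-9):
    """Keep a local ray only if it is neither equal nor orthogonal
    to any ray kept earlier: one representative per local basis."""
    settings = []
    for v in local_rays:
        overlaps = [abs(np.vdot(u, v)) for u in settings]
        if all(tol < o < 1 - tol for o in overlaps):
            settings.append(v)
    return settings

def orthogonal_ray(v):
    """The unique qubit ray orthogonal to v."""
    return np.array([-np.conj(v[1]), np.conj(v[0])])

def extend(product_rays):
    """Return, per party, both outcomes of every setting, and all their products."""
    n = len(product_rays[0])
    per_party = []
    for r in range(n):
        settings = local_settings([ray[r] for ray in product_rays])
        per_party.append([w for v in settings for w in (v, orthogonal_ray(v))])
    extended = []
    for combo in itertools.product(*per_party):
        vec = combo[0]
        for w in combo[1:]:
            vec = np.kron(vec, w)
        extended.append(vec)
    return per_party, extended

# Example set S = {|00>, |+1>, |0+>}, each ray given by its local factors.
S = [(KET["0"], KET["0"]), (KET["+"], KET["1"]), (KET["0"], KET["+"])]
per_party, S_bar = extend(S)
```

For this example each party ends up with two settings (the computational and the $\pm$ basis), and the extended set contains the sixteen rays $|ab\rangle$ with $a, b \in \{0, 1, +, -\}$.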
Denote by $H'$ the contextuality scenario corresponding to the Bell scenario $B(S)$. Recall that each hyperedge of $H'$ is given by the outcomes of an adaptive measurement. Equivalently, under the assignment of the rays from $\bar{S}$ to each vertex, each hyperedge corresponds to a projective measurement that can be performed via LOCC. In other words, the hypergraph $H'$ is generated by $\bar{S}$ and those orthonormal bases that can be implemented via LOCC, rather than all possible orthonormal bases. It follows that, although the vertices of $H'$ are a superset of those of the original contextuality scenario, $H$, there could be hyperedges in $H$ which are not contained in $E(H')$ since they arise from "non-local" bases in $H$, such as the basis in Eq. (1). Denote by $G$ the contextuality scenario generated by the rays $\bar{S}$ and all the bases, which gives $V(G) = V(H')$ and $E(G) \supseteq E(H') \cup E(H)$.
The entangled state $\rho$ combined with the assignment of the ray $|(\mathbf{a}|\mathbf{x})\rangle$ to each vertex $\mathbf{a}|\mathbf{x}$ gives a quantum model $p_\rho$ on both the hypergraphs, $G$ and $H'$. Further, we have that $p_\rho$ is a non-classical model on $G$ since it is a non-classical model on a subset of $G$, namely, $H$.$^{14}$ Now, we will show that the classical models on $G$ and $H'$ coincide exactly, meaning that $p_\rho$ is also a non-classical model on $H'$. Given the bijection between classical models on $H'$ and local behaviours in the Bell scenario $B(S)$, it follows that $\rho$ violates a Bell inequality.

$^{14}$ The hypergraph $G$ can only add further constraints on the classical models achievable on $H$, so any non-classical model on $H$ will necessarily be non-classical on $G$. Or, to take the contrapositive, if a model is classical on $G$ then it must be classical on $H$ because every deterministic model on $G$ will also be a deterministic model on $H$.

Clearly, a classical model on $G$ is a classical model on $H'$ since there are fewer constraints on the classical models on $H'$ due to the reduced set of hyperedges. We now show the converse, using an argument from [45, Theorem 2]. Given an edge $e = \{(\mathbf{a}_j|\mathbf{x}_j) \mid 1 \le j \le 2^n\}$ of $G$ we have that for any pair of distinct vertices, $\mathbf{a}_j|\mathbf{x}_j$ and $\mathbf{a}_k|\mathbf{x}_k$, in the edge the corresponding pair of rays, $|(\mathbf{a}_j|\mathbf{x}_j)\rangle$ and $|(\mathbf{a}_k|\mathbf{x}_k)\rangle$, are orthogonal by the definition of $G$. Since these rays are $n$-qubit product rays, we find that for at least one of the single-qubit subsystems, the qubit ray in $|(\mathbf{a}_j|\mathbf{x}_j)\rangle$ must be orthogonal to the qubit ray of the same subsystem of $|(\mathbf{a}_k|\mathbf{x}_k)\rangle$. By construction, for each qubit ray that occurs in a given subsystem the orthogonal ray only occurs as the other outcome of the same measurement setting, cf. Eq. (50). If this orthogonality occurs in the $r$-th subsystem, explicitly, we find $(x_r)_j = (x_r)_k$ and $(a_r)_j \neq (a_r)_k$. Thus, we find in the two events $\mathbf{a}_j|\mathbf{x}_j$ and $\mathbf{a}_k|\mathbf{x}_k$ of the Bell scenario, party $r$ has the same measurement setting $(x_r)_j = (x_r)_k$ but observes a different outcome, either $(a_r)_j$ or $(a_r)_k$, i.e. the two events are locally orthogonal in the terminology of Ref. [54].

It follows that for any local deterministic behaviour in the Bell scenario (i.e. classical model on $H'$), at most one of the events $\mathbf{a}_j|\mathbf{x}_j$ or $\mathbf{a}_k|\mathbf{x}_k$ occurs with probability one, whilst the other must occur with probability zero. Since this relationship holds between any pair of events in the edge $e \in E(G) \supseteq E(H')$, we find that a local deterministic behaviour on $H'$ assigns one to at most one of these outcomes. Hence, for each $e \in E(G) \setminus E(H')$, we have the following Bell inequality for $B(S)$ (or, equivalently, a KS-noncontextuality inequality on the contextuality scenario $H'$): $\sum_{\mathbf{a}|\mathbf{x} \in e} p_L(\mathbf{a}|\mathbf{x}) \le 1$, (51) where $p_L$ is a local behaviour (or, equivalently, a classical model on $H'$). Finally, we show that the Bell inequality of Eq. (51) is saturated by all local behaviours in $B(S)$. We do so by showing that an internal point of the polytope of local behaviours saturates the inequality and therefore the inequality is trivial, in the sense that it is exactly saturated by all models in the affine span of the local polytope. Observe that the uniform behaviour $p_U(v) = 1/2^n$ for all $v \in V(H')$ is an internal point of the local polytope and saturates the inequality (51), since each edge $e$ contains $2^n$ vertices.
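Any internal behaviour $p_I(\mathbf{a}|\mathbf{x})$ of the local polytope may be expressed as a convex combination $p_I(\mathbf{a}|\mathbf{x}) = \omega p_D(\mathbf{a}|\mathbf{x}) + (1 - \omega) p_\delta(\mathbf{a}|\mathbf{x})$, (52) of any local deterministic behaviour $p_D(\mathbf{a}|\mathbf{x})$ (a vertex of the local polytope) and some other behaviour on the boundary of the local polytope $p_\delta(\mathbf{a}|\mathbf{x})$, where $0 \le \omega \le 1$. Thus, taking $p_I = p_U$, for which $\omega > 0$, we have $1 = \sum_{j=1}^{2^n} p_U(\mathbf{a}_j|\mathbf{x}_j) = \omega \sum_{j=1}^{2^n} p_D(\mathbf{a}_j|\mathbf{x}_j) + (1 - \omega) \sum_{j=1}^{2^n} p_\delta(\mathbf{a}_j|\mathbf{x}_j)$ and, therefore, since by the inequality (51) neither sum on the right-hand side can exceed one, we find $\sum_{j=1}^{2^n} p_D(\mathbf{a}_j|\mathbf{x}_j) = 1$.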
Note that any affine combination of local deterministic behaviours, and thus any nonsignalling behaviour [55, Corollary 2], also obtains the value one for this Bell expression. In particular, we have
$$\sum_{j=1}^{2^n} p_L(a_j|x_j) = 1,$$
for any local behaviour $p_L$ in $B(S)$.
We have shown that any local behaviour in $B(S)$, and thus any classical model $p_C$ on $H$, satisfies $\sum_{v \in e} p_C(v) = 1$ for all hyperedges $e \in E(G)$. Thus, the classical models on $H$ are exactly the classical models on $G$. It follows that $p_\rho$ is a non-classical quantum model on $H$ and therefore violates a Bell inequality.
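The saturation argument can be illustrated by exhaustive enumeration for $n = 2$. The sketch below assumes a two-setting, two-outcome scenario per party and a hypothetical edge `EDGE` of four pairwise locally orthogonal events that mixes two global settings, so it is not a single measurement context. The edge and all names are illustrative assumptions rather than objects defined in the text.

```python
from itertools import product

# Hypothetical edge for n = 2. Each event is (a, x), with a = (a_1, a_2) the
# outcomes and x = (x_1, x_2) the settings. The events mix the global settings
# (0, 0) and (0, 1), so the edge is not a single measurement context.
EDGE = [
    ((0, 0), (0, 0)),
    ((0, 1), (0, 0)),
    ((1, 0), (0, 1)),
    ((1, 1), (0, 1)),
]

def deterministic_behaviours(n):
    """All local deterministic strategies: one response function per party.

    A response function is a tuple f with f[x_r] = a_r for settings x_r in {0, 1}.
    """
    response_functions = list(product((0, 1), repeat=2))
    return product(response_functions, repeat=n)

def p_deterministic(strategy, event):
    """Probability (0 or 1) that the deterministic strategy produces the event."""
    a, x = event
    return int(all(strategy[r][x[r]] == a[r] for r in range(len(a))))

def p_uniform(event):
    """Uniform behaviour: every outcome string has probability 1/2^n."""
    n = len(event[0])
    return 1 / 2 ** n

def bell_value(prob, edge):
    """Left-hand side of the inequality: the sum of prob over the events of the edge."""
    return sum(prob(event) for event in edge)

# One Bell value per deterministic strategy (16 strategies for n = 2).
values = [
    bell_value(lambda ev, s=s: p_deterministic(s, ev), EDGE)
    for s in deterministic_behaviours(2)
]
print(values)
print(bell_value(p_uniform, EDGE))
```

Every deterministic strategy and the uniform behaviour give the value one on this edge, in line with the claim that the inequality is saturated throughout the local polytope.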
C Proof of Theorem 5

Theorem 5. There exists a KS set consisting entirely of product rays in any separable Hilbert space $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$ where $\dim(\mathcal{H}_j) \ge 3$ for some $1 \le j \le n$.
Proof. Let $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_{n-1}$ be some separable Hilbert space for which there exists a KS set of bases $V_k$, $1 \le k \le K$, consisting entirely of product vectors $v_l$, $1 \le l \le L$ (including the case $n = 2$, where we consider all vectors to be product). Let $W = \{w_j \mid 1 \le j \le J\}$ be a basis of a separable Hilbert space $\mathcal{H}_n$ and consider the product bases $W_k = \{w_j \otimes v_l \mid w_j \in W, v_l \in V_k\}$ of $\mathcal{H}_n \otimes \mathcal{H}$ for each $1 \le k \le K$. Now assume there is no unentangled KS theorem in $\mathcal{H}_n \otimes \mathcal{H}$ and, therefore, that there exists a KS-colouring $c$ of the bases $W_k$. Consider the map $c'(v_l) = \sum_j c(w_j \otimes v_l)$ on the elements of the bases $V_k$. We will show that this map is a KS-colouring of the bases $V_k$, and thus that the assumption that there is no unentangled KS theorem in $\mathcal{H}_n \otimes \mathcal{H}$ must be false.
Firstly, $c$ assigns one to at most one of the vectors in $\{w_j \otimes v_l \mid 1 \le j \le J\}$ and zero to the rest, since $c$ is a KS-colouring and the vectors are mutually orthogonal. Therefore $c'(v_l) \in \{0, 1\}$ for all $1 \le l \le L$. Secondly, if $\langle v_l, v_{l'} \rangle = 0$ then the vectors $\{w_j \otimes v_l, w_j \otimes v_{l'} \mid 1 \le j \le J\}$ are mutually orthogonal and $c$ assigns one to at most one of them. It follows that $c'(v_l) + c'(v_{l'}) \le 1$. Finally, for all $W_k$ we have $c(w \otimes v) = 1$ for some $w \in W$ and $v \in V_k$. Therefore, $c'(v) = \sum_j c(w_j \otimes v) = 1$ for some $v \in V_k$, for all $1 \le k \le K$.
Since there exists a KS set in any separable Hilbert space of dimension at least three, the desired result follows by induction (and the irrelevance of the order of the Hilbert spaces in the tensor product).
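The induced map $c'(v_l) = \sum_j c(w_j \otimes v_l)$ used in the proof can be exercised on a small colourable example. The sketch below is only an illustration under stated assumptions: the vectors form a toy set in $\mathbb{R}^3$ and $\mathbb{R}^2$ that is not a KS set (a KS set admits no colouring, so the map could not be demonstrated on one), a KS-colouring is taken to mean a 0/1 assignment that never gives one to two orthogonal vectors and gives one to exactly one vector of each basis, and all names are hypothetical.

```python
from itertools import product

# Toy data (assumption, not a KS set): two bases of R^3 sharing a vector,
# written as unnormalised integer vectors, and one basis W of R^2.
V_VECTORS = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1), (0, 1, -1)]
V_BASES = [(0, 1, 2), (0, 3, 4)]  # indices into V_VECTORS
W_VECTORS = [(1, 0), (0, 1)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# Product vectors w_j (x) v_l are labelled by index pairs (j, l).
PRODUCT_LABELS = [(j, l) for j in range(len(W_VECTORS)) for l in range(len(V_VECTORS))]

def orthogonal(p, q):
    """Uses <w_j (x) v_l, w_j' (x) v_l'> = <w_j, w_j'> <v_l, v_l'>."""
    (j, l), (jp, lp) = p, q
    return dot(W_VECTORS[j], W_VECTORS[jp]) * dot(V_VECTORS[l], V_VECTORS[lp]) == 0

def is_colouring_of_products(c):
    """No two orthogonal product vectors both get 1; each W_k gets exactly one 1."""
    for p, q in product(PRODUCT_LABELS, repeat=2):
        if orthogonal(p, q) and c[p] + c[q] > 1:
            return False
    return all(
        sum(c[(j, l)] for j in range(len(W_VECTORS)) for l in basis) == 1
        for basis in V_BASES
    )

def induced(c):
    """The map c'(v_l) = sum_j c(w_j (x) v_l) from the proof."""
    return [sum(c[(j, l)] for j in range(len(W_VECTORS))) for l in range(len(V_VECTORS))]

def is_colouring_of_v(cp):
    """The three properties verified in the proof, checked for the induced map."""
    indices = range(len(V_VECTORS))
    values_ok = all(x in (0, 1) for x in cp)
    orth_ok = all(
        cp[l] + cp[lp] <= 1
        for l in indices for lp in indices
        if l != lp and dot(V_VECTORS[l], V_VECTORS[lp]) == 0
    )
    bases_ok = all(sum(cp[l] for l in basis) == 1 for basis in V_BASES)
    return values_ok and orth_ok and bases_ok

# Enumerate every 0/1 assignment to the ten product vectors and keep the colourings.
colourings = []
for bits in product((0, 1), repeat=len(PRODUCT_LABELS)):
    c = dict(zip(PRODUCT_LABELS, bits))
    if is_colouring_of_products(c):
        colourings.append(c)

print(len(colourings) > 0, all(is_colouring_of_v(induced(c)) for c in colourings))
```

Exhaustive enumeration is feasible here because the toy set has only ten product vectors. In the proof itself the same map is applied to a hypothetical colouring of the product bases built from a KS set, which is what yields the contradiction.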