Hypergraph framework for irreducible noncontextuality inequalities from logical proofs of the Kochen-Specker theorem

The Kochen-Specker (KS) theorem reveals the inconsistency between quantum theory and any putative underlying model of it satisfying the constraint of KS-noncontextuality. A logical proof of the KS theorem is one that relies only on the compatibility relations amongst a set of projectors (a KS set) to witness this inconsistency. These compatibility relations can be represented by a hypergraph, referred to as a contextuality scenario. Here we consider contextuality scenarios that we term KS-uncolourable, e.g., those which appear in logical proofs of the KS theorem. We introduce a hypergraph framework to obtain noise-robust witnesses of contextuality from such scenarios. Our approach builds on the results of R. Kunjwal and R. W. Spekkens, Phys. Rev. Lett. 115, 110403 (2015), by providing new insights into the relationship between the structure of a contextuality scenario and the associated noise-robust noncontextuality inequalities that witness contextuality. The present work also forms a necessary counterpart to the framework presented in R. Kunjwal, Quantum 3, 184 (2019), which only applies to KS-colourable contextuality scenarios, i.e., those which do not admit logical proofs of the KS theorem but do admit statistical proofs. We rely on a single hypergraph invariant, defined in R. Kunjwal, Quantum 3, 184 (2019), that appears in our contextuality witnesses, namely, the weighted max-predictability. The present work can also be viewed as a study of this invariant. Significantly, unlike the case of R. Kunjwal, Quantum 3, 184 (2019), none of the graph invariants from the graph-theoretic framework for KS-contextuality due to Cabello, Severini, and Winter (the “CSW framework”, Phys. Rev. Lett. 112, 040401 (2014))

1 Introduction

Footnote 1: Note that, strictly speaking, the traditional assumption of KS-noncontextuality [1,12] can be applied to arbitrary measurements if one doesn't care to justify outcome determinism from noncontextuality à la Ref. [13], but even so, it fails to make sense as a notion of classicality for nonprojective measurements in quantum theory, e.g., for trivial POVMs. See Section I and Appendix A of Ref. [14] for a discussion of this pathology of KS-noncontextuality as a notion of classicality. Also see Refs. [15,16] for a more in-depth discussion of these issues, particularly the status of Fine's theorem [17] in the case of noncontextuality.

Figure 1: A schematic of the prepare-and-measure experiment: the source setting S ∈ S produces a source outcome $s \in V_S$ and a system prepared according to preparation $P_{[s|S]}$ that is then subjected to the measurement device with measurement setting M ∈ M, which then outputs a measurement outcome $m \in V_M$. The joint probability of source and measurement outcomes is given by $p(m, s|M, S)$. The source outcome s occurs with probability $p(s|S)$ and setting S prepares the ensemble $\{P_{[s|S]}, p(s|S)\}_{s \in V_S}$. Time goes up: we assume that future settings/outcomes do not influence past settings/outcomes, i.e., $p(m, s|M, S) = p(m|M, S, s)\,p(s|M, S) = p(m|M, S, s)\,p(s|S)$.

Operational theories
We will be concerned with prepare-and-measure experiments in our tests of noncontextuality: that is, we imagine a preparation device as a source of a system that is subjected (following its preparation) to a measurement procedure carried out by a measurement device (see Fig. 1).
The preparation device has many possible source settings S ∈ S, each S specifying a particular ensemble of preparation procedures labelled by (classical) source outcomes $s \in V_S$, where $V_S$ is the set of source outcomes for source setting S. We call [s|S] a source event. Hence, each preparation procedure is denoted by $P_{[s|S]}$, corresponding to the source event [s|S]: that is, the system is prepared according to $P_{[s|S]}$ with probability $p(s|S) \in [0, 1]$, where $\sum_{s \in V_S} p(s|S) = 1$, for every choice of setting S ∈ S. The ensemble of preparation procedures associated with the source setting S is then given by $\{P_{[s|S]}, p(s|S)\}_{s \in V_S}$.
Similarly, the measurement device has many possible measurement settings M ∈ M, each M specifying a particular measurement procedure with many possible measurement outcomes m ∈ V M , where V M is the set of values the measurement device can output when the measurement setting is M and a system prepared by some source is an input to the measurement device. We will use [m|M ] to denote the measurement event that the outcome m was witnessed for measurement setting M .
The joint probability of a particular outcome m for measurement setting M and a particular outcome s for source setting S, when the input to the measurement device is a system prepared according to procedure $P_{[s|S]}$, is given by $p(m, s|M, S) = p(m|M, S, s)\,p(s|S)$, where $p(m|M, S, s)$ is the conditional probability of outcome m for measurement M when the system is prepared according to procedure $P_{[s|S]}$ and $p(s|S)$ is the conditional probability that the system is indeed prepared according to $P_{[s|S]}$ for the source (setting) S.
An operational theory is therefore a specification of the triple (S, M, p), where S is the set of source settings in the operational theory, M is the set of measurement settings, and $p : V_M \times V_S \times M \times S \to [0, 1]$ is a function that specifies the joint probability $p(m, s|M, S)$ that a given source setting S ∈ S and measurement setting M ∈ M produce respective outcomes $s \in V_S$ and $m \in V_M$ when the system prepared by the preparation device is fed to the measurement device, and where $\sum_{m,s} p(m, s|M, S) = 1$ for all M ∈ M, S ∈ S.
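As a minimal sketch of the bookkeeping above, the following toy example builds the joint distribution $p(m, s|M, S) = p(m|M, S, s)\,p(s|S)$ for a single source setting and a single measurement setting and checks its normalization. The setting labels `"S"` and `"M"` and all numerical values are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as Fr

# Hypothetical toy data: one source setting "S" and one measurement
# setting "M", each with two outcomes. The numbers are illustrative only.
p_s = {("S", 0): Fr(1, 2), ("S", 1): Fr(1, 2)}           # p(s|S)
p_m_given = {                                            # p(m|M,S,s)
    ("M", "S", 0): {0: Fr(9, 10), 1: Fr(1, 10)},
    ("M", "S", 1): {0: Fr(1, 5), 1: Fr(4, 5)},
}

def joint(m, s, M, S):
    """Joint probability p(m,s|M,S) = p(m|M,S,s) * p(s|S)."""
    return p_m_given[(M, S, s)][m] * p_s[(S, s)]

# Normalization over all (m, s) pairs for the fixed pair of settings.
total = sum(joint(m, s, "M", "S") for m in (0, 1) for s in (0, 1))

# Summing over measurement outcomes recovers p(s|S).
marginal_s0 = sum(joint(m, 0, "M", "S") for m in (0, 1))
```

Exact rational arithmetic is used so that the normalization checks are equalities rather than floating-point comparisons.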

Ontological models
An ontological model of an operational theory seeks to provide an explanatory framework for its predictions, grounding them in intrinsic properties of physical systems independent of the operations an agent may implement on the system. All the physical properties of a system are presumed to be encoded in its ontic state λ ∈ Λ, where Λ is the set of all possible ontic states of the system. A source event [s|S] prepares the system in ontic state λ with probability $\mu(\lambda|S, s) \in [0, 1]$, where $\sum_{\lambda \in \Lambda} \mu(\lambda|S, s) = 1$. On measuring the system in ontic state λ, the measurement M produces outcome m with probability $\xi(m|M, \lambda) \in [0, 1]$, where $\sum_{m \in V_M} \xi(m|M, \lambda) = 1$ for all λ ∈ Λ. We then have:

$p(m|M, S, s) = \sum_{\lambda \in \Lambda} \xi(m|M, \lambda)\,\mu(\lambda|S, s)$. (1)

We can use Bayes' theorem to write $\mu(s|S, \lambda) = \mu(s, \lambda|S)/\mu(\lambda|S)$. Noting that $\mu(s, \lambda|S) = \mu(\lambda|S, s)\,p(s|S)$, we have

$p(m, s|M, S) = \sum_{\lambda \in \Lambda} \xi(m|M, \lambda)\,\mu(s|S, \lambda)\,\mu(\lambda|S)$, (2)

which describes how the operational joint probabilities of the prepare-and-measure experiment must be reproduced by the ontological model. Here $\xi(m|M, \lambda)$ is the predictive probability that a particular outcome m will occur for a given measurement setting M when the input ontic state is λ, while $\mu(s|S, \lambda)$ is the retrodictive probability that a particular outcome s occurred for a given source setting S which produced the ontic state λ. $\mu(\lambda|S)$ is the probability that λ was sampled by the source setting S at all, ignoring its outcomes $s \in V_S$.
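The Bayesian rewriting described above can be checked on a toy model. The sketch below assumes a hypothetical two-element ontic state space with made-up distributions, and verifies that the joint probability computed from the predictive form (ξ times µ(λ|S,s), then multiplied by p(s|S)) agrees with the one computed from the retrodictive form (ξ times µ(s|S,λ) times µ(λ|S)).

```python
from fractions import Fraction as Fr

# Hypothetical toy ontological model (all values are illustrative).
LAMBDA = (0, 1)                          # ontic state space
p_s = {0: Fr(1, 2), 1: Fr(1, 2)}         # p(s|S)
mu = {0: {0: Fr(3, 4), 1: Fr(1, 4)},     # mu[s][lam] = mu(lam|S,s)
      1: {0: Fr(1, 4), 1: Fr(3, 4)}}
xi = {0: {0: Fr(1), 1: Fr(0)},           # xi[lam][m] = xi(m|M,lam)
      1: {0: Fr(0), 1: Fr(1)}}

def p_cond(m, s):
    """p(m|M,S,s) as a sum over ontic states of xi * mu(lam|S,s)."""
    return sum(xi[lam][m] * mu[s][lam] for lam in LAMBDA)

def mu_lam(lam):
    """mu(lam|S): probability of lam, ignoring the source outcome."""
    return sum(mu[s][lam] * p_s[s] for s in p_s)

def mu_retro(s, lam):
    """Retrodictive probability mu(s|S,lam) via Bayes' theorem."""
    return mu[s][lam] * p_s[s] / mu_lam(lam)

def joint_predictive(m, s):
    return p_cond(m, s) * p_s[s]

def joint_retrodictive(m, s):
    return sum(xi[lam][m] * mu_retro(s, lam) * mu_lam(lam) for lam in LAMBDA)

agree = all(joint_predictive(m, s) == joint_retrodictive(m, s)
            for m in (0, 1) for s in (0, 1))
```

The division in `mu_retro` presupposes µ(λ|S) > 0 for every λ used, which holds for the values chosen here.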

Operational equivalences and Noncontextuality
The symbol "⊤" denotes coarse-graining over all outcomes, i.e., [⊤|S] is the source event that at least one outcome in $V_S$ occurred for source setting S. In this paper, we will only make use of such coarse-grained operational equivalences between source settings, that is, ones where we sum over the classical outcomes of the sources.
Note that, because of normalization, any two coarse-grained measurement settings M and M′ are always operationally equivalent, i.e., $\sum_{m \in V_M} p(m, s|M, S) = \sum_{m' \in V_{M'}} p(m', s|M', S) = p(s|S)$ for all [s|S]. It is only in the case of sources that the operational equivalence after coarse-graining is nontrivial, i.e., it needs to be verified experimentally.

Context
Any distinction between operationally equivalent experimental procedures (preparations or measurements) is called a context.

Noncontextuality
The assumption of noncontextuality requires operationally equivalent experimental procedures to be represented identically in the ontological model. That is, differences of context between operationally equivalent experimental procedures should be as irrelevant in the ontological model as they are in the operational theory. Indeed, this indifference to variations in context (that is, noncontextuality) in the ontological model is meant to account for the indifference to variations in context (that is, operational equivalence) that holds in the operational theory. Our goal is to put this hypothesis of noncontextuality to experimental test by figuring out the operational constraints (noncontextuality inequalities) that it imposes on the operational statistics.
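Thus, the assumption of preparation noncontextuality applied to operationally equivalent source events, $[s|S] \simeq [s'|S']$, reads $\mu(\lambda|S, s) = \mu(\lambda|S', s')$ for all λ ∈ Λ. Applied to the operational equivalence $[\top|S] \simeq [\top|S']$, preparation noncontextuality reads $\mu(\lambda|S) = \mu(\lambda|S')$ for all λ ∈ Λ, where $\mu(\lambda|S) \equiv \sum_{s \in V_S} \mu(\lambda|S, s)\,p(s|S)$. We will only make use of this type of preparation noncontextuality in this paper.
The assumption of measurement noncontextuality applied to operationally equivalent measurement events, $[m|M] \simeq [m'|M']$, reads $\xi(m|M, \lambda) = \xi(m'|M', \lambda)$ for all λ ∈ Λ.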

Contextuality scenarios and probabilistic models on them
In keeping with the definitions of Ref. [24], we introduce the following notions:
• Contextuality scenario: A contextuality scenario is a hypergraph H where the nodes of the hypergraph, $w \in W(H)$, denote measurement outcomes and the hyperedges, $f \in F(H) \subseteq 2^{W(H)}$, denote measurements, such that $\bigcup_{f \in F(H)} f = W(H)$. We will assume the set of nodes W(H) is finite and, therefore, so is the set of hyperedges F(H).
A node shared between multiple hyperedges represents a measurement outcome with multiple possible measurement contexts in which it can occur. This is the notion of a (measurement) context that is used in logical proofs of the Kochen-Specker theorem relying on KS-uncolourability [1,19]. We will be concerned with this notion of measurement context in this paper.
• n-hypercycle: An n-hypercycle is a collection of n nodes, $\{w_i\}_{i=1}^{n}$, such that consecutive nodes $w_i$ and $w_{i \oplus 1}$ appear in a hyperedge of H. Note that we have assumed addition modulo n, i.e., n ⊕ 1 = 1, while labelling the nodes. When the hypergraph H is a graph, i.e., every hyperedge f ∈ F(H) contains exactly two nodes from W(H), an n-hypercycle will be called an n-cycle.
• Probabilistic model: A probabilistic model on a contextuality scenario is an assignment of probabilities to the nodes of a hypergraph, $p : W(H) \to [0, 1]$, such that the hyperedges are normalized, i.e., $\sum_{w \in f} p(w) = 1$ for all f ∈ F(H). We denote the set of such general probabilistic models on H by G(H).
Viewed operationally, any probabilistic model on the contextuality scenario specifies the probabilities of measurement outcomes when the measurements are implemented on some preparation in an operational theory. A given operational theory may only allow a certain subset of all possible probabilistic models on a contextuality scenario when the measurements in the scenario are implemented on a preparation possible in the operational theory. Indeed, that is the premise of Ref. [24], where possible probabilistic models on a given contextuality scenario are classified as classical, quantum, or general probabilistic models. The full set of possible probabilistic models on the contextuality scenario, corresponding to a polytope, constitutes the set of general probabilistic models.
We will be interested in the polytope of general probabilistic models in this paper. In particular, we do not seek to classify probabilistic models on a contextuality scenario à la Acin, Fritz, Leverrier, and Sainz (AFLS) [24]. Instead, we will be interested in properties of these probabilistic models that become crucial only in the operational approach à la Spekkens [13], having no analogue in the AFLS framework. This is to be expected since the AFLS framework is a formalization of the Kochen-Specker paradigm and we seek to ask questions that necessitate a reformulation and extension of this paradigm à la Spekkens. For the case of statistical proofs of the Kochen-Specker theorem, this has been achieved in Refs. [10,14]. This paper seeks to achieve this for logical proofs of the Kochen-Specker theorem, based on the ideas conceptualized in Ref. [6]. Our goal here is to provide technical tools concerning the sorts of hypergraph properties (distinct from the ones discussed in, for example, Refs. [23,24]) that are relevant for the noise-robust noncontextuality inequalities we derive in the operational approach à la Spekkens. These hypergraph properties are easily captured in a new hypergraph invariant that we defined in Ref. [14] (the weighted max-predictability β(Γ, q) for a contextuality scenario Γ with hyperedges weighted by probabilities according to the distribution q) and which we will define in due course in this paper. We will use β(Γ, q) to obtain noise-robust noncontextuality inequalities arising from any contextuality scenario Γ yielding a logical proof of the KS theorem.
Note that the probability assigned to a measurement outcome (or node) by any probabilistic model on the hypergraph does not vary with the measurement context (or hyperedge) that the measurement outcome may be considered a part of: operationally, this means that the operational theories that lead to various probabilistic models on contextuality scenarios exhibit nontrivial operational equivalences between measurement outcomes of different measurements. These operational equivalences take the form of the same node being shared between two (or more) hyperedges and are represented in their entirety by the structure of the hypergraph denoting the contextuality scenario. The assumption of measurement noncontextuality (as we have defined it) will be applied to these operational equivalences implicit in the contextuality scenario.
A contextuality scenario which does admit deterministic probabilistic models is called KS-colourable.
• Kochen-Specker (KS) set: A KS set is a set of rank 1 projectors, $\{\Pi_w\}_{w \in W(H)}$, on some Hilbert space $\mathcal{H}$ that can be associated with the nodes W(H) of a KS-uncolourable scenario such that $\sum_{w \in f} \Pi_w = I$ for all f ∈ F(H) and $\Pi_w \Pi_{w'} = \delta_{w,w'} \Pi_w$ for any w, w′ ∈ f. Here I is the identity operator on $\mathcal{H}$.
Each KS set corresponds to an infinity of possible probabilistic models on a KS-uncolourable contextuality scenario, each given by $p(w) = \mathrm{Tr}(\rho\,\Pi_w)$ for all w ∈ W(H), for some density operator ρ on $\mathcal{H}$.
• (Induced) Subscenario: Given a contextuality scenario H with nodes W(H) and contexts F(H), the subscenario $H_S$ induced by a subset of nodes S ⊆ W(H) is given by: $W(H_S) \equiv S$, $F(H_S) \equiv \{f \cap S \mid f \in F(H)\}$. That is, $H_S$ is constructed by dropping all the nodes in W(H)\S and restricting the hyperedges in F(H) to their intersection with the nodes in S.
Remark on probabilistic models on $H_S$: Note that an induced subscenario $H_S$ admits a valid probabilistic model only if $f \cap S \neq \emptyset$ for all contexts f ∈ F(H), because otherwise the set of hyperedges in $H_S$ will include empty sets which cannot be normalized, rendering a probabilistic model impossible on $H_S$ (that is, $G(H_S) = \emptyset$). In the language of hypergraph theory, S must be a transversal (or "hitting set") of H.
• Extension of a probabilistic model: Every probabilistic model on H S , say p S , can be extended to a probabilistic model p on H as follows: p(w) = p S (w) ∀w ∈ S and p(w) = 0 otherwise. p is said to be an extension of p S from H S to H.
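The notions of probabilistic model, induced subscenario, transversal, and extension can be made concrete with a short sketch. The representation below (hyperedges as `frozenset`s of node labels, a model as a dictionary) and the 4-cycle example are assumptions made for illustration, not part of the framework of Ref. [24].

```python
def is_probabilistic_model(hyperedges, p):
    """True if p maps every node into [0, 1] and every hyperedge sums to 1."""
    nodes = set().union(*hyperedges)
    return (all(0 <= p[w] <= 1 for w in nodes)
            and all(sum(p[w] for w in f) == 1 for f in hyperedges))

def induced_subscenario(hyperedges, S):
    """Hyperedges of H_S: every f in F(H) restricted to its intersection with S."""
    return {frozenset(f & S) for f in hyperedges}

def is_transversal(hyperedges, S):
    """True if S intersects every hyperedge (S is a 'hitting set')."""
    return all(f & S for f in hyperedges)

def extend(p_S, nodes):
    """Extension of p_S from H_S to H: zero on every node outside S."""
    return {w: p_S.get(w, 0) for w in nodes}

# Illustrative scenario: a 4-cycle on the nodes a, b, c, d.
H = [frozenset(x) for x in ("ab", "bc", "cd", "da")]
W = set().union(*H)

S = {"a", "c"}                      # a transversal of H
H_S = induced_subscenario(H, S)     # two singleton hyperedges {a} and {c}
p_S = {"a": 1, "c": 1}              # the only model on singleton hyperedges
p = extend(p_S, W)                  # deterministic model on H
```

If S fails to be a transversal, `induced_subscenario` returns an empty hyperedge, whose probabilities sum to 0 rather than 1, so `is_probabilistic_model` rejects every assignment, in line with the remark above.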
We now recall Theorem 2.5.3 of Ref. [24], a characterization of extremal probabilistic models on a contextuality scenario, that we will use:

Theorem 1. (Theorem 2.5.3 in [24]) p ∈ G(H) is extremal if and only if it is the extension of a unique probabilistic model $p_S$ on an induced subscenario $H_S$.
Hence, there is a one-to-one correspondence between extremal probabilistic models on H and induced subscenarios of H with unique probabilistic models. Indeed, as noted in [24], this means that each extremal probabilistic model p ∈ G(H) is in one-to-one correspondence with the set of vertices assigned nonzero probability by the extremal probabilistic model, i.e., $S_p \equiv \{w \in W(H) \mid p(w) > 0\}$.
3 A family of KS-uncolourable scenarios: the mapping 2Reg(.)
We now consider a particular mapping, which we call 2Reg(.), that often converts a graph to a contextuality scenario that is KS-uncolourable. We will see that many known examples of KS-uncolourable scenarios arise in this way. The mapping 2Reg(.) is defined in the following manner: given a graph G = (V, E) with vertices V and edges $E = \{e_1, \dots, e_n\}$, the hypergraph H = (W, F) ≡ 2Reg(G) is constructed as follows.
- Each node w ∈ W is defined by a pair of edges in E that share a vertex, i.e., for every pair $\{e_i, e_j\}$ with $e_i \cap e_j \neq \emptyset$ and $i \neq j$, there is a node in W. The cardinality of W, |W|, is therefore equal to the number of distinct pairs of edges in E such that each pair shares a vertex (in V).
- For every edge $e_i \in E$, we define a corresponding hyperedge $f_i \in F$ such that the nodes in $f_i$ correspond precisely to the pairs of edges $\{\{e_i, e_j\} \mid e_i \cap e_j \neq \emptyset,\ j \neq i \text{ and } j \in \{1, \dots, n\}\}$. We have |F| = |E|.
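The construction just described can be sketched in a few lines. The function name `two_reg` and the choice to represent edges and nodes as `frozenset`s are illustrative assumptions; the sketch also assumes G is a simple graph, so that two distinct edges share at most one vertex. As an example it is applied to the complete bipartite graph $K_{3,3}$.

```python
from itertools import combinations

def two_reg(edges):
    """Sketch of 2Reg(G) for a simple graph G given as a list of vertex pairs.

    Returns (nodes, hyperedges): each node is an unordered pair of edges that
    share a vertex; the hyperedge for edge e collects the nodes containing e.
    """
    E = [frozenset(e) for e in edges]
    nodes = [frozenset({a, b}) for a, b in combinations(E, 2) if a & b]
    hyperedges = [frozenset(w for w in nodes if e in w) for e in E]
    return nodes, hyperedges

# Complete bipartite graph K_{3,3}: vertices 1, 2, 3 on one side and
# 1', 2', 3' on the other, with all nine edges between the two sides.
K33 = [(i, j + "'") for i in "123" for j in "123"]
W, F = two_reg(K33)

# Degree of each node, i.e. the number of hyperedges it appears in.
degrees = [sum(w in f for f in F) for w in W]
```

For $K_{3,3}$ this yields 18 nodes and 9 hyperedges of 4 nodes each, with every node of degree 2.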
A key property of such a hypergraph, H, generated via 2Reg(G) is that each node $w_k$ (corresponding to a pair $\{e_i, e_j\}$, say) in W appears in exactly two hyperedges ($f_i, f_j \in F$) of the hypergraph H. That is, 2Reg(G) is a 2-regular hypergraph for any graph G. This is simply because every node in H is essentially defined by the intersection of two hyperedges in H. To see how this mapping works consider an example: the complete bipartite graph $K_{3,3}$ with vertices V = {1, 2, 3, 1′, 2′, 3′} and edges E = {(11′), (12′), (13′), (21′), (22′), (23′), (31′), (32′), (33′)} transforms under 2Reg(.) to the CEGA hypergraph [19] which, given a realization with 18 rays in $\mathbb{R}^4$, provides a proof of the KS theorem in 4 dimensions. See Fig. 2. We will see how this mapping works in a forthcoming section on complete bipartite graphs under 2Reg(.).

KS-uncolourability under 2Reg(.)
To prove the converse, we show that |E| is even ⇒ 2Reg(G) is KS-colourable: Note that 2Reg(G) consists of a set of |E| contexts such that every pair of them with a non-empty intersection shares exactly one node. Let's call these contexts $C_1, C_2, C_3, \dots, C_{|E|}$, labelled such that $C_i$ and $C_{i+1}$ (addition modulo |E|, so |E| + 1 = 1) share a node for all i ∈ {1, 2, . . . , |E|}. Consider now the even hypercycle of size |E| given by the contexts $C_1, C_2, \dots, C_{|E|}$, and assign the probability 1 to the node defined by the intersection of $C_1$ and $C_2$ (denoted $C_1 - C_2$), probability 0 to the node defined by the intersection of $C_2$ and $C_3$ (denoted $C_2 - C_3$), 1 to the node defined by the intersection of $C_3$ and $C_4$ (denoted $C_3 - C_4$), and so on, alternating assignments of 1 and 0, up to assigning probability 0 to the node denoted $C_{|E|} - C_1$ (see footnote 5). The induced subscenario consisting of singleton hyperedges (that is, hyperedges with a single node each) then admits a unique probabilistic model which extends to a deterministic extremal probabilistic model on 2Reg(G). This is easy to see because the induced subscenario assigns probability 1 to |E|/2 nodes and each of those nodes appears in two contexts, thus ensuring that all the contexts are properly normalized in the extension of the unique probabilistic model to 2Reg(G). Hence, 2Reg(G) is KS-colourable whenever |E| is even.
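For small graphs, KS-colourability of 2Reg(G) can be checked by exhaustive search over {0, 1}-valued assignments. The sketch below is a brute-force illustration only (exponential in the number of nodes); the helper names `two_reg`, `is_ks_colourable`, and `cycle` are hypothetical, and G is assumed to be a simple graph.

```python
from itertools import combinations, product

def two_reg(edges):
    """Sketch of 2Reg(G): nodes are pairs of edges sharing a vertex."""
    E = [frozenset(e) for e in edges]
    nodes = [frozenset({a, b}) for a, b in combinations(E, 2) if a & b]
    hyperedges = [frozenset(w for w in nodes if e in w) for e in E]
    return nodes, hyperedges

def is_ks_colourable(nodes, hyperedges):
    """Search for a {0,1}-valued assignment normalizing every hyperedge."""
    for values in product((0, 1), repeat=len(nodes)):
        p = dict(zip(nodes, values))
        if all(sum(p[w] for w in f) == 1 for f in hyperedges):
            return True
    return False

def cycle(n):
    """Edge list of the n-cycle graph on vertices 0, ..., n-1 (n >= 3)."""
    return [(i, (i + 1) % n) for i in range(n)]

claw = [(0, 1), (0, 2), (0, 3)]             # K_{1,3}: three edges
star4 = [(0, 1), (0, 2), (0, 3), (0, 4)]    # K_{1,4}: four edges

colourable = {
    "3-cycle": is_ks_colourable(*two_reg(cycle(3))),
    "4-cycle": is_ks_colourable(*two_reg(cycle(4))),
    "5-cycle": is_ks_colourable(*two_reg(cycle(5))),
    "K_{1,3}": is_ks_colourable(*two_reg(claw)),
    "K_{1,4}": is_ks_colourable(*two_reg(star4)),
}
```

In these examples the graphs with an even number of edges give KS-colourable scenarios and those with an odd number of edges do not.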

From graphs to hypergraphs under 2Reg(.)
We will now prove some facts about the behaviour of some special classes of graphs under 2Reg(.).

Lemma 1. All n-cycle (n ≥ 3) graphs are invariant under 2Reg(.).
Proof. Given the n-cycle graph with edges $e_{12}, e_{23}, \dots, e_{n1}$, the hypergraph under 2Reg(.) has one hyperedge for each edge, for example $\{w_1 \equiv (e_{n1}, e_{12}),\ w_2 \equiv (e_{12}, e_{23})\}$ for the edge $e_{12}$, and so on cyclically, which is again an n-cycle.

Footnote 5: Note that "$C_i - C_{i+1}$" denotes the node that appears in both contexts, $C_i$ and $C_{i+1}$.

Proof. This follows from Theorem 2 since mn is the number of contexts in 2Reg($K_{m,n}$).
Examples of known KS-uncolourable contextuality scenarios that are of the type 2Reg($K_{m,n}$):
1. The 3-hypercycle (or "triangle") contextuality scenario from $K_{1,3}$: 2Reg($K_{1,3}$). This is the simplest KS-uncolourable scenario. It does not admit a KS set. See Fig. 4.
2. 2Reg($K_{1,5}$) in Ref. [25]. No KS set has been found for this scenario. The assignments in Ref. [25] are subnormalized (that is, the projectors in a basis do not add up to identity) and do not satisfy the definition of a KS set. See

Theorem 3. 2Reg(G) is a 3-hypercycle if and only if G is the claw graph $K_{1,3}$ or the 3-cycle (triangle) graph.
Proof. The proof is just by explicitly exhausting all possible 3-edge graphs and verifying whether they lead to a 3-hypercycle. For 2Reg(G) to be a 3-hypercycle, G must have three edges, say $\{e_1, e_2, e_3\}$. Further, since each node of 2Reg(G) is defined by a pair of edges in $\{e_1, e_2, e_3\}$, and since there are only three distinct pairs of edges in $\{e_1, e_2, e_3\}$, it must be the case that the intersection of each pair of edges corresponds to a vertex in G. However, not every vertex in G needs to be in the intersection of two edges.
A vertex in G can be of degree 1, 2, or 3. The fact that every pair of edges in G must have a non-empty intersection means that there exists at least one vertex of degree 2 in G given by $e_1 \cap e_2$ or $e_2 \cap e_3$ or $e_3 \cap e_1$. Let us denote the edges sharing this vertex by $\{e_i, e_j\}$ ($i \neq j \in \{1, 2, 3\}$), so that the vertex is $e_i \cap e_j$. We thus know that G has at least 3 vertices. The only option for the remaining edge, denoted $e_k$ ($k \neq i, j \in \{1, 2, 3\}$), in G, for 2Reg(G) to be a 3-hypercycle, is then to attach to the vertex $e_i \cap e_j$, thus producing a claw graph $K_{1,3}$ with 4 vertices and 3 edges, or to attach to the two degree 1 vertices of edges $e_i, e_j$, thus forming a 3-cycle or triangle graph.

Corollary 1. The mapping 2Reg(.) is not invertible.
Proof. This follows from Theorem 3. There does not exist an inverse mapping that, when applied to 2Reg(G) would yield G: in general, 2Reg(.) is a many-to-one mapping, hence non-invertible. The example of Theorem 3 illustrates this.
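The many-to-one behaviour can be illustrated directly: the claw graph and the triangle graph are not isomorphic, yet their images under 2Reg(.) coincide up to relabelling of nodes. The sketch below uses hypothetical helpers `two_reg` and `isomorphic`; the isomorphism test tries every relabelling of nodes, so it is only meant for very small scenarios without repeated hyperedges.

```python
from itertools import combinations, permutations

def two_reg(edges):
    """Sketch of 2Reg(G): nodes are pairs of edges sharing a vertex."""
    E = [frozenset(e) for e in edges]
    nodes = [frozenset({a, b}) for a, b in combinations(E, 2) if a & b]
    hyperedges = [frozenset(w for w in nodes if e in w) for e in E]
    return nodes, hyperedges

def isomorphic(H1, H2):
    """Brute-force test: is there a node relabelling mapping F1 onto F2?"""
    (W1, F1), (W2, F2) = H1, H2
    if len(W1) != len(W2) or len(F1) != len(F2):
        return False
    target = set(F2)
    for perm in permutations(W2):
        relabel = dict(zip(W1, perm))
        if {frozenset(relabel[w] for w in f) for f in F1} == target:
            return True
    return False

claw = [(0, 1), (0, 2), (0, 3)]        # K_{1,3}: 4 vertices, 3 edges
triangle = [(0, 1), (1, 2), (2, 0)]    # 3-cycle: 3 vertices, 3 edges
square = [(0, 1), (1, 2), (2, 3), (3, 0)]

same_image = isomorphic(two_reg(claw), two_reg(triangle))
different_image = isomorphic(two_reg(claw), two_reg(square))
```

Both the claw and the triangle map to a scenario with three nodes and three two-node hyperedges, i.e. the 3-hypercycle.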
Proof. Since 2Reg(G) is a k-hypercycle, G must have k edges, say {e 1 , e 2 , . . . , e k }. Since the nodes of 2Reg(G) are defined by pairs of edges in G with non-empty intersection, there must be k such pairs in {e 1 , e 2 , . . . , e k }. The intersection of each such pair is a vertex of G, hence G has at least k vertices. The degree of any vertex of G cannot be more than 2: for any vertex of degree 3 or more in G one would have the presence of a 3-hypercycle in 2Reg(G) (which could not then be a k-hypercycle, k ≥ 5) from Theorem 3. Hence, G is a graph with k edges and at least k vertices such that k pairs of edges have a non-empty intersection and each vertex of G is of degree no more than 2. These constraints fix G to be a k-cycle: any change to the k-cycle can only be done by adding vertices of degree 0, but no edges can be added. We can safely ignore such superfluous vertices (of degree 0) since they do not change anything about 2Reg(G).
Combining the above with Lemma 1, we have our result.

Note on matching scenarios
The contextuality scenario 2Reg(G) obtained from a graph G can be viewed as a matching scenario of another graph L(G), the line graph of G, in the sense of Ref. [24]. The line graph L(G) of a graph G is obtained by representing each edge of G as a vertex of L(G) and for each pair of edges in G that have a non-empty intersection we connect the corresponding vertices in L(G) by an edge.
Following Ref. [24], we then have: A matching scenario Mat(L(G)) is constructed as follows: every edge of L(G) is represented by a vertex of Mat(L(G)) and each hyperedge of Mat(L(G)) contains those vertices of Mat(L(G)) which represent edges of L(G) that have a non-empty intersection. We leave it as an exercise for the reader to verify that the composition of the two mappings L(.) and Mat(.) applied to G in that order as Mat • L(G) is the same as the mapping 2Reg(G) for any G.
Hence, 2Reg(G) is a matching scenario of L(G). Matching scenarios were discussed in [24]. Indeed, the following lemma shows how the matching scenarios corresponding to complete graphs, discussed in Ref. [24], arise from bipartite graphs under the mapping 2Reg(.).
is therefore the matching scenario Mat n of AFLS [24].
We refer the reader to Sec. 9.4 of Ref. [24] for further details on matching scenarios. We mention the connection to matching scenarios of L(G) here only for completeness and, as such, we will not discuss them hereafter. For our purposes, we want to infer properties of 2Reg(G) directly from G, instead of considering an intermediate graph L(G), since G has a much more compact representation than L(G) and the edge-hyperedge correspondence between G and 2Reg(G) can be exploited to understand the structure of 2Reg(G) and possible probabilistic models on it. This in turn lets us obtain our noncontextuality inequalities for such scenarios.
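The identity between Mat ∘ L and 2Reg(.) can be spot-checked on examples. The sketch below is an illustration under an assumed reading of the matching-scenario construction: one node per edge of the graph and one hyperedge per vertex, collecting the edges incident to that vertex. The helper names are hypothetical and G is assumed to be a simple graph.

```python
from itertools import combinations

def two_reg(edges):
    """Sketch of 2Reg(G): nodes are pairs of edges sharing a vertex."""
    E = [frozenset(e) for e in edges]
    nodes = [frozenset({a, b}) for a, b in combinations(E, 2) if a & b]
    hyperedges = [frozenset(w for w in nodes if e in w) for e in E]
    return nodes, hyperedges

def line_graph(edges):
    """L(G): one vertex per edge of G, adjacent when the edges intersect."""
    E = [frozenset(e) for e in edges]
    return E, [frozenset({a, b}) for a, b in combinations(E, 2) if a & b]

def matching_scenario(vertices, edges):
    """Assumed reading of Mat(.): nodes are the edges of the graph; each
    vertex contributes a hyperedge made of the edges incident to it."""
    return list(edges), [frozenset(e for e in edges if v in e) for v in vertices]

def same_scenario(graph_edges):
    """Compare Mat(L(G)) with 2Reg(G), hyperedge by hyperedge."""
    W1, F1 = two_reg(graph_edges)
    W2, F2 = matching_scenario(*line_graph(graph_edges))
    return set(W1) == set(W2) and F1 == F2

K33 = [(i, j + "'") for i in "123" for j in "123"]
path = [(0, 1), (1, 2), (2, 3)]
checks = [same_scenario(K33), same_scenario(path)]
```

With this reading the two constructions agree essentially by definition, since a vertex of L(G) is an edge of G and an edge of L(G) is a pair of intersecting edges of G.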
In the next section we define some parameters to characterize arbitrary contextuality scenarios in a systematic manner before we present our general approach to obtaining noise-robust noncontextuality inequalities from KS-uncolourable scenarios.

Uniform contextuality scenarios, their parameters, and KS-uncolourability
Before we can talk about noise-robust noncontextuality inequalities obtained from KS-uncolourable contextuality scenarios, we need a way to detect KS-uncolourability of a given scenario. We now define some parameters that we will use in our discussion of KS-uncolourability and then provide some (sufficient) conditions for KS-uncolourability of a contextuality scenario. In the process, we also answer some open questions that were posed in Ref. [27]. The conditions we obtain are enough for the contextuality scenarios we consider in this paper but the question of identifying conditions that are both necessary and sufficient for KS-uncolourability of a contextuality scenario remains open.
Consider a contextuality scenario, H = (W, F), where W is its set of nodes and F its set of hyperedges. Further, the scenario is such that there are d nodes in each hyperedge (that is, the hypergraph is d-uniform), |F| hyperedges, and |W| nodes in all. Let m be the number of nodes that appear in more than one hyperedge. We then have the following relations: $|W| = \sum_{k=1}^{D} n_k$, $m = \sum_{k=2}^{D} n_k = |W| - n_1$, and $d|F| = \sum_{k=1}^{D} k\,n_k$, where $n_k$ is the number of nodes which each appear in k distinct hyperedges (i.e., there exist $n_k$ nodes of degree k) and D is the largest number of distinct hyperedges any node can appear in (i.e., there exists a node which appears in D distinct hyperedges but no node that appears in D + 1 distinct hyperedges or, equivalently, D is the degree of a node of maximum degree in the hypergraph). Note that $d|F| = n_1 + \sum_{k=2}^{D} k\,n_k \geq n_1 + 2m \geq 2m$. We therefore have $m \leq d|F|/2$ (14) for any contextuality scenario. In Ref. [27], an algorithmic way to enumerate KS-uncolourable hypergraphs which admit KS sets was presented. The observations noted in Ref. [27] as a result of this algorithmic enumeration led to some conjectures, some of which we now answer by simply noting constraints between the parameters defined above.
An open question in Sec. 5(i) of Ref. [27] was: Is it true for arbitrary d that $m \leq d|F|/2$ (for KS-uncolourable hypergraphs which admit KS sets)? We have just answered this question in the affirmative (see Eq. (14)) for all d-uniform contextuality scenarios, not just those which are KS-uncolourable and admit KS sets. Further, $n_1 = 0 \Rightarrow m = |W| \leq d|F|/2$, that is, if every node in a contextuality scenario appears in at least 2 distinct hyperedges, then $m = |W| \leq d|F|/2$. More precisely, $|W| \leq d|F|/2 \Leftrightarrow n_1 \leq \sum_{k=3}^{D} (k - 2)\,n_k$. This characterizes all the contextuality scenarios for which $|W| \leq d|F|/2$. Hence, for any $n_1 > 0$, we must have enough nodes with degree greater than 2 for the relation $|W| \leq d|F|/2$ to hold. That is, roughly speaking, the inequality $|W| \leq d|F|/2$ ceases to hold when there are far more nodes of degree 1 than there are nodes of degree 3 or higher.
Indeed, the only known exception to $|W| \leq d|F|/2$ that Ref. [27] finds is the construction due to Kochen and Specker [1] (henceforth called the "KS67 construction") with |W| = 192, d = 3, |F| = 118, so that $d|F|/2 = 177$ and we have $|W| > d|F|/2$. However, it is still the case that $m = 117 < d|F|/2 = 177$ (strict inequality because some of the nodes appear in more than 2 hyperedges). Note that in this case $n_1 = 75$, which is much greater than $n_3 + 7n_9 = 45$.
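The KS67 numbers quoted above can be cross-checked with a few lines of arithmetic. In the sketch below, $n_1 = 75$ and $n_9 = 3$ are taken from the text, while $n_3$ and $n_2$ are inferred from $n_3 + 7n_9 = 45$ and $m = 117$ under the assumption that only degrees 1, 2, 3, and 9 occur; those two inferred values are not stated in the source.

```python
# Degree counts n_k for the KS67 hypergraph.
# Quoted in the text: n_1 = 75, n_9 = 3, m = 117, |W| = 192, d = 3, |F| = 118.
# Inferred here (assumption: only degrees 1, 2, 3, 9 occur): n_3 and n_2.
n = {1: 75, 9: 3}
n[3] = 45 - 7 * n[9]            # from n_3 + 7 n_9 = 45
n[2] = 117 - n[3] - n[9]        # from m = n_2 + n_3 + n_9 = 117

d, num_hyperedges = 3, 118

num_nodes = sum(n.values())                        # |W|
m = num_nodes - n[1]                               # nodes of degree >= 2
incidences = sum(k * nk for k, nk in n.items())    # sum_k k n_k
surplus = sum((k - 2) * nk for k, nk in n.items() if k >= 3)

counts_consistent = (incidences == d * num_hyperedges)
m_bound_holds = (2 * m <= d * num_hyperedges)          # m <= d|F|/2
w_bound_holds = (2 * num_nodes <= d * num_hyperedges)  # |W| <= d|F|/2
```

The double-counting identity (each of the |F| hyperedges contributes d node incidences) is satisfied by these values, the bound on m holds, and the bound on |W| fails, matching the comparison of $n_1$ with $n_3 + 7n_9$.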

Conditions on the parameters of a contextuality scenario for its KS-uncolourability
Recall that $|W| = \sum_{k=1}^{D} n_k$, $d|F| = \sum_{k=1}^{D} k\,n_k$, and $m = |W| - n_1$ for any d-uniform contextuality scenario. We want to find conditions that rule out the existence of a {0, 1}-valued solution for the system of |F| normalization equations in |W| variables with d variables in each equation.

2-regular contextuality scenarios
From Theorem 2 we know that any 2-regular contextuality scenario is KS-uncolourable if and only if it has an odd number of contexts.

d-uniform and 2-regular contextuality scenarios of type 2Reg(G)
For d-uniform and 2-regular contextuality scenarios (every node of degree 2) we have $m = |W| = d|F|/2$. There are |F| normalization equations in $d|F|/2$ variables, each variable appearing in two equations. For any graph G, 2Reg(G) is a 2-regular contextuality scenario and 2Reg(G) is KS-uncolourable if and only if |F| is odd (Theorem 2). If we further require 2Reg(G) to be a d-uniform contextuality scenario, then d must be even for any 2Reg(G) if |F| is odd, since $|W| = d|F|/2$ has to be an integer. Hence, KS-uncolourability holds for those (and only those) d-uniform 2Reg(G) contextuality scenarios which have even d and odd |F|: any KS set satisfying such KS-uncolourability can therefore only be constructed on an even-dimensional Hilbert space.
All proofs of the Kochen-Specker theorem relying on d-uniform contextuality scenarios of type 2Reg(G) must therefore require an even-dimensional Hilbert space. In other words, it is impossible to realize KS sets for these scenarios in odd dimensions.
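The parity constraint can be illustrated on the family 2Reg($K_{m,n}$). The sketch below builds a few small complete bipartite graphs, computes the uniformity d, the number of hyperedges |F|, and the number of nodes |W| of their images, and records whether $|W| = d|F|/2$ holds and whether odd |F| forces even d. The helper names and the range of (m, n) are illustrative choices.

```python
from itertools import combinations

def two_reg(edges):
    """Sketch of 2Reg(G): nodes are pairs of edges sharing a vertex."""
    E = [frozenset(e) for e in edges]
    nodes = [frozenset({a, b}) for a, b in combinations(E, 2) if a & b]
    hyperedges = [frozenset(w for w in nodes if e in w) for e in E]
    return nodes, hyperedges

def complete_bipartite(m, n):
    """Edge list of K_{m,n} with parts labelled ('a', i) and ('b', j)."""
    return [(("a", i), ("b", j)) for i in range(m) for j in range(n)]

def uniformity(hyperedges):
    """Common hyperedge size d, or None if the hypergraph is not uniform."""
    sizes = {len(f) for f in hyperedges}
    return sizes.pop() if len(sizes) == 1 else None

results = {}
for m in range(1, 4):
    for n in range(2, 5):
        W, F = two_reg(complete_bipartite(m, n))
        results[(m, n)] = (uniformity(F), len(F), len(W))

# |W| = d|F|/2 in every case, and odd |F| only occurs together with even d.
node_count_ok = all(2 * w == d * f for d, f, w in results.values())
parity_ok = all(d % 2 == 0 for d, f, w in results.values() if f % 2 == 1)
```

For instance, `results[(3, 3)]` is `(4, 9, 18)`, the parameters of the CEGA hypergraph discussed earlier.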

d-uniform contextuality scenarios
For more general d-uniform hypergraphs (that is, not necessarily restricted to those of the type 2Reg(G)) we have: |F| equations with |W| variables, $n_k$ of them appearing in exactly k equations (for all k ∈ {1, 2, . . . , D}), and each equation a sum of d variables adding up to 1. Such a hypergraph is said to be KS-uncolourable when these equations do not admit a {0, 1}-valued solution. We now give some sufficient conditions for KS-uncolourability of these hypergraphs.
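Let us denote by $W_k$ the set of nodes such that each node in the set appears in k equations (contexts), i.e., $W_k \equiv \{w^{(k)}_j\}_{j=1}^{n_k}$. The cardinality of the set $W_k$ is $n_k$. On adding up the |F| equations, we have: $\sum_{k=1}^{D} k \sum_{j=1}^{n_k} p(w^{(k)}_j) = |F|$, where $p(w^{(k)}_j) \in \{0, 1\}$. The sum $\sum_{k=1}^{D} k \sum_{j=1}^{n_k} p(w^{(k)}_j)$ is an even number if and only if an even number of nodes of each odd degree k (such that $n_k > 0$) are assigned the value 1, i.e., $\sum_{j=1}^{n_k} p(w^{(k)}_j)$ is an even number for all odd k with $n_k > 0$.
Proof. This should be clear from noting that $\sum_{k=1}^{D} k \sum_{j=1}^{n_k} p(w^{(k)}_j) = \sum_{\text{even } k} k \sum_{j=1}^{n_k} p(w^{(k)}_j) + \sum_{\text{odd } k} k \sum_{j=1}^{n_k} p(w^{(k)}_j)$. The even k part of the sum is even simply because all the terms in the sum over k are even. The odd k part of the sum over k is even if and only if, in each term, $\sum_{j=1}^{n_k} p(w^{(k)}_j)$ is an even number for all odd k with $n_k > 0$.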

Lemma 6. A KS contradiction arises if one of the following holds:
1. |F| is odd and $\sum_{j=1}^{n_k} p(w^{(k)}_j)$ is an even number for all odd k with $n_k > 0$.
2. |F| is even and there exists an odd k with $n_k > 0$ such that $\sum_{j=1}^{n_k} p(w^{(k)}_j)$ is an odd number.
Proof. This is trivially the case because an even number ≠ an odd number.
On adding up the 118 normalization constraints on KS67, we have: $\sum_{k \in \{1, 2, 3, 9\}} k \sum_{j=1}^{n_k} p(w^{(k)}_j) = 118$. We know that one of the normalization equations is $\sum_{j=1}^{3} p(w^{(9)}_j) = 1$, hence there exists an odd k = 9 (with $n_k = n_9 = 3 > 0$) such that $\sum_{j=1}^{3} p(w^{(9)}_j) = 1$, an odd number. Since |F| = 118 is even, we have a KS contradiction for this hypergraph from Lemma 6.

Contextuality scenarios and extremal probabilistic models on them
Once we know that a contextuality scenario is KS-uncolourable, what can we say about the structure of extremal probabilistic models on it? If the extremal probabilistic models on a contextuality scenario admit a "nice" characterization, then this can be used in obtaining noise-robust noncontextuality inequalities. In this section, we consider contextuality scenarios of type 2Reg(.) and extremal probabilistic models on them. Later we will use this characterization for obtaining noise-robust noncontextuality inequalities.
Note that since 2Reg(G) are matching scenarios in the sense of Ref. [24], our characterization of extremal probabilistic models presented below is already known from graph-theoretic methods used in Ref. [24]. All we have done below is to provide an alternative self-contained proof that proceeds directly from Theorem 1 instead of relying on known results from graph theory. This is partly motivated by a need to explore in more generality (than done in Ref. [24]) the consequences of Theorem 1, many of which we will later use in obtaining noise-robust noncontextuality inequalities.

Theorem 6. Each extremal probabilistic model on 2Reg(G) for any graph G is an extension of the unique probabilistic model on an induced subscenario consisting of some set of disjoint odd k-hypercycles, k ∈ {3, 5, . . . }, and/or some set of singleton contexts. Hence, each extremal probabilistic model on 2Reg(G) assigns probabilities from $\{0, \tfrac{1}{2}, 1\}$ to the nodes of 2Reg(G).

Proof. We know that each extremal probabilistic model p on H ≡ 2Reg(G) is in one-to-one correspondence with the set of nodes to which it assigns nonzero probability: $S_p \equiv \{w \in W(H) \mid p(w) > 0\}$.
For any graph $G$, every node in $H$ ($= 2\mathrm{Reg}(G)$) appears in two contexts. Hence, in any induced subscenario $H_{S_p}$ defined by a subset of nodes of $H$ given by $S_p$ (for extremal probabilistic model $p$ on $H$), none of the nodes can appear in more than two contexts and none of them can be assigned probability zero. This leaves two possibilities for nodes in $H_{S_p}$: either a node appears in one context or it appears in two contexts.
Let us denote by $q_{S_p}$ the restriction of extremal probabilistic model $p$ on $H$ to the unique probabilistic model on $H_{S_p}$: $q_{S_p}(w) = p(w) > 0$ for all $w \in S_p$ (and $p(w) = 0$ for all $w \in W \setminus S_p$).
Consider the set of nodes $S_p^{(1)} \subseteq S_p$ such that each of these nodes appears in exactly one context in $H_{S_p}$ and the subhypergraph $H_1$ obtained from $H_{S_p}$ by deleting all nodes in $S_p \setminus S_p^{(1)}$. That is, $H_1$ is a union of (disjoint) singleton contexts. Now consider the remaining set of nodes $S_p^{(2)}$ (that is, $S_p \setminus S_p^{(1)}$) such that each node appears in exactly two contexts in $H_{S_p}$ and the subhypergraph $H_2$ obtained from $H_{S_p}$ by deleting all nodes in $S_p^{(1)}$.
We will now show that $H_2$ is a union of disjoint odd k-hypercycles, $k \geq 3$: for $H_2$, the number of nodes of degree 2 is $n_2 = |S_p^{(2)}|$, and the restriction of extremal probabilistic model $p$ on $H$ to nodes in $H_2$, defined by $q_{S_p^{(2)}}(w) = p(w)$ for all $w \in S_p^{(2)}$, is also extremal on $H_2$ (otherwise $p$ on $H$ can't be extremal). Now, the existence of a probabilistic model $q_{S_p^{(2)}}$ which means that $\sum_{f \in F(H_2)} d(f) = 2|F(H_2)|$. We therefore have: $|F(H_2)| = |W(H_2)|$ and $d(f) = 2$ for all $f \in F(H_2)$. This gives us the following characterization of $H_2$: This follows from noting that, firstly, $H_2$ is a 2-uniform hypergraph (that is, $d(f) = 2$ for all $f \in F(H_2)$), hence really a graph. Secondly, a k-cycle ($k \geq 3$) is defined as a connected graph where each vertex is of degree 2 and the number of edges is equal to the number of vertices. Hence, a graph where each vertex is of degree 2 and the number of edges is equal to the number of vertices is a union of disjoint k-cycles. $H_2$ is just such a (hyper)graph.
Together with the fact that $H_2$ admits the unique probabilistic model $q_{S_p^{(2)}} > 0$, this means that the k-hypercycles in the disjoint union have, in fact, odd $k \geq 3$. This is because even k-hypercycles admit nonunique deterministic extremal probabilistic models which allow for assignment of probability 0 to some nodes. That is, given the characterization above, $H_2$ admits a unique probabilistic model $\Leftrightarrow$ $H_2$ is a union of disjoint odd k-hypercycles ($k \geq 3$).
This in turn implies that the unique probabilistic model $q_{S_p^{(2)}} > 0$ is in fact given by $q_{S_p^{(2)}}(w) = \frac{1}{2}$ for all $w \in S_p^{(2)}$. Hence: every extremal probabilistic model $p$ on $H = 2\mathrm{Reg}(G)$ is given by the induced subscenario $H_{S_p}$ consisting of a disjoint union of $H_1$ and $H_2$, the former a set of singleton contexts and the latter a union of disjoint odd k-hypercycles. Overall, $p(w) = 1$ for all $w \in S_p^{(1)}$, $p(w) = \frac{1}{2}$ for all $w \in S_p^{(2)}$, and $p(w) = 0$ for all $w \in W \setminus S_p$ (where $S_p = S_p^{(1)} \cup S_p^{(2)}$).
Combining Theorems 3, 4, and 6, we have that we only need to look for the presence of $K_{1,3}$ and k-cycles ($k \geq 3$) in any graph $G$ in order to ascertain all the extremal probabilistic models on the scenario 2Reg(G).
Proof. Taking the complement of $K_{m,n}$, once subgraph $K_{1,k}$ or $K_{k,1}$ is removed, we denote the resulting graph as $G(K_{m,n} \setminus K_{1,k})$ or $G(K_{m,n} \setminus K_{k,1})$, respectively. Since both $G(K_{m,n} \setminus K_{1,k})$ and $G(K_{m,n} \setminus K_{k,1})$ have $mn - k$ (an even number) edges, from Theorem 2 we have that $2\mathrm{Reg}(G(K_{m,n} \setminus K_{1,k}))$ and $2\mathrm{Reg}(G(K_{m,n} \setminus K_{k,1}))$ are KS-colourable and therefore admit extremal probabilistic models induced entirely by singletons. 8 Now consider an induced subscenario of $2\mathrm{Reg}(G(K_{m,n} \setminus K_{1,k}))$ or $2\mathrm{Reg}(G(K_{m,n} \setminus K_{k,1}))$ that consists entirely of singletons so that the number of hyperedges in $2\mathrm{Reg}(G(K_{m,n} \setminus K_{1,k}))$ or $2\mathrm{Reg}(G(K_{m,n} \setminus K_{k,1}))$ is twice the number of singletons in this induced subscenario. 9 Extending this induced subscenario by adding a k-hypercycle (disjoint from the subscenario) leads to an induced subscenario of $2\mathrm{Reg}(K_{m,n})$, namely, one that is a union of a k-hypercycle with the singletons. (See Fig. 7 for an illustration in the case of $K_{3,3}$.)
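To make the structure of such extremal models concrete, the following Python sketch builds 2Reg($K_{3,3}$) and checks one model of the kind described above: a 3-hypercycle with probabilities 1/2 together with singletons of probability 1. It is only an illustration under a stated assumption, namely that contexts of 2Reg(G) correspond to edges of G and nodes correspond to pairs of edges sharing a vertex; this reading is consistent with the counts quoted in the text (9 contexts with 4 nodes each, every node in two contexts). The function name and the labelling of edges by pairs (i, j) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def two_reg_complete_bipartite(m, n):
    """Assumed construction of 2Reg(K_{m,n}).

    Contexts <-> edges (i, j) of K_{m,n}; nodes <-> pairs of edges that
    share a vertex (same i or same j). This is an assumption of the sketch.
    """
    edges = [(i, j) for i in range(1, m + 1) for j in range(1, n + 1)]
    nodes = [frozenset(pair) for pair in combinations(edges, 2)
             if pair[0][0] == pair[1][0] or pair[0][1] == pair[1][1]]
    contexts = {e: [w for w in nodes if e in w] for e in edges}
    return edges, nodes, contexts


edges, nodes, contexts = two_reg_complete_bipartite(3, 3)

# A claw (K_{1,3}) at the first left vertex gives a 3-hypercycle.
claw = [(1, 1), (1, 2), (1, 3)]
p = {w: Fraction(0) for w in nodes}
for pair in combinations(claw, 2):
    p[frozenset(pair)] = Fraction(1, 2)          # nodes of the 3-hypercycle
for j in (1, 2, 3):
    p[frozenset({(2, j), (3, j)})] = Fraction(1)  # singleton nodes

# Every context must be normalized for p to be a probabilistic model.
normalized = all(sum(p[w] for w in contexts[e]) == 1 for e in edges)
max_prob = {e: max(p[w] for w in contexts[e]) for e in edges}
deterministic = [e for e in edges if max_prob[e] == 1]
indeterministic = [e for e in edges if max_prob[e] < 1]
```

In this sketch the three contexts of the claw are indeterministic (max-probability 1/2) and the remaining six are deterministic, appearing in pairs that share a node of probability 1.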

Noise-robust noncontextuality inequalities from a hypergraph invariant
We are finally in a position to use the understanding developed so far to obtain noise-robust noncontextuality inequalities for KS-uncolourable contextuality scenarios. The noise-robust noncontextuality inequalities reported in Ref. [6] and the more fine-grained ones for the case of the 18 ray scenario [19] reported in Ref. [28] will be seen to be special cases of our inequalities. We begin with an outline of the general framework within which our noise-robust noncontextuality inequalities will be obtained. In particular, we will introduce a hypergraph invariant and make precise the role that it plays in our inequalities. This framework is applicable to any KS-uncolourable contextuality scenario and we will study some well-known examples of such scenarios. This is in contrast to the framework of Ref. [14] which is only applicable to contextuality scenarios that are KS-colourable and also satisfy the property that all probabilistic models on them obey consistent exclusivity à la AFLS [24].

Operational equivalences
In keeping with the treatment in Ref. [6], we associate with each KS-uncolourable scenario two kinds of hypergraphs: one corresponding to the operational equivalences presumed between measurement events and another corresponding to operational equivalences presumed between source events. 10 Assuming the contextuality scenario consists of $n$ (measurement) contexts with $d$ nodes each, we consider $n$ measurement settings $M_i$, $i \in \{1, 2, \dots, n\} \equiv [n]$, each with $d$ possible outcomes, $m_i \in \{1, 2, \dots, d\} \equiv [d]$. We also consider source settings $S_i$, $i \in [n]$, each with $d$ possible outcomes $s_i \in [d]$, such that the following operational equivalences hold among the sources: In terms of the joint probabilities $p(m, s_i|M, S_i)$, these operational equivalences read: Given that $p(m, s|M, S) = \sum_{\lambda \in \Lambda} \xi(m|M, \lambda)\mu(s|S, \lambda)\mu(\lambda|S)$, the assumption of preparation noncontextuality then says:
Figure 8: Operational equivalences between measurements are depicted on the left and the operational equivalences between sources are depicted on the right.
We recall the measurement events and source events hypergraphs for the example of Ref. [6] in Fig. 8, where $n = 9$ and $d = 4$. The noncontextuality inequality of Ref. [6] that follows from the operational equivalences for sources and measurements then reads: where is the average max-probability for a given $\lambda \in \Lambda$. Hence, the noncontextuality inequality bounds $A$ by the maximum average max-probability of the measurements $M_i$ possible in any ontological model. Noting that $p(s_i = m_i|S_i) = \frac{1}{4}$ for all $m_i \in [4]$ for the scenario of Ref. [6], we can rewrite the quantity $A$ as: In the general case of $n$ measurement procedures with $d$ outcomes each, the expression for $A$ reads Furthermore, we need not even restrict ourselves to a uniform average of the source-measurement correlation over the source and measurement settings and allow instead a weighted average given by some probability distribution $q \equiv \{q_i\}_{i=1}^{n}$, where $q_i \geq 0$ for all $i$ and $\sum_i q_i = 1$. The quantity $A$ then becomes the source-measurement correlation quantity Corr that was previously defined in Ref. [14] and can be upper bounded as follows (following Ref. [6]): $\mathrm{Corr} \leq \beta(\Gamma, q) < 1$ (for some choices of $q$), using measurement noncontextuality and KS-uncolourability.
Here, $\beta(\Gamma, q)$ is the weighted max-predictability for a contextuality scenario $\Gamma$, first defined in Ref. [14] as $\beta(\Gamma, q) \equiv \max_{\lambda \in \Lambda_{\rm ind}} \sum_{i=1}^{n} q_i \zeta(M_i, \lambda)$, where $\zeta(M_i, \lambda)$ is the max-probability of measurement $M_i$ for ontic state $\lambda$ and $\Lambda_{\rm ind}$ is the set of ontic states which all assign probabilities valued in the interval (0, 1) to at least one measurement context in the contextuality scenario. In this paper, the contextuality scenario $\Gamma$ is KS-uncolourable, hence the qualifier that the maximization in the definition of $\beta(\Gamma, q)$ is taken over only the ontic states that make indeterministic assignments of probabilities to the measurement events (that is, $\Lambda_{\rm ind}$) is unnecessary: all ontic states assigning probabilities to the measurement events of a KS-uncolourable $\Gamma$ must necessarily assign some probabilities valued in (0, 1) (i.e., $\Lambda = \Lambda_{\rm ind}$). Note also that $\beta(\Gamma, q)$ need not always be strictly less than 1 for KS-uncolourable $\Gamma$: this can happen, for example, if $q$ is supported only on those contexts (if they exist) which are all assigned {0, 1}-valued probabilities (i.e., they are deterministic) by some extremal probabilistic model on $\Gamma$. On the other hand, we know that there always exists a choice of $q$ such that $\beta(\Gamma, q) < 1$ simply because of the KS-uncolourability of $\Gamma$, e.g., any choice where $q$ is supported on all the contexts of $\Gamma$ (such as $q$ being a uniform probability distribution over the measurement contexts), so that there is no extremal probabilistic model that makes all the contexts deterministic.
In the following sections, we will show how bounds on Corr when $q$ is supported on certain (sub)sets of contexts (which we will call minimally indeterministic sets of contexts, or MISCs, below) can be obtained from conceptual arguments instead of a brute-force computational approach. We will show that for such MISCs (instead of all the contexts in a contextuality scenario), we can obtain noncontextuality inequalities of the type: where $c$ ($< n$) is the number of contexts in a MISC, each context denoting a measurement setting $M_{r_i}$, where $q_{r_i} > 0$ for all $i \in \{1, 2, \dots, c\}$, $\sum_{i=1}^{c} q_{r_i} = 1$, and $r_i \in \{1, 2, \dots, n\}$ are all distinct.
If the contextuality scenario $\Gamma$ were KS-colourable, then we would have $\max_{\lambda \in \Lambda} \sum_{i=1}^{n} q_i \zeta(M_i, \lambda) = 1$ for any choice of $q$ (corresponding to any set of $c$ contexts); however, since $\Gamma$ is KS-uncolourable, there necessarily exist one or more sets of $c$ contexts (for some $c$) such that $\max_{\lambda \in \Lambda} \sum_{i=1}^{n} q_i \zeta(M_i, \lambda) < 1$ when $q$ is supported on such sets. Note that while in the former case $\beta(\Gamma, q)$ is undefined, in the latter case we have $\beta(\Gamma, q) = \max_{\lambda \in \Lambda} \sum_{i=1}^{n} q_i \zeta(M_i, \lambda)$. The MISCs we define below are examples of such sets of $c$ contexts for which $\beta(\Gamma, q) < 1$. Finding a MISC and computing its $\beta(\Gamma, q)$ value yields a noise-robust noncontextuality inequality in our framework.
We will be interested in finding all the irreducible MISCs (or as we define them later, "irrMISCs") in a KS-uncolourable contextuality scenario: finding them and evaluating their $\beta(\Gamma, q)$ values amounts to identifying a minimal set of independent noncontextuality inequalities for that scenario; from these, all the other MISC inequalities can be obtained by coarse-graining.

Minimally Indeterministic Sets of Contexts (MISCs)
We now consider assignments of probabilistic models to a contextuality scenario specified by an ontic state λ ∈ Λ according to the response functions ξ(m i |M i , λ) ∈ [0, 1]. A deterministic context is one where all the measurement outcomes are assigned {0, 1}-valued probabilities, i.e., ξ(m i |M i , λ) ∈ {0, 1} for all m i , M i . An indeterministic context is one which is not deterministic, i.e., it only allows probability assignments in [0, 1) to the measurement outcomes. The max-probability for a deterministic context is 1 while for an indeterministic context it is less than 1.

Minimally Indeterministic Set of Contexts (MISC) of size c:
A set of c contexts such that no more than c − 1 of them can be made deterministic by any (extremal) probabilistic model on the contextuality scenario, i.e., β(Γ, q) < 1 when q is supported entirely on such a set of c contexts.

Noise-robust noncontextuality inequalities for any KS-uncolourable contextuality scenario
Simple noncontextuality inequalities can be obtained from a KS-uncolourable contextuality scenario by identifying the following type of MISCs: For a KS-uncolourable contextuality scenario (with, say, n contexts), every extremal probabilistic model will make some of the contexts indeterministic. Let k be the smallest number of such indeterministic contexts present in any extremal probabilistic model on the KS-uncolourable contextuality scenario. Then any set of n − k + 1 contexts (out of all the n) constitutes a MISC, i.e., β(Γ, q) < 1 when q is supported entirely over this set of contexts.
From Theorem 1, for a KS-uncolourable contextuality scenario, every extremal probabilistic model is in one-to-one correspondence with an induced subscenario admitting a unique probabilistic model. KS-uncolourability means that any induced subscenario with a unique probabilistic model would necessarily contain hyperedges that are non-singleton (i.e., containing more than one node) with their nodes assigned probabilities less than 1. Ignoring the singleton hyperedges in such an induced subscenario (i.e., those containing exactly one node), all the remaining hyperedges are indeterministic. We refer to the subscenario consisting of these remaining (indeterministic) hyperedges and the nodes they contain as an induced indeterministic subscenario. $k$ is then the number of contexts in the smallest (in terms of the number of contexts) induced indeterministic subscenario obtained from an induced subscenario with a unique probabilistic model. Now note that the hypergraph with the least number of contexts (and containing no singleton contexts) admitting a unique probabilistic model is a 3-hypercycle. Hence, a 3-hypercycle is the smallest induced indeterministic subscenario possible and we have the following.
Sufficient condition for a set of contexts to be a MISC: For any KS-uncolourable contextuality scenario it will be the case that $k \geq 3$ and any set of $n - 2$ contexts in the scenario will form a MISC, i.e., $\beta(\Gamma, q) < 1$ when $q$ is supported on any set of $n - 2$ contexts.
For an example, see Fig. 7(c), second column, for an induced indeterministic subscenario of the 18 ray hypergraph [19] and the third column for the induced subscenario of which the induced indeterministic subscenario is a part.
Given that $k$ is the size of the smallest induced indeterministic subscenario, we have a noncontextuality inequality whenever $q$ is supported on any set of $n - k + 1$ contexts (which constitute a MISC): If we take $q_{r_i} = \frac{1}{n-k+1}$ for all $i \in \{1, 2, \dots, n - k + 1\}$, we have the following noncontextuality inequality for a MISC consisting of $n - k + 1$ contexts: where $p_{\max} \in [\frac{1}{d}, 1)$ is the largest max-probability associated with any indeterministic context included in the MISC. This max-probability corresponds to an extremal probabilistic model that makes all but one of the contexts in the MISC deterministic. In the case of the 18 ray scenario, for example, $k = 3$ and any set of $n - k + 1 = 9 - 3 + 1 = 7$ contexts forms a MISC and we have $p_{\max} = \frac{1}{2}$, so that the upper bound in the above inequality is given by $\frac{13}{14}$. (See Fig. 7, third column: the six deterministic contexts together with any one of the three indeterministic contexts form such a seven-context MISC.)
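The arithmetic behind the value 13/14 can be sketched in a few lines of Python. The sketch assumes, as described above, that under uniform $q$ each deterministic context contributes a max-probability of 1 and each indeterministic context contributes its own max-probability (here 1/2, as for a 3-hypercycle); the helper name `misc_bound` is hypothetical and not taken from the references.

```python
from fractions import Fraction


def misc_bound(num_deterministic, indeterministic_max_probs):
    """Uniform average of max-probabilities over a set of contexts.

    Deterministic contexts contribute 1 each; every indeterministic context
    contributes the max-probability listed for it.
    """
    c = num_deterministic + len(indeterministic_max_probs)
    return (num_deterministic + sum(indeterministic_max_probs)) / c


half = Fraction(1, 2)
# 18 ray scenario: n = 9, k = 3, so at most 6 contexts are deterministic.
bound_7 = misc_bound(6, [half])        # seven-context MISC
bound_8 = misc_bound(6, [half, half])  # eight contexts
bound_9 = misc_bound(6, [half] * 3)    # all nine contexts
```

Exact rational arithmetic is used so that the bounds can be compared directly with the fractions quoted in the text.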

Sufficient condition for a set of contexts to be a MISC is not necessary
While the sufficient condition outlined above for a set of contexts in a contextuality scenario to be a MISC works for any KS-uncolourable contextuality scenario and yields noncontextuality inequalities, it is not a necessary condition. It is possible to identify smaller MISCs depending on the particular contextuality scenario and the probabilistic models on it.
Consider, for example, all the scenarios of the type 2Reg(G) that we have discussed. In these scenarios, each node appears in two contexts and therefore deterministic contexts appear in pairs in any extremal probabilistic model on these scenarios: this is because the deterministic contexts in any extremal probabilistic model are determined by singleton hyperedges in the induced subscenario and the node in a singleton hyperedge (assigned probability 1) appears in two contexts in the full contextuality scenario (see Fig. 7, third column, for example). It then becomes possible to reduce the MISCs of size $n - k + 1$ that we have identified above to MISCs of size $\frac{n-k}{2} + 1$ simply by taking the given MISC and omitting one of each pair of deterministic contexts that share a node in the given MISC. For example, see Fig. 7(c), third column, where three deterministic contexts together with an indeterministic context form a four-context MISC.
Since a MISC may thus contain smaller MISCs, we define the notion of an "irreducible MISC":

Irreducible MISC (irrMISC): A MISC which does not contain another MISC as a proper subset.
Therefore, an irrMISC is such that for its every proper subset there exists an extremal probabilistic model in which this proper subset is deterministic. As we noted, the MISCs of size $n - k + 1$ we have identified above can be reduced to MISCs of size $\frac{n-k}{2} + 1$ in 2Reg(.) scenarios. Are these MISCs of size $\frac{n-k}{2} + 1$ irreducible? Not necessarily.
In general, it is possible to identify proper subsets of MISCs which are irreducible MISCs. Let us see how this plays out for some scenarios we will consider in detail here: 2Reg(K 3,3 ), 2Reg (K 1,7 ), and the general case of 2Reg(K 1,n ) (odd n > 1). For concreteness, we will assume hereon that q is a uniform distribution over all the contexts in a MISC, although our identification of MISCs does not rely on this choice.
After illustrating the underlying ideas via these explicit examples, we will conclude with a general theorem characterizing irrMISCs in contextuality scenarios of type 2Reg(K m,n ) (with odd mn > 1).

2Reg(K 3,3 )
Denoting the edges of The noncontextuality inequality corresponding to each 7-context MISC (MISC j (7), j = 1, 2, 3) is given by More generally, from the fact that the contextuality scenario admits 3-hypercycles, we have that the average predictability is constrained for any set of c ≥ 7 contexts (out of 9) since at most 6 of them can be made deterministic but not the remaining ones by any extremal probabilistic model. Then for any choice of c contexts such that c = 7, 8, 9, we have noncontextuality inequalities with Corr MISCj (c) constrained by 13/14, 7/8, and 5/6 respectively.
In all, there are 9 such MISC(4) and they are irreducible, i.e., no proper subset of any of these 9 MISCs forms a MISC. This is easy to verify, for example, for the MISC(4) {(31), (21), (12), (13)}: every proper subset of this MISC(4) appears in one of the six deterministic sets of contexts. These irrMISCs are depicted in Fig. 9 and listed below: Each of these irrMISC(4)s (with uniform $q$) corresponds to a noncontextuality inequality: Are there any still smaller MISCs, say MISC(3), in this contextuality scenario? Indeed, such MISCs exist and they correspond precisely to the perfect matchings of the graph $K_{3,3}$. Each of the six vertices of $K_{3,3}$ is an origin of a 3-hypercycle (corresponding to 2Reg($K_{1,3}$); see Fig. 13) in the contextuality scenario 2Reg($K_{3,3}$). Hence, a perfect matching (namely, a set of disjoint edges such that they cover all the six vertices of the graph) ensures that the three hyperedges corresponding to these edges in the perfect matching cannot all be made deterministic by any 3-hypercycle extremal probabilistic model on 2Reg($K_{3,3}$). This is because at least one of the three hyperedges, e.g. {(11), (22), (33)}, will be indeterministic (forming a part of a 3-hypercycle) in these extremal probabilistic models. Indeed, these three hyperedges can't be made deterministic by any extremal probabilistic model at all, since all extremal probabilistic models on 2Reg($K_{3,3}$) are induced by odd hypercycles and the remaining extremal probabilistic models must therefore contain at least a 5-hypercycle. A 5-hypercycle extremal probabilistic model would make 4 contexts deterministic, but these 4 contexts will come in pairs that each share a deterministic node assigned probability 1.
This means a maximum of 2 independent deterministic contexts in any other extremal probabilistic models besides those induced by 3-hypercycles: hence these extremal probabilistic models cannot make more than two of the three contexts in a perfect matching deterministic (since the three contexts share no nodes in 2Reg($K_{3,3}$)). There are six perfect matchings of $K_{3,3}$, hence 6 instances of MISC(3), all of which are irreducible. See Fig. 10. Each of these irrMISC(3)s yields a noncontextuality inequality (again, assuming uniform $q$ here): The noncontextuality inequality of Ref. [6] can then be obtained by coarse-graining these irrMISC(3) inequalities. Note that the upper bounds on the average correlation corresponding to irrMISC(3)s and irrMISC(4)s were first obtained in Ref. [28] via an implementation of Fourier-Motzkin elimination. 11 On the other hand, our derivation hinges on a conceptual insight, based on the mapping 2Reg(.) and Theorem 1, that clarifies why we expect the average source-measurement correlation of particular sets of contexts (rather than arbitrary sets of contexts) in these noncontextuality inequalities to be bounded away from 1. It boils down to identifying MISCs and irrMISCs in a contextuality scenario. Indeed, as we now show, our understanding lets us obtain previously undiscovered noncontextuality inequalities in other KS-uncolourable contextuality scenarios.
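As a cross-check of the conceptual argument, the weighted max-predictability for uniform $q$ can also be computed by brute force on 2Reg($K_{3,3}$). The Python sketch below is illustrative only and rests on two assumptions: that contexts of 2Reg(G) correspond to edges of G and nodes to pairs of edges sharing a vertex, and that (by Theorem 6) it suffices to search over models with probabilities in {0, 1/2, 1}. Since the average max-probability is convex in the probabilistic model, maximizing over all such half-integral models gives the same value as maximizing over the extremal ones. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def two_reg(m, n):
    """Assumed 2Reg(K_{m,n}): contexts = edges, nodes = adjacent edge pairs."""
    edges = [(i, j) for i in range(1, m + 1) for j in range(1, n + 1)]
    nodes = [pair for pair in combinations(edges, 2)
             if pair[0][0] == pair[1][0] or pair[0][1] == pair[1][1]]
    return edges, nodes


def half_integral_models(edges, nodes):
    """Yield node assignments in units of 1/2 with every context summing to 1."""
    last = {}
    for idx, w in enumerate(nodes):
        for e in w:
            last[e] = idx  # last node index touching context e
    total = {e: 0 for e in edges}
    x = [0] * len(nodes)

    def extend(idx):
        if idx == len(nodes):
            yield tuple(x)
            return
        a, b = nodes[idx]
        for value in (0, 1, 2):  # 0, 1/2, 1 in units of 1/2
            if total[a] + value > 2 or total[b] + value > 2:
                break
            total[a] += value
            total[b] += value
            x[idx] = value
            # A context is checked once its last node has been assigned.
            if all(total[e] == 2 for e in (a, b) if last[e] == idx):
                yield from extend(idx + 1)
            total[a] -= value
            total[b] -= value
        x[idx] = 0

    yield from extend(0)


def beta_uniform(context_set, nodes, models):
    """Max over models of the uniform average max-probability on context_set."""
    best = Fraction(0)
    for x in models:
        max_half = {e: 0 for e in context_set}
        for value, w in zip(x, nodes):
            for e in w:
                if e in max_half and value > max_half[e]:
                    max_half[e] = value
        best = max(best, Fraction(sum(max_half.values()), 2 * len(context_set)))
    return best


edges, nodes = two_reg(3, 3)
models = list(half_integral_models(edges, nodes))
beta_matching = beta_uniform([(1, 1), (2, 2), (3, 3)], nodes, models)
beta_misc4 = beta_uniform([(3, 1), (2, 1), (1, 2), (1, 3)], nodes, models)
beta_all = beta_uniform(edges, nodes, models)
```

Under these assumptions the perfect matching {(11), (22), (33)} and the full set of nine contexts both give 5/6, while the four-context set {(31), (21), (12), (13)} gives 7/8.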

2Reg(K 1,7 )
Since every edge is connected to every other edge in $K_{1,7}$ and vertex 1 is the origin of all hypercycles, we have that each choice of a set of 3 contexts in 2Reg($K_{1,7}$) will form a 3-hypercycle. Extremal probabilistic models induced by subscenarios containing these 3-hypercycles can make at most the remaining 4 contexts deterministic. Indeed, taking out 3 edges from $K_{1,7}$ yields $K_{1,4}$ as a remnant and 2Reg($K_{1,4}$) admits only deterministic extremal probabilistic models.
Each MISC of size $c$ would require a set of $c$ contexts such that no more than $c - 1$ of them can be made deterministic in any extremal probabilistic model on 2Reg($K_{1,7}$). We know that every choice of a set of 4 contexts in 2Reg($K_{1,7}$) can be made deterministic by some extremal probabilistic model since every such choice is in one-to-one correspondence with a choice of a 3-hypercycle (consisting of the remaining 3 contexts) inducing such an extremal probabilistic model on 2Reg($K_{1,7}$): we have $\binom{7}{4} = \binom{7}{3} = 35$ such choices. Hence a set of contexts of size $\leq 4$ can never form a MISC: there will always exist an extremal probabilistic model which will make all of the contexts in the set deterministic. All irrMISCs are therefore of size $c = 5$ in 2Reg($K_{1,7}$) and every set of 5 contexts ($n - k + 1 = 7 - 3 + 1 = 5$) forms an irrMISC(5). See Fig. 11.

2Reg(K 1,n ), for odd n > 1
Since every edge is connected to every other edge in $K_{1,n}$, each triple of contexts in 2Reg($K_{1,n}$) will form a 3-hypercycle. Extremal probabilistic models induced by subscenarios containing these 3-hypercycles can at most make the remaining $n - 3$ contexts deterministic. Indeed, taking out 3 edges from $K_{1,n}$ yields $K_{1,n-3}$ as a remnant and 2Reg($K_{1,n-3}$) does admit deterministic extremal probabilistic models (from Theorem 2, since $n - 3$ is even for any odd $n > 1$). Each MISC of size $c$ would require a set of $c$ contexts such that no more than $c - 1$ of them can be made deterministic in any extremal probabilistic model on 2Reg($K_{1,n}$). We know that every choice of a set of $n - 3$ contexts in 2Reg($K_{1,n}$) can be made deterministic by some extremal probabilistic model since every such choice is in one-to-one correspondence with a choice of a 3-hypercycle (consisting of the remaining 3 contexts) inducing such an extremal probabilistic model on 2Reg($K_{1,n}$): $\binom{n}{n-3} = \binom{n}{3} = \frac{n!}{3!(n-3)!}$. Hence a set of contexts of size $\leq n - 3$ can never form a MISC: there will always exist an extremal probabilistic model which will make all of the contexts in the set deterministic. All irrMISCs are therefore of size $n - 2$ in 2Reg($K_{1,n}$) and every set of $n - 2$ contexts forms an irrMISC($n - 2$). Clearly, the sufficient condition for a set of contexts to be a MISC that we identified in Sec. 6.2.1 is also necessary for contextuality scenarios of the type 2Reg($K_{1,n}$) for odd $n \geq 3$.
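The counting argument for 2Reg($K_{1,n}$) can be replayed mechanically. The Python sketch below takes as its only input the claim made above, that the largest sets of contexts made deterministic by extremal models are the complements of triples of contexts; it then identifies MISCs as the sets not contained in any such deterministic set, and irrMISCs as the MISCs with no proper sub-MISC. The function name and the labelling of contexts by integers 1, ..., n are hypothetical.

```python
from itertools import combinations
from math import comb


def irr_miscs_star(n):
    """irrMISCs of 2Reg(K_{1,n}) (odd n), assuming the text's argument that
    each triple of contexts forms a 3-hypercycle whose extremal model makes
    exactly the complementary n - 3 contexts deterministic."""
    contexts = frozenset(range(1, n + 1))
    deterministic_sets = [contexts - frozenset(t)
                          for t in combinations(sorted(contexts), 3)]

    def is_misc(s):
        # A MISC cannot be made fully deterministic by any extremal model.
        return not any(s <= d for d in deterministic_sets)

    miscs = [frozenset(s)
             for size in range(1, n + 1)
             for s in combinations(sorted(contexts), size)
             if is_misc(frozenset(s))]
    irreducible = [s for s in miscs if not any(t < s for t in miscs)]
    return deterministic_sets, irreducible


det_sets_7, irr_7 = irr_miscs_star(7)
sizes_7 = {len(s) for s in irr_7}
```

For $n = 7$ this reproduces the 35 deterministic four-context sets and shows that the irrMISCs are exactly the five-context sets.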
3. All the contexts (or each irrMISC($n - 2$) and 2 indeterministic contexts): There is one such inequality. 12

6.2.6 2Reg(K m,n ), for odd mn > 1
We now extend the derivation of irrMISC noncontextuality inequalities above to the case of all KS-uncolourable 2-regular scenarios, 2Reg($K_{m,n}$), obtained from arbitrary complete bipartite graphs $K_{m,n}$. These are just those $K_{m,n}$ with odd $mn > 1$ (from Theorem 2): $K_{3,3}$ and $K_{1,n}$ (odd $n > 1$) are special cases of these, so the recipe for noncontextuality inequalities obtained here will recover the noncontextuality inequalities we have already obtained. Obtaining these noncontextuality inequalities entails two things: identifying all the irrMISCs in the contextuality scenario and calculating their upper bounds due to noncontextuality, i.e., $\beta(\Gamma, q)$. Since these are 2-regular scenarios, calculating the upper bounds is easy (due to Theorem 6).
Before we proceed with the general result, we need the following definitions:
Edge cover: An edge cover of a graph is a set of its edges such that every vertex of the graph belongs to at least one of the edges in this set.
Minimum edge cover: An edge cover of the smallest possible size is called a minimum edge cover.
Minimal edge cover: An edge cover such that no proper subset of it is an edge cover is called a minimal edge cover.
Every minimum edge cover is minimal, but not conversely. For example, $K_{3,3}$ has minimal edge covers of size 4, whereas its minimum edge covers have size 3. Below, we prove some properties of minimal edge covers of $K_{m,n}$ before moving on to a characterization of irrMISCs in $K_{m,n}$.
Proof. For a given $K_{m,n}$, let's call the set of $m$ vertices $S_m$ and the set of $n$ vertices $S_n$. The size, $N$, of an edge cover of $K_{m,n}$ must satisfy $N = \frac{1}{2}\sum_{v} \deg(v)$, where $v$ denotes a vertex of $K_{m,n}$ and $\deg(v)$ denotes the degree of the vertex, i.e., the number of edges of the edge cover in which it appears. Note that two vertices connected by an edge in a minimal edge cover cannot both have degree $> 1$: if an edge cover has a pair of vertices of degree 2 or more connected by an edge, then the edge cover cannot be minimal since the said connecting edge can be dropped while maintaining the edge cover property. (See Fig. 12.) This means that a minimal edge cover of $K_{m,n}$ is such that any vertex of degree 2 or more is only connected to degree 1 vertices, hence the minimal edge cover is a disjoint union of connected bipartite subgraphs of type $K_{1,b}$ or $K_{a,1}$, where $a \leq m$, $b \leq n$. Denoting the set of vertices of each subgraph by $V_i$, we have that the number of edges in such a subgraph is $|V_i| - 1$. Thus, the total number of edges in a minimal edge cover is $N = \sum_{i=1}^{\kappa} (|V_i| - 1) = m + n - \kappa$, where $\kappa$ is the number of disjoint subgraphs whose union yields the minimal edge cover. Clearly, $1 \leq \kappa \leq \min\{m, n\}$, where $\kappa = 1$ corresponds to the case of any $K_{m,n}$ graph with $m = 1$ or $n = 1$ since it is its own minimal edge cover and we have $N = m + n - 1$.
For any other $K_{m,n}$ (with $m, n \geq 2$), we have that $\kappa \geq 2$ and the maximum size of a minimal edge cover is $m + n - 2$: this is achieved when a vertex $v_{\min} \in S_{\min\{m,n\}}$ is connected to all but one (say, $v_{\max}$) of the vertices in $S_{\max\{m,n\}}$. The remaining vertex $v_{\max} \in S_{\max\{m,n\}}$ is then connected to all vertices of $S_{\min\{m,n\}}$ except $v_{\min} \in S_{\min\{m,n\}}$. We then have $N = m + n - 2$.
The total number of minimum edge covers can be computed as follows: every vertex in $S_{\min\{m,n\}}$ is connected one-to-one via an edge to a vertex in $S_{\max\{m,n\}}$ and there are $\frac{\max\{m,n\}!}{|m-n|!}$ possible ways to do this. For each such way, each of the remaining $|m - n|$ vertices in $S_{\max\{m,n\}}$ can be connected via an edge to one of the $\min\{m, n\}$ vertices of $S_{\min\{m,n\}}$, so there are $(\min\{m, n\})^{|m-n|}$ possible configurations for the remaining edges. This yields a total of $\frac{\max\{m,n\}!}{|m-n|!} (\min\{m, n\})^{|m-n|}$ minimum edge covers for $K_{m,n}$.

We leave the general case as an open question:
What is the number of minimal-but-not-minimum edge covers for an arbitrary complete bipartite graph $K_{m,n}$, where $m, n \geq 2$?
2. for $m \geq 3$ and $n \geq 3$: the corresponding set of edges in $K_{m,n}$ is an edge cover of $K_{m,n}$.
Proof. Case 1, i.e., $m = 1$ or $n = 1$ (and odd $mn > 1$): In this case, we have that every choice of 3 edges in $K_{m,n}$ is a claw and therefore induces a 3-hypercycle extremal probabilistic model on 2Reg($K_{m,n}$) that renders all the remaining ($mn - 3$) contexts in 2Reg($K_{m,n}$) deterministic. To form a MISC, then, requires at least $mn - 3 + 1 = mn - 2$ contexts. (See Fig. 11 for a $K_{1,7}$ example.)
Figure 12: Why two vertices connected by an edge in a minimal edge cover of a bipartite graph cannot both have degree greater than 1.
Case 2, i.e., $m \geq 3$ and $n \geq 3$ (and odd $mn > 1$):
Every MISC of 2Reg($K_{m,n}$) corresponds to an edge cover of $K_{m,n}$: We show this by proving the contrapositive. If a set of edges is not an edge cover of $K_{m,n}$, then there exists a vertex in $K_{m,n}$ (not covered by the set of edges) that can support a claw (subgraph $K_{1,3}$ of $K_{m,n}$) which corresponds to a 3-hypercycle in 2Reg($K_{m,n}$), odd $mn > 1$. All the contexts in the corresponding set of contexts can then be made deterministic relative to an extremal probabilistic model induced by this 3-hypercycle (cf. Theorems 6 and 7), hence the set cannot be a MISC. (See Fig. 7.)
Every edge cover of $K_{m,n}$ corresponds to a MISC of 2Reg($K_{m,n}$): If a set of edges forms an edge cover of $K_{m,n}$, then there does not exist any vertex in $K_{m,n}$ that can support a claw disjoint from the set of edges. Hence, it is not possible to find a 3-hypercycle extremal probabilistic model on 2Reg($K_{m,n}$) that makes all the contexts in the set deterministic: at least one of the contexts must belong to a 3-hypercycle in any extremal probabilistic model induced by such a hypercycle. The set of contexts must therefore be a MISC.
13 Recall that we need to restrict ourselves to 3-hypercycle extremal probabilistic models to identify MISCs in 2Reg($K_{m,n}$) scenarios: any bigger odd hypercycles would make even fewer contexts deterministic than a 3-hypercycle extremal probabilistic model and lead us to an artificially lower bound on the noncontextuality inequality; we have to give a noncontextual model as much leeway as mathematically possible to reproduce perfect predictability and thus find an upper bound that cannot be exceeded by any noncontextual ontological model, not merely those using extremal probabilistic models induced by 5 or higher odd hypercycles.
2. for $m \geq 3$ and $n \geq 3$: the corresponding set of edges in $K_{m,n}$ is a minimal edge cover of $K_{m,n}$, i.e., an edge cover such that none of its proper subsets is an edge cover.
Proof. This just follows from noting the definition of an irrMISC and Theorem 9: an irrMISC is a MISC that does not contain another MISC as a proper subset.
Hence, a minimum edge cover always corresponds to an irrMISC, but not conversely. The 3-context irrMISCs of 2Reg(K 3,3 ) form minimum edge covers (of size 3) but the 4-context irrMISCs of 2Reg(K 3,3 ) don't form minimum edge covers. Thus, the smallest irrMISCs correspond to minimum edge covers, which for K m,n (odd mn > 1) are of size max{m, n} (cf. Theorem 8). See Figs. 9 and 10.
When $m = n$, the smallest irrMISCs correspond to the perfect matchings 14 of $K_{m,m}$ and thus there are $m!$ such irrMISCs. Recall that for $K_{3,3}$, there are 6 such irrMISCs. The other irrMISCs are minimal edge covers that are not minimum. The number of these minimal-but-not-minimum edge covers is 9 for $K_{3,3}$: every minimal-but-not-minimum edge cover of $K_{3,3}$ should contain at least one vertex of degree 2 because no vertex can be degree 3 if the edge cover is minimal and the edge cover would be a minimum edge cover if all vertices are degree 1; there are three possible choices for a degree 2 vertex among {1, 2, 3} and for each such choice there are three possible choices of pairs of vertices connected to it given by {{1,2}, {2,3}, {1,3}}; choosing these fixes a minimal-but-not-minimum edge cover and we therefore have $3 \times 3 = 9$ irrMISCs arising from these edge covers. Note that no larger irrMISCs exist for $K_{3,3}$ since $N \leq m + n - 2 = 4$. In Fig. 14, we illustrate minimum and minimal-but-not-minimum edge covers for some classes of complete bipartite graphs.
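The counts of minimal edge covers quoted above are small enough to confirm by exhaustive enumeration. The Python sketch below is a plain brute-force search over edge subsets of $K_{m,n}$ (so it is only practical for small $m$ and $n$); the vertex labels and the function name are hypothetical. It uses the fact that an edge cover is minimal exactly when removing any single edge destroys the cover property.

```python
from collections import Counter
from itertools import combinations


def minimal_edge_covers(m, n):
    """Enumerate all minimal edge covers of K_{m,n} by brute force."""
    left = [("a", i) for i in range(m)]
    right = [("b", j) for j in range(n)]
    edges = [(u, v) for u in left for v in right]
    vertices = set(left) | set(right)

    def covers(edge_set):
        return {x for e in edge_set for x in e} == vertices

    found = []
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            # Minimal: a cover that stops covering when any one edge is dropped.
            if covers(subset) and not any(
                    covers(subset[:k] + subset[k + 1:]) for k in range(size)):
                found.append(subset)
    return found


sizes_k33 = Counter(len(c) for c in minimal_edge_covers(3, 3))
sizes_k15 = Counter(len(c) for c in minimal_edge_covers(1, 5))
```

For $K_{3,3}$ the enumeration finds 6 minimal edge covers of size 3 (the perfect matchings) and 9 of size 4, and none larger, in line with $N \leq m + n - 2$; for $K_{1,5}$ the only minimal edge cover is the graph itself, of size $m + n - 1 = 5$.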
7 Can these noise-robust noncontextuality inequalities be saturated?
Yes. For each noise-robust noncontextuality inequality obtained from a KS-uncolourable contextuality scenario, given by Corr_q ≤ β(Γ, q), we can construct a noncontextual ontological model that saturates it. Such a model obviously also satisfies (if not saturates) any other noise-robust noncontextuality inequality. We provide a construction below, following an approach similar to that of Refs. [10,14]:
1. First, we write Corr_q in terms of an ontological model as
We do this for every S_i, i ∈ {1, 2, . . . , n}, so that we now have
Note that in all this, the response functions satisfy the assumption of measurement noncontextuality.
3. Next, we choose ν(λ) in such a way that the inequality is saturated: in particular, we take ν(λ) to be supported entirely over the set Λ_max, so that ν(λ) = 0 for all λ ∈ Λ\Λ_max. We then have Corr_q = β(Γ, q), and the noise-robust noncontextuality inequality is thus saturated.
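The saturation step can be illustrated numerically. The sketch below rests on two assumptions that are ours rather than statements from the text: that the ontic state space is finite, and that the bound equals the largest value that the q-weighted per-ontic-state quantity takes over ontic states, with Λ_max being the set of ontic states attaining it. The dictionary `score`, its keys, and its values are hypothetical placeholders and are not derived from any scenario in this paper.

```python
from fractions import Fraction


def saturating_distribution(score):
    """Return (beta, nu): the maximum of `score` and a uniform distribution
    nu supported only on the ontic states attaining that maximum."""
    beta = max(score.values())
    support = [lam for lam, value in score.items() if value == beta]
    weight = Fraction(1, len(support))
    # nu vanishes outside the maximizing set (our stand-in for Lambda_max).
    nu = {lam: (weight if lam in support else Fraction(0)) for lam in score}
    return beta, nu


# Hypothetical per-ontic-state values (placeholders, not from the paper).
score = {
    "lam1": Fraction(2, 3),
    "lam2": Fraction(5, 6),
    "lam3": Fraction(5, 6),
    "lam4": Fraction(1, 2),
}

beta, nu = saturating_distribution(score)
# Averaging the per-ontic-state values under nu recovers the maximum exactly.
corr = sum(nu[lam] * score[lam] for lam in score)
print(beta, corr)
```

Any distribution supported on the maximizing set would do equally well; the uniform choice is made only for simplicity.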

8 Discussion and future work
To summarize, we have presented a framework for noise-robust noncontextuality inequalities that are inspired by logical proofs of the Kochen-Specker theorem. We have identified special sets of these inequalities, corresponding to irreducible minimally indeterministic sets of contexts (or irrMISCs), that are independent of each other and can generate any other noise-robust noncontextuality inequality corresponding to a minimally indeterministic set of contexts (MISC) or even any other (non-MISC) set of contexts. The basic building blocks of any noise-robust noncontextuality inequality obtained in this framework are the noise-robust noncontextuality inequalities for irrMISCs. Along the way, we also obtained a parameterization of contextuality scenarios and identified ways to detect their KS-uncolourability.
Note that the contextuality scenarios we have considered within this framework are all required to be KS-uncolourable, and this is the only restriction on them. 15 In particular, we do not insist that they admit a realization with KS sets of projectors in quantum theory. When they do admit such a realization, quantum theory can violate any noise-robust noncontextuality inequality by achieving Corr_q = 1 in the ideal noiseless limit. When they do not admit a realization with KS sets, it remains an open question whether noise-robust noncontextuality inequalities of the type Corr_q ≤ β(Γ, q) might still admit a quantum violation using nonprojective positive operator-valued measures (POVMs).
We conclude with some directions for future work building on this framework.

1. POVM realizations when no KS sets exist:
Do there exist KS-uncolourable contextuality scenarios that do not admit KS sets but allow for a realization with POVMs in such a way that a noise-robust noncontextuality inequality can be violated? If so, construct some examples and determine the optimal POVM realization for them.
Note that since POVMs can realize arbitrary joint measurability structures in quantum theory [29], it is conceivable that some such realizations could work for violating noise-robust noncontextuality inequalities of the type obtained in this paper even when the KS-uncolourable contextuality scenario concerned does not admit a realization with KS sets.

2. Lower-dimensional POVM realizations when KS sets exist:
For KS-uncolourable contextuality scenarios that do admit KS sets, find a realization with POVMs that works on a Hilbert space of lower dimension than the KS set realization and still violates a noise-robust noncontextuality inequality. Is it possible, for example, to realize the 18-ray scenario [19] (Fig. 2) with four-outcome qubit POVMs such that an irrMISC noncontextuality inequality is violated?
3. Other KS constructions and extremal probabilistic models on their contextuality scenarios:
We have only considered KS-uncolourable contextuality scenarios of type 2Reg(K_{m,n}) in detail in this paper. It would be interesting to analyze the original Kochen-Specker construction [1] and others that do not fall within the family of contextuality scenarios we have considered here. In particular, the parameterization of KS-uncolourable scenarios in Section 4 should be useful in attempting such analyses. Note that our analysis in this paper relied heavily on the characterization of extremal probabilistic models on 2Reg(K_{m,n}) contextuality scenarios given by Theorem 6. More generally, it relied on the characterization of extremal probabilistic models on arbitrary contextuality scenarios given by AFLS [24] (see Theorem 1). The characterization of Theorem 1 will be useful in attempting to study contextuality scenarios that do not fall under the purview of Theorem 6. Indeed, it would be worthwhile to obtain a characterization of contextuality scenarios that admit unique probabilistic models, since these are the ones that induce extremal probabilistic models on any contextuality scenario following Theorem 1.

4. Applications to quantum information protocols:
Given that the framework we have proposed leverages KS-uncolourability to provide noise-robust operational signatures of contextuality, there is good reason to expect that it might be relevant for quantum information tasks. In particular, it could be used to study how logical proofs of the KS theorem can be leveraged to provide advantages in (possibly some variant of) state discrimination à la Ref. [30] (which does not use contextuality arising from Kochen-Specker proofs). Specifically, the task of maximizing the average source-measurement correlation (the quantity Corr_q) could possibly be related to minimum error state discrimination as follows: for any d-uniform contextuality scenario with n contexts, we consider n ensembles of states (each denoted by source setting S_i) such that the average preparation procedures associated with them ([ |S_i]) are all operationally equivalent, and we apply the assumption of preparation noncontextuality relative to this operational equivalence. Our task then is to discriminate between elements of each ensemble under an additional constraint on the set of allowed measurements, namely, that they satisfy the operational equivalences required for KS-uncolourability, and we apply measurement noncontextuality relative to these operational equivalences. Maximizing Corr_q then corresponds to maximizing the average success probability of discrimination given by Corr_q for the set of ensembles in the support of probability distribution q (defined over the set of n contexts). If the value of Corr_q exceeds the bound from our noise-robust noncontextuality inequality, we then have that the operational theory allows a greater success probability for minimum error state discrimination than a theory which admits a noncontextual ontological model.
Note also that the possibility of realizing violations of our noise-robust noncontextuality inequalities using POVMs on lower-dimensional systems (than the ones on which KS sets exist) also makes it interesting to study this problem from such a perspective.
15 The case of KS-colourable contextuality scenarios that fit within a generalization of the CSW framework [23] was considered in Ref. [14]. These KS-colourable scenarios satisfy the property that the set of probabilistic models satisfying consistent exclusivity on them coincides with the set of general probabilistic models on them. The case of KS-colourable contextuality scenarios where this property fails (and which are therefore outside the purview of Refs. [14,23]) will be taken up in future work.