Contextual advantage for state-dependent cloning

A number of noncontextual models exist which reproduce different subsets of quantum theory and admit a no-cloning theorem. Therefore, if one chooses noncontextuality as one's notion of classicality, no-cloning cannot be regarded as a nonclassical phenomenon. In this work, however, we show that there are aspects of the phenomenology of quantum state cloning which are indeed nonclassical according to this principle. Specifically, we focus on the task of state-dependent cloning and prove that the optimal cloning fidelity predicted by quantum theory cannot be explained by any noncontextual model. We derive a noise-robust noncontextuality inequality whose violation by quantum theory not only implies a quantum advantage for the task of state-dependent cloning relative to noncontextual models, but also provides an experimental witness of contextuality.
An important guiding principle for quantum theorists is the identification of genuine nonclassical effects certified by rigorous theorems. Given a quantum phenomenon, the relevant question is: Are there classical models able to reproduce the observed operational data? Here we investigate this question in the context of a cloning experiment.
The no-cloning theorem [1][2][3] is widely regarded as a central result in quantum theory. Informally, the theorem states the impossibility of copying quantum information, in contrast with classical information, which can be perfectly copied. More precisely, there is no machine (formally, a quantum channel) that can take two distinct and nonorthogonal states $\{|\psi_1\rangle, |\psi_2\rangle\}$ sent at random as inputs and output the corresponding copies $\{|\psi_1\rangle \otimes |\psi_1\rangle, |\psi_2\rangle \otimes |\psi_2\rangle\}$ [4].
While no-cloning is often regarded as an intrinsically quantum feature, one would like to back that claim by a precise theorem stating what operational features cannot be explained within classical models. The theorem should hence define a precise notion of 'classicality' and show that such notion leads to operational predictions incompatible with the relevant quantum statistics [5]. At the operational level, we can schematically think of an experiment as a set of black-boxes each corresponding to certain sets of operational instructions. At the ontological level we look for theoretical explanations of the empirical data within the framework of ontological models. This is a very broad class of models involving an arbitrary set of physical states evolving according to some laws and ultimately determining (the probabilities of) the measurement outcomes. This analysis forces us to look for any plausible alternative explanation of the empirical data collected in a quantum experiment before we certify it as "nonclassical". But, which ontological models should be deemed "classical"?
Clearly, the broader the chosen notion of classicality is, the stronger the resulting no-go theorem is. Since the scenario of quantum cloning does not feature space-like separated measurements, we need a notion of 'classicality' different from the ubiquitous Bell locality. Hence, in this work we identify nonclassical features as those that cannot be explained within any noncontextual model, in the generalized sense introduced in Ref. [6]. It is a known fact that, with respect to this broad notion, no-cloning by itself should not be regarded as a nonclassical phenomenon. There are, in fact, several examples of noncontextual models for subsets of quantum theory with a no-cloning theorem [7,8]. The mechanism behind no-cloning in noncontextual theories is simple: nonorthogonal quantum states $|\psi_1\rangle$, $|\psi_2\rangle$ correspond to overlapping probability distributions $\mu_1(\lambda)$, $\mu_2(\lambda)$ over the posited set of physical states $\lambda$, and there is neither a deterministic nor a stochastic process mapping $\{\mu_1, \mu_2\}$ to $\{\mu_1 \otimes \mu_1, \mu_2 \otimes \mu_2\}$ [9]. The existence of these models proves that no-cloning cannot be interpreted as a nonclassical phenomenon when the notion of classicality is taken to be that of noncontextuality. Hence, we need to look more closely at the phenomenology of quantum cloning if we are to identify aspects of it that are nonclassical according to the principle of noncontextuality.
In this work, we identify a strongly nonclassical aspect in the ultimate limits of imperfect cloning. The question of the best fidelity with which a given set of quantum states can be cloned has been widely studied since the pivotal work of Bužek and Hillery in 1996 [10] (for a review on quantum cloning, see, e.g., Ref. [11]). We find that the optimal fidelity predicted by quantum theory for the cloning of two distinct, nonorthogonal pure states cannot be reproduced by any noncontextual model which complies with the operational phenomenology featured in a quantum cloning experiment. Specifically, contextuality provides an advantage for the maximum copying fidelity. Our result directly links contextuality to a quantum advantage [5,12,13].

Noncontextual ontological models of operational theories
At the operational level, we can schematically think of an experiment as a set of black-boxes, each corresponding to certain sets of operational instructions (see footnote 2). We can distinguish three kinds of black-boxes:
1. A preparation black-box $P_s$ initialises the system;
2. A transformation black-box $T$ takes in a system prepared according to $P_s$ and transforms it into some new preparation, denoted $T(P_s)$;
3. A measurement black-box $M_s$ takes a preparation $P_s$ as input and returns an outcome $x$ with probability $p(x|P_s, M_s)$.
An experiment consists of collecting the statistics $p(x|T(P_s), M_s)$ for various choices of the black boxes $P_s$, $T$ and $M_s$.
The set of $P_s$, $T$, $M_s$ and the corresponding observed statistics $p(x|T(P_s), M_s)$ are the defining elements of an operational theory. Noncontextuality is a restriction on the ontological models that try to explain the statistics of some operational theory. An ontological model for an operational theory is one which [14]:
1. Makes every preparation $P_s$ correspond to sampling from a probability distribution $\mu_s(\lambda)$ over some set of ontic variables $\lambda$. The $\lambda$s are referred to as 'hidden variables' in the context of Bell nonlocality and they form a (measurable) set $\Lambda$.
2. Represents a transformation $T$ by a stochastic matrix $T(\lambda'|\lambda)$ giving the probability that the ontic state $\lambda$ is mapped to $\lambda'$ ($T(\lambda'|\lambda) \geq 0$, $\int d\lambda' \, T(\lambda'|\lambda) = 1 \ \forall \lambda$).
3. Represents a measurement $M_s$ by a response function $\xi_s(x|\lambda)$ giving the probability of outcome $x$ given that the hidden variable takes the value $\lambda$ ($\xi_s(x|\lambda) \geq 0$, $\sum_x \xi_s(x|\lambda) = 1 \ \forall \lambda$).
Footnote 2: While empirical data is always to some degree theory-laden, the word "operational" here signifies that we are striving towards the ideal of the most low-level instructions we can imagine (e.g. press this button, write down an outcome when a corresponding light flashes etc.). This is to be opposed to high-level instructions that refer to theoretical entities, such as "lower the potential barrier in which the electron is trapped".

Figure 1: Cloning experiment. Top: black-box of the cloning protocol; one of two preparation procedures $P_x$, $x = a, b$, is performed with equal probability, the resultant state is sent through a cloning machine (independent of $x$), which respectively prepares $P_\gamma$, $\gamma = \alpha, \beta$; a test measurement $M_{xx}$ for the target preparation $P_{xx}$ is performed and passed with probability $p(M_{aa}|P_\alpha)$ (or $p(M_{bb}|P_\beta)$). Bottom: ontological description of the same experiment, where preparing $P_x$ corresponds to sampling $\lambda$ with probability $\mu_x(\lambda)$, the cloning machine maps $\lambda \to \lambda'$ with probability $T(\lambda'|\lambda)$ and $M_{xx}$ gives a 'pass' outcome with probability $\xi_{xx}(1|\lambda')$.
An ontological model then defines its predictions as
$$p(x|T(P_s), M_s) = \int d\lambda \, d\lambda' \, \mu_s(\lambda) \, T(\lambda'|\lambda) \, \xi_s(x|\lambda'). \qquad (1)$$
Two operational procedures (be they preparations, measurements or transformations) are said to be operationally equivalent if they cannot be distinguished by any experiment. Noncontextuality, in the generalized form introduced in [6], is a restriction on ontological models requiring that if two procedures are operationally equivalent, they must be represented by the same object in the ontological model. This notion can be seen as an extension of the traditional one of Kochen-Specker [6,15].
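The prediction rule of an ontological model can be illustrated on a finite ontic space, where the integrals become sums. The following is a minimal sketch under that assumption; the three-point ontic space, all numbers, and the function name `predict` are hypothetical and chosen only for illustration.

```python
import numpy as np

def predict(mu, T, xi):
    """Outcome probabilities p(x) = sum_{lam, lam'} mu[lam] * T[lam', lam] * xi[x, lam'].

    mu : preparation distribution over the ontic states (sums to 1)
    T  : stochastic matrix, T[lam', lam] = probability of lam -> lam' (columns sum to 1)
    xi : response functions, xi[x, lam'] = probability of outcome x given lam'
    """
    mu_out = T @ mu      # distribution over ontic states after the transformation
    return xi @ mu_out   # probability of each outcome

# Hypothetical three-point ontic space.
mu = np.array([0.5, 0.5, 0.0])
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
xi = np.array([[1.0, 1.0, 0.0],   # first outcome
               [0.0, 0.0, 1.0]])  # second outcome

probs = predict(mu, T, xi)
```

Because the rule is linear in `mu`, a convex mixture of preparations is represented by the same convex mixture of distributions, which is the property used when noncontextuality is applied to operationally equivalent mixtures.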
In this work we will be concerned with operational equivalences only at the level of preparations. Two preparations $P_s$ and $P_{s'}$ are operationally equivalent if they cannot be distinguished by any measurement:
$$p(x|P_s, M) = p(x|P_{s'}, M) \quad \forall M,$$
which, for short, we will denote by $P_s \simeq P_{s'}$. The assumption of (preparation) noncontextuality is then
$$P_s \simeq P_{s'} \ \Rightarrow \ \mu_s(\lambda) = \mu_{s'}(\lambda). \qquad (2)$$
This principle can be understood as an 'identity of the indiscernibles' and, together with locality, it can be seen as a successful methodological principle for theory construction [16]. Examples of noncontextual ontological models include classical Hamiltonian mechanics, Hamiltonian mechanics with a resolution limit on phase space [7] and Spekkens' toy model [8].

Operational features of quantum cloning: ideal scenario
We now describe the operational features of optimal state-dependent quantum cloning which, as we will show, are impossible to explain with noncontextual models (see also Fig. 1). We will make the assumption that certain perfect correlations are observed, but we will later remove these idealizations. For all two-outcome measurements $M_s$ we will use the shortcut
$$p(M_s|P_{s'}) := p(x=1|P_{s'}, M_s).$$
Let $P_a$ and $P_b$ denote the experimental procedures followed to prepare the states $|a\rangle$ and $|b\rangle$ to be cloned. As an operational signature of the fact that $|a\rangle$ and $|b\rangle$ are two pure and, in general, nonorthogonal states, we consider the 'test measurements' $M_a$, $M_b$, with outcomes $x \in \{0, 1\}$, giving the operational statistics
$$p(M_a|P_a) = 1, \qquad p(M_b|P_b) = 1.$$
In the quantum formalism, these statistics are reproduced by performing the projective measurements $\{|a\rangle\langle a|, \mathbb{1} - |a\rangle\langle a|\}$ and $\{|b\rangle\langle b|, \mathbb{1} - |b\rangle\langle b|\}$ (with $x = 1$ corresponding to the first outcome). We will use the notation $c_{ab} := p(M_b|P_a)$, which is called 'confusability' in Ref. [5], for the probability of observing the first outcome of the $M_b$ measurement when the system is initialized according to $P_a$. Clearly, in the ideal quantum experiment one observes $c_{ab} = |\langle a|b\rangle|^2$.
The two preparations $P_a$, $P_b$ go through a cloning machine $T$, which outputs new preparations $P_\alpha = T(P_a)$, $P_\beta = T(P_b)$. In quantum theory, the optimal state-dependent cloning operation is a unitary $U$ and, hence, the preparations $P_\alpha$ and $P_\beta$ correspond to pure states $|\alpha\rangle := U|a\rangle|0\rangle$, $|\beta\rangle := U|b\rangle|0\rangle$ respectively, with $|0\rangle$ the initial state of some ancillary register. Operationally, and similarly to the discussion above, the purity of the outputs implies that we can perform test measurements $M_\alpha$, $M_\beta$ satisfying $p(M_\alpha|P_\alpha) = 1$, $p(M_\beta|P_\beta) = 1$ (again, by performing the measurements described in the quantum formalism as $\{|\alpha\rangle\langle\alpha|, \mathbb{1} - |\alpha\rangle\langle\alpha|\}$ and $\{|\beta\rangle\langle\beta|, \mathbb{1} - |\beta\rangle\langle\beta|\}$).
The experiment ends by testing the fidelity between the output and the ideal clone. To do so, given the ideal clones $P_{aa}$, $P_{bb}$ we introduce test measurements $M_{aa}$, $M_{bb}$ and assume one observes $p(M_{aa}|P_{aa}) = 1$, $p(M_{bb}|P_{bb}) = 1$. In a quantum experiment this is realized by preparing states $|aa\rangle$, $|bb\rangle$ and performing the projective measurements $\{|aa\rangle\langle aa|, \mathbb{1} - |aa\rangle\langle aa|\}$, $\{|bb\rangle\langle bb|, \mathbb{1} - |bb\rangle\langle bb|\}$.
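Then, denoting by $c_{\alpha,aa} := p(M_{aa}|P_\alpha)$, $c_{\beta,bb} := p(M_{bb}|P_\beta)$, the (global) cloning fidelity is operationally defined to be
$$F_g := \frac{1}{2} c_{\alpha,aa} + \frac{1}{2} c_{\beta,bb},$$
i.e., the average probability that the imperfect clones $P_\alpha$ and $P_\beta$ pass the corresponding test measurements for the ideal clones, $M_{aa}$ and $M_{bb}$ respectively. In quantum theory, the optimal cloning unitary achieves [17]
$$F^{Q,\mathrm{opt}}_g = \frac{1}{2}\left(1 + c_{ab}^{3/2} + (1 - c_{ab})\sqrt{1 + c_{ab}}\right), \qquad (3)$$
with $c_{ab} = |\langle a|b\rangle|^2$. This brief summary captures the main operational features of the traditional 'optimal state-dependent cloning' and highlights the main issue with this approach: it leaves no room to leverage operational equivalences to further study its potential nonclassical aspects. To fix that, we follow Ref. [5] and exploit another operational consequence of the purity of $|a\rangle$, $|b\rangle$: the existence of preparations $P_{a^\perp}$, $P_{b^\perp}$ such that $p(M_a|P_{a^\perp}) = p(M_b|P_{b^\perp}) = 0$ and such that the mixture $P_a/2 + P_{a^\perp}/2$ (tossing a fair coin and following either $P_a$ or $P_{a^\perp}$) is operationally equivalent to the mixture $P_b/2 + P_{b^\perp}/2$. In the idealized quantum experiment one observes these operational statistics by preparing pure states $|a^\perp\rangle$, $|b^\perp\rangle$ in the span of $\{|a\rangle, |b\rangle\}$ and satisfying $\langle a|a^\perp\rangle = \langle b|b^\perp\rangle = 0$ as well as $\frac{1}{2}|a\rangle\langle a| + \frac{1}{2}|a^\perp\rangle\langle a^\perp| = \frac{1}{2}|b\rangle\langle b| + \frac{1}{2}|b^\perp\rangle\langle b^\perp|$. To conclude, here is an operational account (without any reference to quantum theory) of the features that we demand are observed in the idealized scenario of the cloning experiment: there exist $P_s$, $P_{s^\perp}$, $M_s$ such that
O1: $p(M_s|P_s) = 1$, $p(M_s|P_{s^\perp}) = 0$ for all $s \in \{a, b, \alpha, \beta, aa, bb\}$;
O2: $\frac{1}{2}P_s + \frac{1}{2}P_{s^\perp} \simeq \frac{1}{2}P_{s'} + \frac{1}{2}P_{s'^\perp}$ for all $(s, s') \in \{(a, b), (\alpha, aa), (\beta, bb)\}$.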

Optimal cloning is contextual: ideal scenario
In any ontological model, a cloning experiment is described as follows (see Fig. 1). A preparation device randomly prepares either $P_a$ or $P_b$, i.e., it samples a $\lambda$ from either the distribution $\mu_a(\lambda)$ or $\mu_b(\lambda)$. This state is sent into the cloning machine that maps $\lambda$ into some new $\lambda'$ with probability $T(\lambda'|\lambda)$. For example, if $\lambda = (x_1, p_1)$ one could have $\lambda' = (x_1, p_1, x_2, p_2)$. This $\lambda'$ is sent into a testing device doing the measurement $M_{aa}$ if $P_a$ was prepared, or $M_{bb}$ if $P_b$ was prepared. Upon receiving $\lambda'$, the device gives an outcome $x$ with probability $\xi_{aa}(x|\lambda')$ or $\xi_{bb}(x|\lambda')$.
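The assumption of noncontextuality (more precisely, preparation noncontextuality [6]) and linearity applied to the operational equivalences in O2 requires that any noncontextual ontological model must satisfy (see Eq. (2))
$$\frac{1}{2}\mu_s(\lambda) + \frac{1}{2}\mu_{s^\perp}(\lambda) = \frac{1}{2}\mu_{s'}(\lambda) + \frac{1}{2}\mu_{s'^\perp}(\lambda). \qquad (4)$$
Our main result is that no noncontextual ontological model can reproduce the operational features O1-O2 and match the optimal cloning fidelity predicted by quantum theory. More precisely:

Theorem 1 (Optimal cloning fidelity in noncontextual models). Let $P_\alpha = T(P_a)$, $P_\beta = T(P_b)$ be the achieved outputs of a cloning process with inputs $P_a$, $P_b$ and target outputs $P_{aa}$, $P_{bb}$. Suppose one observes the operational features O1-O2. Then, for any noncontextual model we have that
$$F^{NC}_g \leq 1 - \frac{c_{ab}}{2} + \frac{c_{aa,bb}}{2}. \qquad (5)$$

Proof. The first part of the proof essentially follows the argument given in Ref. [6] Sec. VIIIA and reproduced in Ref. [5] Sec. IVA, slightly adapted to use the fewer assumptions of the statement. Writing $\xi_k(\lambda) := \xi_k(1|\lambda)$, we have that
$$1 = p(M_k|P_k) = \int_\Lambda d\lambda \, \mu_k(\lambda)\xi_k(\lambda) = \int_{S_k} d\lambda \, \mu_k(\lambda)\xi_k(\lambda),$$
where $S_k$ denotes the support of $\mu_k$. From this equation, it follows that $\xi_k(\lambda) = 1$ almost everywhere on $S_k$ (that is, modulo sets of measure zero). Furthermore,
$$0 = p(M_k|P_{k^\perp}) = \int_{S_{k^\perp}} d\lambda \, \mu_{k^\perp}(\lambda)\xi_k(\lambda),$$
from which it follows that $\xi_k(\lambda) = 0$ almost everywhere on $S_{k^\perp}$. Hence, $S_k \cap S_{k^\perp} = \emptyset$ modulo sets of zero measure. The operational equivalences in O2 imply that in a noncontextual model
$$\mu_s(\lambda) + \mu_{s^\perp}(\lambda) = \mu_{s'}(\lambda) + \mu_{s'^\perp}(\lambda). \qquad (6)$$
In particular, $\mu_s = \mu_{s'}$ almost everywhere on $S_s \cap S_{s'}$, since $\mu_{s^\perp}$ and $\mu_{s'^\perp}$ vanish there. Hence, using the facts above, the $\ell_1$ norm distance between $\mu_s$ and $\mu_{s'}$ reads
$$\|\mu_s - \mu_{s'}\| = \int_{S_s \setminus S_{s'}} d\lambda \, \mu_s(\lambda) + \int_{S_{s'} \setminus S_s} d\lambda \, \mu_{s'}(\lambda) = 2 - 2\int_{S_s \cap S_{s'}} d\lambda \, \mu_s(\lambda)\xi_{s'}(\lambda).$$
Note that the last integral can be extended to $\Lambda$. In fact, by contradiction suppose that $\xi_{s'}(\lambda) \neq 0$ for some nonzero measure set $X \subseteq S_s \setminus S_{s'}$. Then, from Eq. (6), it follows that, for almost all $\lambda \in X$, $0 < \mu_s(\lambda) = \mu_{s'^\perp}(\lambda)$. However, as we discussed, $\xi_{s'}(\lambda) = 0$ almost everywhere on $S_{s'^\perp}$, which gives the desired contradiction. Hence the integral can be extended to $S_s \cup S_{s'}$ and, trivially, to all $\Lambda$. In conclusion,
$$\|\mu_s - \mu_{s'}\| = 2(1 - c_{ss'}), \qquad (7)$$
where $c_{ss'} = p(M_{s'}|P_s)$. Using the triangle inequality,
$$\|\mu_{aa} - \mu_{bb}\| \leq \|\mu_{aa} - \mu_\alpha\| + \|\mu_\alpha - \mu_\beta\| + \|\mu_\beta - \mu_{bb}\|.$$
By definition, $\mu_\alpha(\lambda) = \int d\lambda' \, T(\lambda|\lambda')\mu_a(\lambda')$, for a stochastic matrix $T(\lambda|\lambda')$.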
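Similarly, $\mu_\beta(\lambda) = \int d\lambda' \, T(\lambda|\lambda')\mu_b(\lambda')$, with the same stochastic matrix. Since $\int d\lambda \, T(\lambda|\lambda') = 1$ and $T(\lambda|\lambda') \geq 0$, one can readily verify from the convexity of the absolute value that $\|\mu_\alpha - \mu_\beta\| \leq \|\mu_a - \mu_b\|$ (data processing inequality), which implies
$$\|\mu_{aa} - \mu_{bb}\| \leq \|\mu_{aa} - \mu_\alpha\| + \|\mu_a - \mu_b\| + \|\mu_\beta - \mu_{bb}\|. \qquad (8)$$
We can apply Eq. (7) to each of the couples $(s, s')$ on the right hand side of Eq. (8), obtaining
$$2\left(1 - \int_{R_1} d\lambda \, \mu_{bb}(\lambda) - \int_{R_2} d\lambda \, \mu_{aa}(\lambda)\right) \leq 2(1 - c_{\alpha,aa}) + 2(1 - c_{ab}) + 2(1 - c_{\beta,bb}), \qquad (9)$$
with $R_1 := \{\lambda \in S_{aa} \cap S_{bb} : \mu_{aa}(\lambda) \geq \mu_{bb}(\lambda)\}$ and $R_2 := (S_{aa} \cap S_{bb}) \setminus R_1$; the left hand side is $\|\mu_{aa} - \mu_{bb}\|$ written in terms of the pointwise minimum of the two distributions. Next,
$$\int_{R_1} d\lambda \, \mu_{bb}(\lambda) + \int_{R_2} d\lambda \, \mu_{aa}(\lambda) \leq \int_{R_1} d\lambda \, \mu_{aa}(\lambda) + \int_{R_2} d\lambda \, \mu_{aa}(\lambda) = \int_{S_{aa} \cap S_{bb}} d\lambda \, \mu_{aa}(\lambda) = \int_{S_{aa} \cap S_{bb}} d\lambda \, \mu_{aa}(\lambda)\xi_{bb}(\lambda) \leq c_{aa,bb},$$
where the first inequality follows from $\mu_{aa}(\lambda) \geq \mu_{bb}(\lambda)$ $\forall \lambda \in R_1$ and the second equality follows from $\xi_{bb}(\lambda) = 1$ almost everywhere in $S_{bb}$. Finally, substituting this in Eq. (9) and rearranging the terms gives
$$c_{\alpha,aa} + c_{\beta,bb} \leq 2 - c_{ab} + c_{aa,bb},$$
and since $F_g = \frac{1}{2}c_{\alpha,aa} + \frac{1}{2}c_{\beta,bb}$ the global cloning fidelity achieved by noncontextual ontological models that comply with the operational features O1-O2 is upper bounded as in Eq. (5).

In Fig. 2 we compare the optimal quantum cloning (global) fidelity of Eq. (3) with the maximum noncontextual cloning fidelity of Eq. (5), taking into account that, in quantum experiments, one observes $c_{aa,bb} = c_{ab}^2$. One can see, for any $0 < c_{ab} < 1$, that quantum mechanics achieves higher copying fidelities than what is allowed by the principle of noncontextuality. Hence, the phenomenology of optimal cloning cannot be reproduced within noncontextual ontological models. Contextuality provides an advantage for the maximum copying fidelity (see footnote 3).

Interestingly, the above derivation also gives an alternative, simple proof of the main result of Ref. [5]. In fact, an intermediate technical result in the proof of Theorem 1 is that in the presence of the operational features O1-O2, noncontextual models must have a direct relation between the experimentally accessible confusabilities $c_{ss'} = p(M_{s'}|P_s)$ and the $\ell_1$ distance between the corresponding probability distributions:
$$\|\mu_s - \mu_{s'}\| = 2(1 - c_{ss'}). \qquad (10)$$
(This was implicitly shown in Ref. [5] Sec. IVA, but using infinitely many extra operational assumptions. That is, they assume O2 for all pairs of orthogonal states.)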
Since the maximum probability $s_{ab}$ of distinguishing two preparations $P_a$ and $P_b$ is at most $\frac{1}{2} + \|\mu_a - \mu_b\|/4$, it immediately follows that $s_{ab} \leq 1 - c_{ab}/2$, which is the optimal state discrimination probability in noncontextual models, as given in Ref. [5]. Conversely, it is not immediately obvious how the techniques of Ref. [5] could be adapted to obtain our result on cloning, due to our use of the data processing inequality in Theorem 1.
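The two comparisons discussed above (cloning fidelity and state discrimination) can be checked numerically. The sketch below is illustrative only: the closed form used for the optimal quantum cloning fidelity and the Helstrom success probability for two equiprobable pure states are standard expressions assumed from the literature, and the function names are hypothetical.

```python
import math

def quantum_cloning_fidelity(c_ab):
    """Optimal global fidelity for cloning two pure states with c_ab = |<a|b>|^2 (assumed closed form)."""
    return 0.5 * (1.0 + c_ab ** 1.5 + (1.0 - c_ab) * math.sqrt(1.0 + c_ab))

def noncontextual_cloning_bound(c_ab, c_aabb=None):
    """Noncontextual bound 1 - c_ab/2 + c_aabb/2; defaults to the quantum value c_aabb = c_ab^2."""
    if c_aabb is None:
        c_aabb = c_ab ** 2
    return 1.0 - c_ab / 2.0 + c_aabb / 2.0

def helstrom_success(c_ab):
    """Quantum optimum for discriminating two equiprobable pure states (assumed standard result)."""
    return 0.5 * (1.0 + math.sqrt(1.0 - c_ab))

def noncontextual_discrimination_bound(c_ab):
    """s_ab <= 1/2 + ||mu_a - mu_b|| / 4 with ||mu_a - mu_b|| = 2 (1 - c_ab)."""
    l1_distance = 2.0 * (1.0 - c_ab)
    return 0.5 + l1_distance / 4.0

overlaps = [k / 10 for k in range(1, 10)]
cloning_gap = {c: quantum_cloning_fidelity(c) - noncontextual_cloning_bound(c) for c in overlaps}
discrimination_gap = {c: helstrom_success(c) - noncontextual_discrimination_bound(c) for c in overlaps}
```

Both gaps are positive on the sampled overlaps and close at the endpoints $c_{ab} = 0$ and $c_{ab} = 1$, where the quantum and noncontextual values coincide.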
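We also note that the noncontextual bound on cloning is tight. Denote by $S_s$ the support of $\mu_s$. Consider a model in which $\mu_{ss}(\lambda, \lambda') = \mu_s(\lambda)\mu_s(\lambda')$ and $\xi_s(\lambda) = 1$ if $\lambda \in S_s$ and zero otherwise. A cloning strategy that saturates the bound is as follows: if the input $\lambda$ is in $S_a \setminus S_b$, output $(\lambda, \lambda')$, with $\lambda'$ sampled according to $\mu_a$; otherwise, output $(\lambda, \lambda')$ with $\lambda'$ sampled according to $\mu_b$. Notice that this sets $\mu_\beta = \mu_b\mu_b$ and, hence, $c_{\beta,bb} = 1$ ($\mu_b$ is copied perfectly). On the other hand, $\mu_\alpha(\lambda, \lambda') = \mu_a(\lambda)\mu_a(\lambda')$ for $\lambda \in S_a \setminus S_b$ and $\mu_\alpha(\lambda, \lambda') = \mu_a(\lambda)\mu_b(\lambda')$ for $\lambda \in S_a \cap S_b$ and, hence,
$$c_{\alpha,aa} = \int_{S_a \setminus S_b} d\lambda \, \mu_a(\lambda) + \int_{S_a \cap S_b} d\lambda \, \mu_a(\lambda) \int_{S_a} d\lambda' \, \mu_b(\lambda') = 1 - c_{ab} + c_{ab}c_{ba} = 1 - c_{ab} + c_{ab}^2,$$
where, in the last equality, we use the operational fact that $c_{ab} = c_{ba}$. Finally, this gives
$$F_g = \frac{1}{2}\left(1 - c_{ab} + c_{ab}^2\right) + \frac{1}{2} = 1 - \frac{c_{ab}}{2} + \frac{c_{ab}^2}{2}.$$
In Appendix A we complete this strategy with a concrete choice of $\mu_a$, $\mu_b$, $\mu_{aa^\perp}$, $\mu_{bb^\perp}$, $\mu_{\alpha^\perp}$ and $\mu_{\beta^\perp}$ complying with O1 and satisfying Eq. (4) for all the operational equivalences in O2.

Footnote 3: Of course, when $c_{ab} = 0$, as it is for classical (i.e., orthogonal) states, both the quantum and the noncontextual fidelities are 1.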
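The saturating strategy can be simulated on a finite ontic space. This is a minimal sketch under assumptions that are not part of the text: $\mu_a$ and $\mu_b$ are taken uniform on two overlapping integer intervals, the response functions are indicators of the supports, and the name `saturating_model` is hypothetical. It checks only the fidelity, not the operational equivalences in O2, which require the additional distributions of Appendix A.

```python
import numpy as np

def saturating_model(n_support=4, shift=2):
    """Discrete toy model: mu_a uniform on {0..n-1}, mu_b uniform on {shift..shift+n-1}."""
    size = n_support + shift
    mu_a = np.zeros(size)
    mu_a[:n_support] = 1.0 / n_support
    mu_b = np.zeros(size)
    mu_b[shift:] = 1.0 / n_support
    in_a, in_b = mu_a > 0, mu_b > 0

    # Confusability p(M_b | P_a): the response function of M_b is the indicator of S_b.
    c_ab = mu_a[in_b].sum()

    # Cloning map: keep lam; append lam2 ~ mu_a if lam is in S_a \ S_b, else lam2 ~ mu_b.
    only_a = in_a & ~in_b
    second = np.where(only_a[:, None], mu_a[None, :], mu_b[None, :])
    mu_alpha = mu_a[:, None] * second   # joint distribution over (lam, lam2) for input P_a
    mu_beta = mu_b[:, None] * second    # same map applied to input P_b

    # Test measurements for the ideal clones: indicators of S_a x S_a and S_b x S_b.
    pass_aa = in_a[:, None] & in_a[None, :]
    pass_bb = in_b[:, None] & in_b[None, :]
    c_alpha_aa = mu_alpha[pass_aa].sum()
    c_beta_bb = mu_beta[pass_bb].sum()

    fidelity = 0.5 * c_alpha_aa + 0.5 * c_beta_bb
    bound = 1.0 - c_ab / 2.0 + c_ab ** 2 / 2.0
    return c_ab, fidelity, bound

c_ab, fidelity, bound = saturating_model()
```

With the default parameters the overlap is $c_{ab} = 1/2$, and the simulated fidelity coincides with the noncontextual bound evaluated at $c_{aa,bb} = c_{ab}^2$, which holds in this product model.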
This optimal strategy seems to suggest the following intuition behind the theorem: our assumption of preparation noncontextuality on the input preparations $P_a$, $P_b$ implies that the distributions $\mu_a(\lambda)$ and $\mu_b(\lambda)$ overlap "too much" (formally, it implies maximal $\psi$-epistemicity, $c_{ab} = \int_{S_b} d\lambda \, \mu_a(\lambda)$ [18]), hence the cloning performance turns out worse than in quantum mechanics. Furthermore, noncontextuality implies that $\mu_a$ and $\mu_b$ coincide on their overlap, which implies a direct relation between $c_{ab}$ and the $\ell_1$ norm $\|\mu_a - \mu_b\|$. Crucially, the latter cannot be increased by the cloning machine, since $\|\cdot\|$ is nonincreasing under post-processing.
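The contractivity of the $\ell_1$ distance under stochastic maps, which is the property invoked above, can be checked on random instances. The sketch below assumes finite ontic spaces and uses hypothetical helper names; it is a numerical sanity check, not a proof.

```python
import numpy as np

def l1_distance(p, q):
    """l1 distance between two probability vectors."""
    return float(np.abs(p - q).sum())

def random_distribution(rng, n):
    """Random probability vector of length n."""
    w = rng.random(n)
    return w / w.sum()

def random_stochastic_matrix(rng, n_out, n_in):
    """Random column-stochastic matrix: T[:, j] is a distribution over outputs."""
    t = rng.random((n_out, n_in))
    return t / t.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
contracts = []
for _ in range(1000):
    mu_a = random_distribution(rng, 6)
    mu_b = random_distribution(rng, 6)
    T = random_stochastic_matrix(rng, 9, 6)   # the output space may be larger, as in cloning
    before = l1_distance(mu_a, mu_b)
    after = l1_distance(T @ mu_a, T @ mu_b)
    contracts.append(after <= before + 1e-12)
```

Every sampled instance satisfies the inequality, in line with the argument based on the convexity of the absolute value.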
However, this mechanism can only be part of the story. First, the cloning performance is not monotonically decreasing with increasing overlap, since for $c_{ab} = 1$ one can clone perfectly. Second, cloning is defined as the creation of two independent copies of the preparations $P_a$ or $P_b$, but these do not necessarily correspond to two independent copies $\mu_a\mu_a$, $\mu_b\mu_b$ (this assumption, which we do not make, is called preparation independence [19]). Nevertheless, we showed that a no-go theorem results from the observed overlaps $c_{ab}$, $c_{aa,bb}$ and noncontextuality assumptions only, as a consequence of information processing inequalities and the triangle inequality.
We note in passing that our proof technique can be abstracted and applied to other tasks as follows:
1. First, given a set of observed overlaps $\{c_{ss'}\}$, noncontextuality applied to the operational equivalences $\frac{1}{2}P_s + \frac{1}{2}P_{s^\perp} \simeq \frac{1}{2}P_{s'} + \frac{1}{2}P_{s'^\perp}$ gives the equations (10).
2. Second, verify whether the equations (10) are compatible with the triangle and data processing inequalities and the performance of the quantum protocol under consideration (in this case, state-dependent cloning).
In fact, the same proof technique can be extended to nonideal scenarios (with Eq. (10) replaced by Eq. (12)), as we now see.

Optimal cloning is contextual: beyond idealizations
Theorem 1 is a no-go result for noncontextual ontological models aimed at explaining the phenomenology of state-dependent quantum cloning. However, the inequality derived in Eq. (5) is not a proper noncontextuality inequality because the operational features considered refer to an idealized experiment. In any real experiment, on the other hand, one will need to confront the following nonidealities:
• The correlations in O1 will only approximately hold in data collected in a real experiment.
• The operational equivalences in O2 will not be exactly satisfied by the preparations realized in a real experiment.
Theorem 2 below extends Theorem 1 beyond the ideal limit, allowing for the observation of nonperfect correlations in O1, such as those generated by a cloning experiment carried out with nonideal preparations and test measurements. As we will discuss later, there are general techniques to deal with the idealization in O2, so that the problem of deriving an experimentally testable statement reduces to the elimination of the idealization in O1. Specifically, we want to weaken it to
O1ni: $p(M_s|P_s) \geq 1 - \epsilon_s$, $p(M_s|P_{s^\perp}) \leq \epsilon_s$,
where 'ni' stands for 'non-ideal'.
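Theorem 2. Suppose one observes the operational features O1ni-O2. Then, for any noncontextual model we have that
$$F^{NC,ni}_g \leq 1 - \frac{c_{ab}}{2} + \frac{c_{aa,bb}}{2} + \mathrm{Err}, \qquad (11)$$
where $\mathrm{Err} = \frac{1}{2}(\epsilon_b + 2\epsilon_{bb} + \epsilon_{aa})$. Note that, while we gave an independent and simpler proof of Theorem 1, we can now see it as a corollary of the result above once all error terms are set to zero. Another interesting case is when all error terms are equal, $\epsilon_b = \epsilon_{bb} = \epsilon_{aa} := \epsilon$, which gives $F^{NC,ni}_g = 1 - \frac{c_{ab}}{2} + \frac{c_{aa,bb}}{2} + 2\epsilon$. In fact, we can give a slightly stronger and symmetric bound than the above. For the specific form, see Appendix B.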
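As a small illustration of how the error terms enter, the sketch below evaluates the noise-robust bound as the ideal noncontextual bound plus $\mathrm{Err} = \frac{1}{2}(\epsilon_b + 2\epsilon_{bb} + \epsilon_{aa})$. The function name and the sample numbers are hypothetical; the form of the bound is the one implied by the equal-error special case quoted above.

```python
def noise_robust_bound(c_ab, c_aabb, eps_b, eps_bb, eps_aa):
    """Noise-robust noncontextual bound: ideal bound plus Err = (eps_b + 2 eps_bb + eps_aa) / 2."""
    err = 0.5 * (eps_b + 2.0 * eps_bb + eps_aa)
    return 1.0 - c_ab / 2.0 + c_aabb / 2.0 + err

# Ideal limit: all error terms set to zero recovers the bound of Theorem 1.
ideal = noise_robust_bound(0.5, 0.25, 0.0, 0.0, 0.0)

# Equal error terms: the bound is shifted upwards by 2 * eps.
eps = 0.01
uniform = noise_robust_bound(0.5, 0.25, eps, eps, eps)
```

A measured fidelity certifies contextuality only if it exceeds the value returned for the observed confusabilities and error terms, so larger noise makes a violation harder to observe.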
The proof of Theorem 2 follows the same lines as that of Theorem 1. The key addition is to extend Eq. (10) to the noisy setting. Specifically, we show that in the presence of the operational features O1ni-O2, noncontextual models must satisfy
$$2(1 - c_{ss'} - \epsilon_{s'}) \leq \|\mu_s - \mu_{s'}\| \leq 2(1 - c_{ss'} + \epsilon_{s'}), \qquad (12)$$
and similarly if we exchange $s$ and $s'$. In other words, the relation of Eq. (10) holds approximately, and we can bound its violation with the experimentally accessible noise level. The proof of this result is more involved than in the ideal scenario, so we postpone the derivation to Appendix B. Eq. (12) imposes a strict relation, in any noncontextual model and beyond the ideal scenario, between the $\ell_1$ distance of two epistemic states and their operationally accessible confusability. Hence, we anticipate that these relations will be of broader use to identify quantum advantages beyond state-dependent cloning. For instance, following the same reasoning given after Theorem 1, these inequalities provide an alternative and intuitive derivation of the tight noise-robust noncontextual bound on state discrimination of Ref. [5],
$$s_{ab} \leq 1 - \frac{c_{ab} - \epsilon_b}{2}.$$

An explicit noise model
Having derived a noise-robust version of our noncontextual bound, the next step is to investigate whether quantum mechanics violates it. We consider a standard noise model in which the ideal quantum preparations, measurements and unitary transformation are all affected by a depolarizing channel $\mathcal{N}_v$ with noise level $v \in [0, 1]$. A direct calculation (see Appendix C) shows that this sets $\epsilon = v(31 - 21v + 9v^2)/16$ in Eq. (11). If one uses the unitary transformation that is optimal for state-dependent cloning in the noiseless setting, one gets a quantum strategy whose global average fidelity (Eq. (13)) coincides with the optimal one for $v = 0$. For $v > 0$, however, and unlike in the ideal case, the tradeoff between $c_{ab}$ and $F_g$ is not necessarily above the noncontextual bound. For example, for $v = 0.015$ a violation can be observed only for $c_{ab} \in [0.318, 0.718]$, see Fig. 3. Nevertheless, a preliminary comparison with the experimental results of Ref. [20] suggests that the required low level of noise is not beyond current experiments. In fact, in terms of the parameter

Figure 3: Noise-resistance of the quantum advantage in cloning. This plot shows the maximum value of the noise parameter $v$ of a depolarizing channel (affecting preparations, measurements and transformation) for which the quantum value of the cloning fidelity (Eq. (13)) is above the noncontextual bound, as a function of the confusability between the inputs $c_{ab}$.

Remaining assumptions
As we mentioned, the only remaining idealization is the operational verification of O2.
Let us suppose that, in an experiment, after doing tomography (see footnote 4), one determines that the actual experimental realizations of the ideal preparations are $P^{(1)}_a, \ldots, P^{(1)}_{\beta^\perp}$. These 'primary' preparations will, in general, not respect the required operational equivalences in O2, due to unavoidable imperfections in the experimental realisation. Luckily, there are general considerations to tackle this idealization [20].
The first thing to notice is that if one can experimentally achieve a set of preparations $P^{(1)}_s$, then one can also prepare any convex combination of them, i.e. any preparation in the convex hull $C$ of the preparations $P^{(1)}_s$. By the linearity of Eq. (1), one can then compute the measurement statistics of all the preparations in $C$. Therefore, as put forward in Sec. IV of Ref. [20], to go ahead with the experimental verification of Theorem 2 one only needs to find 'secondary' preparations $P^{(2)}_s$ in $C$ whose measurement statistics satisfy the operational equivalences in O2; that is, we only need
O2ni: O2 is satisfied for some preparations $P^{(2)}_s \in C$.
This post-processing hence allows one to apply Theorem 2 even if the collected data does not satisfy O2. One can think of the secondary preparations as noisy versions of the primary preparations. Hence, the price one pays in this construction is that the corresponding noise parameters in O1ni, now determined by $p(M_s|P^{(2)}_s)$, will in general be larger. Note that, even if the noise parameter of the secondary preparations is too large compared to that of the primary ones to see any violation in Theorem 2, one can get around this issue by adding extra experimental preparations $P^{(1)}_{\mathrm{extra}}$ to enlarge $C$, as explicitly done in Ref. [20]. To summarize, there are good general tools to deal with imperfections in the operational equivalences O2.

Footnote 4: We will later discuss the assumption that one can access a tomographically complete set of measurements.
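The role of linearity in this construction can be sketched with a small table of statistics. The numbers below are hypothetical, idealized data (rows are primary preparations, columns are pass probabilities of two test measurements), and `mixture_stats` is an illustrative helper, not a routine from Ref. [20]. The comparison only covers the listed measurements; a full test of operational equivalence assumes a tomographically complete set.

```python
import numpy as np

# Rows: P_a, P_a_perp, P_b, P_b_perp. Columns: pass probabilities for M_a, M_b.
primary_stats = np.array([
    [1.0, 0.5],
    [0.0, 0.5],
    [0.5, 1.0],
    [0.5, 0.0],
])

def mixture_stats(weights, stats):
    """Statistics of a convex mixture of preparations, computed by linearity."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return weights @ stats

# Equal mixtures P_a/2 + P_a_perp/2 and P_b/2 + P_b_perp/2.
mix_a = mixture_stats([0.5, 0.5, 0.0, 0.0], primary_stats)
mix_b = mixture_stats([0.0, 0.0, 0.5, 0.5], primary_stats)
equivalent_on_listed = bool(np.allclose(mix_a, mix_b))
```

With noisy data the two mixtures would generally differ, and the search for secondary preparations amounts to choosing weights over the primary rows so that the required mixtures agree.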
As a final remark, it is useful to briefly talk about loopholes. These are all those assumptions that cannot be conclusively tested by any experimental means. In a nonlocality experiment, for example, these include the assumption that the two sides cannot communicate and the ability to choose the measurement freely, i.e., independently of any other variable relevant to the experiment. In a contextuality experiment the notion of operational equivalence relies on the knowledge of a tomographically complete set of measurements. However, if quantum theory is not correct, the tomographically complete set of a postquantum theory may contain extra unknown measurements (just like a future theory may allow signalling). Recent work has shown that the problem can be mitigated by the addition of extra (known) measurements and preparations (see Ref. [21]), but this goes beyond the scope of the present work.

Conclusions and open questions.
We have shown that the operational statistics observed in optimal state-dependent quantum cloning are incompatible with the predictions of every noncontextual ontological model. In particular, for a given overlap, the noncontextual global cloning fidelity is strictly smaller than the quantum prediction. A similar result continues to hold in more realistic experiments which are unavoidably affected by noise (although the effect can be 'washed out' by excessive experimental imperfections). This identifies contextuality as the resource for optimal state-dependent quantum cloning.
From a foundational point of view, it would be relevant to explore whether the relation between contextuality and cloning fidelity, that we proved for optimal state-dependent cloning, extends to the other types of imperfect cloning studied in the literature, mainly phase-covariant and/or universal cloning, as well as to probabilistic cloning [11]. From an applications' point of view, one important open question is if our noncontextual bound can be used to prove a contextual advantage for quantum information processing tasks which rely on optimal quantum state-dependent cloning (e.g., [22,23]

A Noncontextual model saturating the bound in Theorem 1
To complement the cloning strategy given in the main text, in this section we give a concrete choice of distributions µ aa , µ bb , µ aa ⊥ , µ bb ⊥ and µ α ⊥ satisfying the operational features targeted by Theorem 1. The supports of all these distributions, which we set to be subsets of [0, 2] × [0, 2], are plotted in Figure 4. All the distributions are constantly 1 on their support. Notice that since the cloning map given in the main text makes µ β ≡ µ bb , it follows that to satisfy the operational equivalence for (µ β , µ bb ) we must have µ β ⊥ ≡ µ bb ⊥ . We let the reader verify, by inspecting the plots, that the remaining requirements implied by the operational features O1 and O2 are satisfied by these distributions (and the choice of response functions made in the main text).

B Generalization and proof of Theorem 2
In this section we will prove a slightly stronger and more symmetric bound on the noncontextual cloning fidelity F NC g from which the bound in Thm. 2 in the main text follows straightforwardly as a corollary.
Then, for any noncontextual model we have that For the proof of Theorem 3, we make use of the following lemma relating the $\ell_1$ distance of two epistemic states in any ontological model satisfying the hypothesis of the theorem and their operationally accessible confusability.

Lemma 4.
Let P s , P s be preparations. Suppose there exists preparations P s ⊥ , P s ⊥ and a two outcome measurement M s such that Then, in a noncontextual ontological model, Proof. We denote by S s the support of µ s . Define a partition S s ∪ S s = 4 i=1 R i , as summarized in Figure 5: Then, Consider, where we used ξ s ≤ 1 in the first inequality and assumption 1 and non-contextuality in the second equality.
In the third equality, we used $\int_{R_1} d\lambda \, \mu_s(\lambda)\xi_{s'}(\lambda) = 0$ and in the final inequality we used assumption 2. Then, using Eq. (16), Furthermore, recalling that $\xi_{s'^\perp} = 1 - \xi_{s'}$, where in the first inequality we used that $\mu_s(\lambda) \leq \mu_{s'}(\lambda)$ in $R_3$ and $\mu_s(\lambda) \geq \mu_{s'}(\lambda)$ in $R_2$. In the final inequality, we used assumption 2. Hence, we have that Finally, noting that $\|\mu_s - \mu_{s'}\| = \|\mu_{s'} - \mu_s\|$ and that the above derivation is symmetric under the exchange of $s$ with $s'$ we arrive at the desired result
$$2\max\{1 - c_{ss'} - \epsilon_{s'}, \, 1 - c_{s's} - \epsilon_s\} \leq \|\mu_s - \mu_{s'}\| \leq 2\min\{1 - c_{ss'} + \epsilon_{s'}, \, 1 - c_{s's} + \epsilon_s\}.$$
Notice that for the lower bound in Eq. (17) (and, hence, the left hand side of Eq. (18)) we did not use assumption 1 of operational equivalence.
Given the above we can now prove Theorem 3:

Proof of Theorem 3. In the first part we proceed as in the ideal case. From the triangle inequality and the contractivity of the $\ell_1$ norm under stochastic processes (which gives $\|\mu_\alpha - \mu_\beta\| \leq \|\mu_a - \mu_b\|$), one can show that the following equation holds (see Eq. (8)):
$$\|\mu_{aa} - \mu_{bb}\| \leq \|\mu_{aa} - \mu_\alpha\| + \|\mu_a - \mu_b\| + \|\mu_\beta - \mu_{bb}\|.$$
Using both upper and lower bounds for the $\ell_1$ distance derived in Lemma 4, this implies
$$2\max\{1 - c_{aa,bb} - \epsilon_{bb}, \, 1 - c_{bb,aa} - \epsilon_{aa}\} \leq 2(1 - c_{\alpha,aa}) + 2\epsilon_{aa} + 2\min\{1 - c_{ab} + \epsilon_b, \, 1 - c_{ba} + \epsilon_a\} + 2(1 - c_{\beta,bb}) + 2\epsilon_{bb},$$
which can be rearranged to give the claimed bound on $F^{NC}_g$.
and analogously for χ in {β, bb} and {aa, bb}. Hence, Following the same arguments, it is easy to see that, for the case of symmetric confusabilities, the error term in Eq. (11) becomes, as a function of v,

C.4 Quantum performance
In this last subsection, we compute the global average fidelity F Q g in the noisy setting of the optimal quantum cloner for the ideal setting as a function of the noise channel's parameter v. Hence,