Measurement disturbance and conservation laws in quantum mechanics

Measurement error and disturbance, in the presence of conservation laws, are analysed in general operational terms. We provide novel quantitative bounds demonstrating necessary conditions under which accurate or non-disturbing measurements can be achieved, highlighting an interesting interplay between incompatibility, unsharpness, and coherence. From here we obtain a substantial generalisation of the Wigner-Araki-Yanase (WAY) theorem. Our findings are further refined through the analysis of the fixed-point set of the measurement channel, some extra structure of which is characterised here for the first time.


Introduction
That measurements generally disturb quantum systems is one of the fundamental aspects of quantum mechanics. The consequences of this effect range from the foundational to the applied, sometimes entering in the guise of measurement "back-action", playing a key role in quantum metrology, computation, and information processing [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]. Measurement disturbance can be seen when two observables are measured in succession, and the statistics of the second measurement depend on the first. While a well-known necessary condition for non-disturbance is that the pair of observables must be compatible [21,22], further necessary conditions arise when the first measurement obeys a conservation law, i.e., when the interaction between the measured system and measuring apparatus conserves some total quantity such as energy, charge, or angular momentum. Indeed, the Wigner-Araki-Yanase (WAY) theorem states that when a single sharp observable is measured in succession, the first measurement will not disturb the second only if the measured observable commutes with the system part of a conserved quantity [23][24][25]. The same constraint holds for perfectly accurate measurements, and independently of disturbance, if the pointer observable of the apparatus obeys the "Yanase condition", i.e., if it commutes with the apparatus part of a conserved quantity [26]. The WAY theorem has evolved over the years and continues to inspire research in a variety of directions (some recent examples are [27][28][29][30][31][32][33][34]), having impact also in other fields of research: for instance in quantum computing [35][36][37], the resource theories of asymmetry [38] and coherence [39,40], the theory of quantum reference frames [41,42], quantum clocks [43], and quantum thermodynamics [44][45][46][47][48]. Despite the progress that has been made, however, the full scope of the WAY theorem is still not known. 
For instance, the theorem as stated pertains only to sharp observables, and has been shown only in the limited framework of "normal" measurement schemes, where the apparatus is prepared in a pure state and unitarily interacts with the measured system, and where the pointer observable is sharp. But in the quantum theory of measurement [49], observables are more properly represented by positive operator valued measures (POVMs) which can be unsharp, measurement interactions are more generally described by channels which can be non-unitary, and the apparatus preparation may be described by a mixed state. Additionally, the WAY theorem addresses disturbance only in the case where the same observable is measured in succession, and the situation where the first and second observables in the sequence are not the same has received scant attention. In this paper, we shall investigate the role of conservation laws on measurement error and disturbance in the more general setting, stating our results in operational terms, in that the quantitative bounds we employ can be seen to arise from the probabilistic structure of quantum theory in its general measurement theoretic form [50].
The paper is structured as follows. In Sec. 2, we present the elements of operational quantum theory pertinent to our investigation. This includes a background on the quantum theory of measurement, together with a quantification of measurement error and disturbance. Readers familiar with these topics can jump directly to Sec. 3, where the main results of the paper begin; here we present a framework for describing conservation laws in quantum theory, prising apart two distinct notions of conservation (full and average) whose difference manifests for general channels and which plays a key role in interpreting our findings.
Next, we consider sequential measurements where the first measurement obeys a conservation law (whether average or full) and obtain general quantitative bounds for the error in the first measurement to realise a desired target observable, and the disturbance by the first measurement on a second, possibly different, observable. Here, we do not assume that the system observables are sharp, or that the apparatus pointer observable is sharp, or that the measurement interaction is unitary, or that the apparatus preparation is pure. In particular, the bounds demonstrate that in the case of a full conservation law, a large coherence in the apparatus preparation is in general a necessary condition for approximately accurate and non-disturbing measurements of observables not commuting with the system part of a conserved quantity. These bounds are then used to prove a generalisation of the WAY theorem, given in the form of a single quantitative bound, and capturing many essential features of the original theorem. Next, we provide an even stronger generalisation of the WAY theorem, indicating a deep connection between measurability, non-disturbance, and "definiteness", and demonstrating that there are unsharp observables not commuting with the conserved quantity whose measurement cannot be accurate or non-disturbing irrespective of the apparatus preparation. Finally, in Sec. 4 we consider how the structure of the set of fixed states of the measurement channel imposes further restrictions on non-disturbance. In particular, we show that an observable not commuting with the conserved quantity admits a non-disturbing measurement only if the measurement channel disturbs all "faithful" states, i.e., states with strictly positive eigenvalues.

Preliminaries
In this section we introduce the elements of operational quantum theory. This includes some background on observables, instruments, and measurement schemes, as part of the quantum theory of measurement (see, e.g., [49][50][51][52]). In particular, an operationally motivated quantification of measurement error and disturbance is provided, together with a review of two special instances of non-disturbing measurements: measurements of the first kind and repeatable measurements.
For an operation Φ : T(H) → T(K), the dual operation Φ* : L(K) → L(H) is defined by the trace duality
$$ \mathrm{tr}[\Phi^*(A)\, T] = \mathrm{tr}[A\, \Phi(T)] $$
for all A ∈ L(K) and T ∈ T(H). Φ* is completely positive and sub-unital, and unital exactly when Φ is trace-preserving. Unital operations Φ* will also be referred to as channels. In Appendix (A) we present several properties of operations that are of central importance for the proofs of our results, most notably a Cauchy-Schwarz inequality [53]. For channels Φ : T(H) → T(H), and their duals Φ* : L(H) → L(H), we define the fixed-point sets as
$$ F(\Phi) := \{ T \in T(H) : \Phi(T) = T \}, \qquad F(\Phi^*) := \{ A \in L(H) : \Phi^*(A) = A \}. $$

Observables
An observable of a quantum system S, with Hilbert space H S , is represented by a normalised positive operator valued measure (POVM) E : Σ → L p (H S ), where Σ is a σ−algebra of subsets of some value space X , representing possible outcomes of a measurement of E. For any X ∈ Σ, the positive operator O ⩽ E(X) ⩽ 1 S is referred to as an effect of E. E is sigma-additive on disjoint elements of Σ, and normalisation implies that E(X ) is the identity operator on H S . An effect E(X) = α1 S , where α ∈ [0, 1], is called trivial, and an observable E is called non-trivial if at least one of the effects in its range is non-trivial. Discrete observables are those for which X = {x 1 , x 2 , . . . } is countable, in which case E can be identified with the set {E(x) ≡ E({x}) ∈ L p (H S ) : x ∈ X } ≡ E. If it is not stated otherwise, observables will be assumed to be discrete. Combined with states, observables give rise to the probabilities p E ρ (x) := tr[E(x)ρ], holding for all ρ ∈ S(H S ) and all x ∈ X , interpreted as the probability of observing outcome x when the observable E is measured in the state ρ. If E is a POVM acting in H S , the commutant of E is denoted by E ′ := {A ∈ L(H S ) : [E(x), A] = O ∀ x ∈ X }. Since E = E * is a self-adjoint set, E ′ is a von Neumann algebra, and E ′′ ≡ (E ′ ) ′ is the smallest von Neumann algebra containing E (i.e., it is the von Neumann algebra generated by E). For any A ∈ L(H S ) such that A ∈ E ′ , we write [E, A] = O. Similarly, for any observable F := {F(y) : y ∈ Y} such that F ⊂ E ′ , we shall write [E, F] = O. Among the observables are those that are commutative, meaning that E ⊂ E ′ (that is, all the effects E(x) mutually commute). Among the commutative observables are the sharp observables, which satisfy the additional condition that for all x, y ∈ X , E(x)E(y) = δ x,y E(x), i.e., E(x) are mutually orthogonal projection operators. These observables correspond to self-adjoint operators through the spectral theorem. 
Observables that are not sharp will be called unsharp, and similarly any effect E which is not a projection will be called unsharp. The unsharpness of E can be quantified through the operator norm as 0 ⩽ ∥E − E²∥ ⩽ 1/4, which vanishes exactly when E is a projection. Finally, an observable E is defined as being "norm-1", or having the norm-1 property, if ∥E(x)∥ = 1 for every x for which E(x) ≠ O. While sharp observables are trivially norm-1, this property may also be enjoyed by some unsharp observables.

Figure 1: An instrument measures an observable E of the system S, and also transforms the system conditional on registering a given outcome. The system, initially prepared in an arbitrary state ρ, enters the instrument which then registers outcome x with probability p^E_ρ(x) := tr[E(x)ρ] = tr[I_x(ρ)]. Subsequently, the instrument transforms the system to the (non-normalised) state I_x(ρ).
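The definitions above can be made concrete for a qubit. The following is a minimal numerical sketch, not taken from the paper: the smeared observable `noisy_z`, its parameter `lam`, and all function names are illustrative assumptions. It evaluates the unsharpness ∥E − E²∥, the outcome probabilities tr[E(x)ρ], and the norm-1 property.

```python
import numpy as np

def unsharpness(effect):
    """Operator norm ||E - E^2||; zero exactly when E is a projection."""
    return float(np.linalg.norm(effect - effect @ effect, ord=2))

def probabilities(povm, rho):
    """Outcome probabilities p(x) = tr[E(x) rho] for a discrete POVM."""
    return [float(np.real(np.trace(e @ rho))) for e in povm]

def has_norm_one_property(povm, tol=1e-12):
    """True if every non-zero effect has operator norm 1."""
    norms = [np.linalg.norm(e, ord=2) for e in povm]
    return all(abs(n - 1.0) < tol for n in norms if n > tol)

identity = np.eye(2)
sigma_z = np.diag([1.0, -1.0])

def noisy_z(lam):
    """Hypothetical two-outcome observable: sigma_z smeared by lam in [0, 1]."""
    return [0.5 * (identity + lam * sigma_z), 0.5 * (identity - lam * sigma_z)]

sharp_z = noisy_z(1.0)             # projections onto |0> and |1>
unsharp_z = noisy_z(0.6)           # effects diag(0.8, 0.2) and diag(0.2, 0.8)
plus_state = np.full((2, 2), 0.5)  # the pure state |+><+|

print(unsharpness(sharp_z[0]), unsharpness(unsharp_z[0]))
print(probabilities(unsharp_z, plus_state))
print(has_norm_one_property(sharp_z), has_norm_one_property(unsharp_z))
```

In this example the smeared observable is commutative but neither sharp nor norm-1, since its effects have operator norm 0.8.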

Instruments
Though the state-observable pairings describe the totality of the measurement statistics, this is not sufficient for determining other interesting properties of a measurement, for instance the form of the associated state changes. To this end, we make use of the notion of instrument, or operation valued measure [56][57][58][59][60]. A discrete instrument is a collection of operations I := {I_x ≡ I_{{x}} : x ∈ X} such that I_X(·) := Σ_{x∈X} I_x(·) is a channel. Throughout, we shall always assume that I acts in H_S, that is, I_x : T(H_S) → T(H_S). Each instrument is associated with a unique observable E via I_x^*(1_S) = E(x), which implies that p^E_ρ(x) := tr[E(x)ρ] = tr[I_x(ρ)]. We refer to such an I as an E-compatible instrument, or an E-instrument for short, and to I_X(·) as the associated E-channel. I_x(ρ) is interpreted as the non-normalised state after a measurement of E has taken place and the outcome x has been registered, and I_X(ρ) is the normalised state after a non-selective measurement. A schematic representation of an instrument is given in Fig. 1. We note that for every discrete observable E, there are infinitely many E-compatible instruments; every E-instrument I can be constructed as the set of operations {I_x = Φ_x ∘ I^L_x : x ∈ X} [58,60], where Φ_x : T(H_S) → T(H_S) are arbitrary channels that may depend on outcome x, and I^L is the Lüders instrument for E [61], defined as
$$ I^L_x(T) := \sqrt{E(x)}\, T\, \sqrt{E(x)}, \qquad {I^L_x}^*(A) = \sqrt{E(x)}\, A\, \sqrt{E(x)}, \qquad (1) $$
to hold for all x ∈ X, T ∈ T(H_S), and A ∈ L(H_S).

Figure 2: An E-instrument I is implemented on the system S via a measurement scheme. The system, initially prepared in an arbitrary state ρ, and a measuring apparatus A, prepared in a fixed state ξ, undergo a joint evolution by the channel E. Subsequently, a pointer observable Z of the apparatus is measured. With probability p^E_ρ(x) := tr[E(x)ρ] = tr[I_x(ρ)] the apparatus registers outcome x, thereby transforming the system to the non-normalised state I_x(ρ).
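As an illustration of the construction I_x = Φ_x ∘ I^L_x, the sketch below builds the Lüders operations T ↦ √E(x) T √E(x) for an unsharp qubit observable and composes them with outcome-dependent unitary channels. The particular POVM, state, and choice of Φ_x are assumptions made for the example only; the point is that both instruments reproduce the same probabilities tr[E(x)ρ].

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def luders_operation(effect, t):
    """Lueders operation T -> sqrt(E(x)) T sqrt(E(x)); its dual has the same form."""
    s = psd_sqrt(effect)
    return s @ t @ s

identity = np.eye(2)
sigma_z = np.diag([1.0, -1.0])
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

# Illustrative unsharp observable and an arbitrary qubit state.
povm = [0.5 * (identity + 0.6 * sigma_z), 0.5 * (identity - 0.6 * sigma_z)]
rho = np.array([[0.7, 0.2], [0.2, 0.3]])

# Hypothetical outcome-dependent channels Phi_x, here unitary conjugations.
post_unitaries = [identity, sigma_x]

def general_operation(x, t):
    """I_x = Phi_x o I^L_x with Phi_x(T) = U_x T U_x^*."""
    u = post_unitaries[x]
    return u @ luders_operation(povm[x], t) @ u.conj().T

for x, effect in enumerate(povm):
    p_born = np.trace(effect @ rho).real
    p_luders = np.trace(luders_operation(effect, rho)).real
    p_general = np.trace(general_operation(x, rho)).real
    print(x, p_born, p_luders, p_general)
```

The two instruments differ only in the post-measurement states, which is why the statistics alone cannot single out an instrument.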

Measurement schemes
An even more comprehensive description of the measurement process involves the modelling of a measuring apparatus A and a specification of how it couples to the system under investigation. A measurement scheme is characterised by the tuple M := (H_A, ξ, E, Z) where: H_A is the Hilbert space for the measuring apparatus A and ξ ∈ S(H_A) is a state on H_A; E : T(H_S ⊗ H_A) → T(H_S ⊗ H_A) is a channel which serves to correlate system and apparatus; and Z := {Z(x) : x ∈ X} is a "pointer" observable of the apparatus. The operations of the instrument implemented by M can be written as
$$ I_x(T) = \mathrm{tr}_A[(1_S \otimes Z(x))\, E(T \otimes \xi)], \qquad (2) $$
where tr_A : T(H_S ⊗ H_A) → T(H_S) is the partial trace over A. A schematic representation of a measurement scheme is given in Fig. 2. We note that every E-compatible instrument admits infinitely many normal measurement schemes, where ξ is chosen to be pure, E is chosen to be unitary, and Z is chosen to be sharp [62]. However, unless stated otherwise, we shall consider the more general situation where ξ may be mixed, E may be non-unitary, and Z may be unsharp.
We now introduce the unital, completely positive normal conditional expectation Γ_ξ : L(H_S ⊗ H_A) → L(H_S). Γ_ξ, called a restriction map for ξ, is defined as the dual of the isometric embedding (or the preparation map) T ↦ T ⊗ ξ, and satisfies tr[Γ_ξ(B)T] = tr[B(T ⊗ ξ)] for all B ∈ L(H_S ⊗ H_A) and T ∈ T(H_S). We may use the restriction map to define the channel Γ^E_ξ : L(H_S ⊗ H_A) → L(H_S) as
$$ \Gamma^E_\xi(B) := \Gamma_\xi \circ E^*(B). \qquad (3) $$
Using Eq. (3), we may express the duals of the operations defined in Eq. (2) as
$$ I_x^*(A) = \Gamma^E_\xi(A \otimes Z(x)). \qquad (4) $$
In particular, we may write the dual channel as I_X^*(·) = Γ^E_ξ(· ⊗ 1_A). We may also be interested in asking how the apparatus is transformed as a result of the measurement interaction. To this end, we introduce the channel Λ : T(H_S) → T(H_A) and its dual Λ* : L(H_A) → L(H_S), referred to as conjugate channels to I_X and I_X^*, respectively, defined as
$$ \Lambda(T) := \mathrm{tr}_S[E(T \otimes \xi)], \qquad \Lambda^*(A) = \Gamma^E_\xi(1_S \otimes A), \qquad (5) $$
to hold for all T ∈ T(H_S) and A ∈ L(H_A). That is, Λ(ρ) is the state of the apparatus after it has interacted with the system, when the system is initially prepared in state ρ. On the other hand, for an initial system state ρ, the expected value of A ∈ L(H_A) in the state of the apparatus after the measurement interaction can be obtained by evaluating the expected value of Λ*(A) in ρ.
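The objects of a measurement scheme can be assembled numerically. The sketch below assumes a hypothetical normal scheme that is not taken from the paper: a qubit apparatus prepared in |0⟩, a CNOT interaction with the system as control, and a sharp pointer in the computational basis. It implements the operations I_x(T) = tr_A[(1_S ⊗ Z(x)) U(T ⊗ ξ)U*], the restriction map through tr[Γ_ξ(B)T] = tr[B(T ⊗ ξ)], the dual operations, and the conjugate channel Λ.

```python
import numpy as np

def partial_trace_apparatus(m, d_s, d_a):
    """tr_A of an operator on H_S (x) H_A, with the system as the first tensor factor."""
    return np.einsum('iaja->ij', m.reshape(d_s, d_a, d_s, d_a))

def partial_trace_system(m, d_s, d_a):
    """tr_S of an operator on H_S (x) H_A."""
    return np.einsum('iaib->ab', m.reshape(d_s, d_a, d_s, d_a))

# Hypothetical normal scheme: apparatus in |0>, CNOT interaction, sharp pointer.
xi = np.diag([1.0, 0.0])
cnot = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
pointer = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
one_s = np.eye(2)

def operation(x, t):
    """I_x(T) = tr_A[(1_S (x) Z(x)) U (T (x) xi) U^*]."""
    evolved = cnot @ np.kron(t, xi) @ cnot.conj().T
    return partial_trace_apparatus(np.kron(one_s, pointer[x]) @ evolved, 2, 2)

def restriction(b):
    """Gamma_xi(B) = tr_A[B (1_S (x) xi)], so that tr[Gamma_xi(B) T] = tr[B (T (x) xi)]."""
    return partial_trace_apparatus(b @ np.kron(one_s, xi), 2, 2)

def dual_operation(x, a):
    """I_x^*(A) = Gamma_xi(U^* (A (x) Z(x)) U)."""
    return restriction(cnot.conj().T @ np.kron(a, pointer[x]) @ cnot)

def conjugate_channel(t):
    """Lambda(T) = tr_S[U (T (x) xi) U^*], the apparatus state after the interaction."""
    return partial_trace_system(cnot @ np.kron(t, xi) @ cnot.conj().T, 2, 2)

measured = [dual_operation(x, one_s) for x in range(2)]  # effects I_x^*(1_S)
rho = np.full((2, 2), 0.5)                               # system state |+><+|
print(measured[0], measured[1])
print(conjugate_channel(rho))
```

For this choice the measured observable is the sharp computational-basis observable of the system, and the apparatus is left maximally mixed when the system starts in |+⟩.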

Quantifying measurement error and measurement disturbance
In order to quantify measurement error and measurement disturbance, we shall first provide a quantification of the difference, or discrepancy, between two effects E and F acting in a generic space H. For any state ρ ∈ S(H), the probabilities that the properties corresponding to E and F are realised can be compared as |tr[ρ(E − F)]|, which can be estimated through repeated measurements of E and F in the state ρ. Given that we wish to quantify the sense in which E and F differ as effects, i.e., independently of the state, it is natural to take the supremum over all states, and note that
$$ \sup_{\rho \in S(H)} |\mathrm{tr}[\rho (E - F)]| = \| E - F \|, \qquad (6) $$
with the right hand side denoting the operator norm, which of course vanishes when E = F and is non-zero otherwise. Eq. (6) gives an operationally motivated quantification (in the sense of being derived directly from the experimental probabilities) of the discrepancy between two effects, which will be utilised in the analysis of measurement error and disturbance.
Let us first address the question of measurement error. Note that by Eq. (4) and Eq. (5), the observable that is measured by a scheme M := (H_A, ξ, E, Z) has the effects I_x^*(1_S) = Γ^E_ξ(1_S ⊗ Z(x)) = Λ*(Z(x)). Now, let E be the target observable, i.e., the observable we wish to measure, which may be different to the measured observable, but has the same value space X. By Eq. (6), the measurement error for each effect of the target observable can be quantified through the operator norm as ∥ϵ(x)∥, where
$$ \epsilon(x) := \Lambda^*(Z(x)) - E(x). $$
A global quantification of measurement error may thus be defined as the largest error over all effects of E,
$$ \epsilon := \sup_{x \in X} \| \epsilon(x) \|. $$
In the absence of any constraints, perfectly accurate measurements are possible for any target observable E.
Let us now turn to measurement disturbance. Let E := {E(x) : x ∈ X} and F := {F(y) : y ∈ Y} be two observables acting in H_S. Consider the sequential measurement of these observables, as depicted in Fig. 3, where at first E is measured by the instrument I (implemented by some measurement scheme M), and subsequently F is measured.
For any initial state ρ ∈ S(H_S), the probability of observing outcome y of F after a non-selective measurement by the E-instrument I is given as
$$ p^F_{I_X(\rho)}(y) := \mathrm{tr}[F(y)\, I_X(\rho)] = \mathrm{tr}[I_X^*(F(y))\, \rho]. $$
That is, the prior E-measurement implies that we perform a measurement of the disturbed observable {I_X^*(F(y)) : y ∈ Y} in the state ρ. By Eq. (6), the disturbance of each effect of F may be quantified through the operator norm as ∥δ(y)∥, where
$$ \delta(y) := I_X^*(F(y)) - F(y). \qquad (8) $$
Note that if I is implemented by the measurement scheme M := (H_A, ξ, E, Z), then we may equivalently write δ(y) = Γ^E_ξ(F(y) ⊗ 1_A) − F(y). A global quantification of the disturbance of F can then be defined as the largest disturbance over all the effects,
$$ \delta := \sup_{y \in Y} \| \delta(y) \|, $$
and F is non-disturbed by I exactly when δ = 0, which is the case when I_X^*(F(y)) = F(y) for all y ∈ Y. In other words, I does not disturb F exactly when each F(y) is a fixed point of the E-channel I_X^*, i.e., F ⊂ F(I_X^*). In such a case, for any initial state ρ, a non-selective measurement by I does not affect the subsequent measurement statistics of F. In the absence of any constraints, non-disturbance is always possible when the pair of observables commute; since E′ ⊂ F((I^L_X)*) always holds, where I^L is the Lüders E-instrument defined in Eq. (1), a Lüders measurement of E is guaranteed not to disturb any F commuting with E [63]. While the fixed-point set of the E-channel I_X^* is not always contained in the commutant of E, in Appendix (C) we present some cases where F(I_X^*) ⊂ E′ necessarily holds. For a wider discussion on the relationship between disturbance, commutation, and compatibility, and a quantitative bound relating the minimum disturbance to the commutation between the pair of observables and the unsharpness of each, see Appendix (D).
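The error and disturbance measures can be evaluated directly for small examples. The sketch below is an illustration under stated assumptions: the first measurement is taken to be a Lüders instrument, the observables are hypothetical qubit examples, and the function names are invented here. The disturbance of each effect is the operator norm of I_X^*(F(y)) − F(y), and the error is the operator norm of the difference between measured and target effects.

```python
import numpy as np

def op_norm(a):
    """Operator (spectral) norm."""
    return float(np.linalg.norm(a, ord=2))

def psd_sqrt(a):
    """Square root of a positive semidefinite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def luders_dual_channel(povm, a):
    """I_X^*(A) = sum_x sqrt(E(x)) A sqrt(E(x)) for the Lueders E-instrument."""
    return sum(s @ a @ s for s in (psd_sqrt(e) for e in povm))

def disturbance(povm_e, povm_f):
    """Largest || I_X^*(F(y)) - F(y) || over the effects of F."""
    return max(op_norm(luders_dual_channel(povm_e, f) - f) for f in povm_f)

def error(measured, target):
    """Largest || measured(x) - target(x) || over the outcomes x."""
    return max(op_norm(m - t) for m, t in zip(measured, target))

identity = np.eye(2)
sigma_z = np.diag([1.0, -1.0])
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

sharp_z = [0.5 * (identity + sigma_z), 0.5 * (identity - sigma_z)]
noisy_z = [0.5 * (identity + 0.6 * sigma_z), 0.5 * (identity - 0.6 * sigma_z)]
sharp_x = [0.5 * (identity + sigma_x), 0.5 * (identity - sigma_x)]

print(disturbance(sharp_z, sharp_x))  # non-commuting pair
print(disturbance(sharp_z, noisy_z))  # commuting pair
print(disturbance(noisy_z, sharp_x))  # unsharp first measurement
print(error(noisy_z, sharp_z))        # noisy_z taken as the measured observable
```

In this example the commuting pair is undisturbed, in line with E′ ⊂ F((I^L_X)*), while the unsharp first measurement disturbs the sharp x observable less than the sharp one does, at the price of a non-zero error relative to the sharp z target.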

Measurements of the first kind, and repeatable measurements
A special instance of a non-disturbing measurement is when an E-instrument I does not disturb E itself, i.e., when E ⊂ F(I_X^*). Such measurements are referred to as measurements of the first kind. A subclass of measurements of the first kind are those which are repeatable. Though repeatability is a standard assumption in many textbook treatments of quantum mechanics, that it is a property which a measurement may or may not enjoy appeared already in Wigner's 1952 contribution on the WAY theorem. However, within the general framework presented thus far, repeatability corresponds to a very special form of state change, possible only for a privileged class of observables: an observable E admits a repeatable measurement only if it is discrete [62], and all the effects have at least one eigenvector with eigenvalue 1 [64]. I is a repeatable E-instrument if
$$ \mathrm{tr}[E(y)\, I_x(\rho)] = \delta_{x,y}\, \mathrm{tr}[E(x)\rho] \quad \forall\, x, y \in X,\ \rho \in S(H_S), $$
which implies that
$$ I_x^*(E(y)) = \delta_{x,y}\, E(x) \quad \forall\, x, y \in X. $$
The above definition is equivalent to I_x^*(E(x)) = E(x) for all x ∈ X. In other words, if I is a repeatable instrument, then repeated measurements by I are guaranteed (with probability one) to produce the same result. It is straightforward to verify that if a measurement of E is repeatable, then it is also of the first kind, since
$$ I_X^*(E(x)) = \sum_{y \in X} I_y^*(E(x)) = I_x^*(E(x)) = E(x). $$
While the converse relation does not hold in general (a measurement can be of the first kind and not repeatable, such as is the case for a Lüders instrument compatible with a commutative but unsharp observable), in the special case of sharp observables repeatability and first-kindness coincide (Theorem 1 in Ref. [65]). In Appendix (E), we provide a series of results regarding the structure of repeatable instruments.
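The gap between first-kindness and repeatability can be seen in a two-line numerical check. The sketch below is illustrative only: it assumes the standard working criterion I_x^*(E(y)) = δ_{x,y}E(x) for repeatability, applies it to Lüders instruments, and uses hypothetical qubit observables.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def luders_dual(effect, a):
    """Dual Lueders operation A -> sqrt(E(x)) A sqrt(E(x))."""
    s = psd_sqrt(effect)
    return s @ a @ s

def is_repeatable(povm):
    """Check I_x^*(E(y)) = delta_{x,y} E(x) for the Lueders instrument of the POVM."""
    for x, ex in enumerate(povm):
        for y, ey in enumerate(povm):
            expected = ex if x == y else np.zeros_like(ex)
            if not np.allclose(luders_dual(ex, ey), expected):
                return False
    return True

def is_first_kind(povm):
    """Check I_X^*(E(y)) = E(y): each effect is a fixed point of the dual channel."""
    return all(np.allclose(sum(luders_dual(ex, ey) for ex in povm), ey) for ey in povm)

identity = np.eye(2)
sigma_z = np.diag([1.0, -1.0])
sharp_z = [0.5 * (identity + sigma_z), 0.5 * (identity - sigma_z)]
noisy_z = [0.5 * (identity + 0.6 * sigma_z), 0.5 * (identity - 0.6 * sigma_z)]

print(is_first_kind(sharp_z), is_repeatable(sharp_z))
print(is_first_kind(noisy_z), is_repeatable(noisy_z))
```

The commutative but unsharp observable passes the first-kind check and fails the repeatability check, which is the situation described in the text for Lüders instruments.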

Generalisation of the Wigner-Araki-Yanase theorem
The Wigner-Araki-Yanase (WAY) theorem is the classic result connecting measurement, conservation, and disturbance. This theorem was formulated by Araki and Yanase in 1960 [25], capturing in a fairly general setting an observation due to Wigner given in 1952 [23,24] regarding spin measurements in the presence of angular momentum conservation. The WAY theorem as formulated in Ref. [26] states that for any discrete sharp observable represented as a self-adjoint operator A not commuting with the system part of a (bounded, additive) conserved quantity, the measurement (described by a normal measurement scheme) cannot be repeatable and must violate the Yanase condition, i.e., the pointer observable of the apparatus must fail to commute with the apparatus part of the conserved quantity. In other words, if the Yanase condition is satisfied, then the measurement cannot be "accurate", in the sense that A is not measured by the scheme. But the WAY theorem does not rule out approximate measurements with approximate repeatability properties, where approximate measurement is understood to mean that the unsharp observable which is actually measured can be made statistically close to A. Therefore, WAY has both a strict impossibility part, along with the provision of conditions under which approximate measurements may be possible; as already hinted at by Wigner's original observation, and subsequently refined by Yanase [66] and Ozawa [67] in the form of quantitative bounds, a normal measurement scheme obeying a conservation law and the Yanase condition can achieve approximately accurate measurements for A only if the uncertainty of the apparatus preparation in the conserved quantity is large. While the WAY theorem has developed over the years, its full scope is still not known.
In particular, while much of the previous work around the WAY theorem has focused on the "measurability question" (that is, which observables cannot be measured, or can be only approximately measured, given the conservation law), the role of disturbance has been much less fully examined. Moreover, previous proofs of the WAY theorem concerned only sharp target observables, and were shown in the limited framework of normal measurement schemes, leaving open the question as to whether the implications of the theorem will carry over to the more general setting. In this section, we shall close this gap. We begin by first introducing two operational definitions of conservation laws for channels: full conservation, and the generally weaker notion of average conservation. Next, we obtain quantitative bounds for measurement error and measurement disturbance in the presence of conservation (whether average or full) in the general setting, i.e., without assuming sharpness of the measured observable, the sharpness of the pointer observable, the unitarity of the interaction or the purity of the apparatus preparation. These bounds allow us to prove a generalisation of WAY, which is presented as a single quantitative bound that contains both the possibility and impossibility statements of the theorem. A further generalisation of the WAY theorem is also provided, this time presented as an equality that serves to further strengthen the strict impossibility statement of the WAY theorem for observables that may be unsharp, but are "definite". The section concludes with a demonstration that by imposing conservation laws on pointer objectification in addition to the measurement interaction between system and apparatus, the measurability part of WAY can be recovered without the Yanase condition.

Measurement schemes in the presence of conservation laws
While every E-compatible instrument I admits some measurement scheme M := (H_A, ξ, E, Z), any restrictions imposed on the elements of M will in turn restrict the types of instruments that can be implemented, and hence the class of observables E that can be accurately measured, and the class of observables F that will be non-disturbed. One such restriction is given by conservation laws: for example, the interaction channel E between system and apparatus may be restricted so that the total energy, charge, or angular momentum must be conserved. Before investigating how conservation limits measurements, let us first consider two operational definitions of conservation laws for channels, where the conserved quantity N is always assumed to be a bounded self-adjoint operator. In the first analysis, a conservation law can be defined by equality of expectation values before and after the action of the channel, i.e., average conservation:
Definition 1 (Average conservation). A channel Φ : T(H) → T(H) conserves N on average if tr[NΦ(ρ)] = tr[Nρ] for all ρ ∈ S(H), i.e., N ∈ F(Φ*).
A stronger notion demands that all moments of N are preserved, i.e., full conservation:
Definition 2 (Full conservation). A channel Φ : T(H) → T(H) fully conserves N if tr[N^k Φ(ρ)] = tr[N^k ρ] for all ρ ∈ S(H) and all k ∈ ℕ, i.e., N^k ∈ F(Φ*) for all k ∈ ℕ.
As shown in Appendix (F), full conservation is in fact equivalent to just the first two moments being conserved, i.e., N^k ∈ F(Φ*) for k = 1, 2. Moreover, full conservation is also shown to be equivalent to "invariance" of the unitary group generated by N under the action of the channel, i.e., Φ*(e^{itN}) = e^{itN} for all t ∈ ℝ. We note that invariance implies (but is not equivalent to) "covariance", i.e., Φ*(e^{itN} A e^{−itN}) = e^{itN} Φ*(A) e^{−itN} for all t ∈ ℝ and A ∈ L(H). While full conservation trivially implies average conservation, it is shown there that in general a channel may conserve N on average but not fully. Indeed, it is possible for a channel to conserve N on average while not being covariant. Therefore, average conservation is generally a weaker form of conservation law, and is logically distinct from the concept of "symmetry" [68,69]. However, in the special case where Φ(·) := U · U* is a unitary channel, average and full conservation coincide, and are both equivalent to the commutation relation [U, N] = O. Since a normal measurement scheme uses a unitary interaction channel, it follows that in such cases there is no distinction to be drawn between the two notions of conservation law. But if a measurement scheme is not normal, i.e., if the interaction channel is non-unitary, then the distinction between average and full conservation will no longer be void and, as we shall see, leads to interesting consequences.
Throughout what follows, we shall only consider the case where the interaction channel E conserves a quantity N that is a bounded, additive, self-adjoint operator. That is,
$$ N = N_S \otimes 1_A + 1_S \otimes N_A, $$
where N_S and N_A are respectively bounded quantities of the system and apparatus alone.
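That average conservation can hold without full conservation is easy to exhibit numerically. The sketch below uses a hypothetical qutrit channel, an assumption made purely for illustration and not an example from the paper, with N = diag(−1, 0, 1): the level with N = 0 is sent to the levels N = ±1 with equal probability. Average conservation is tested as Φ*(N) = N, and full conservation through the first two moments, Φ*(N^k) = N^k for k = 1, 2.

```python
import numpy as np

def dual_channel(kraus, a):
    """Heisenberg-picture action Phi^*(A) = sum_i K_i^* A K_i."""
    return sum(k.conj().T @ a @ k for k in kraus)

def conserves_on_average(kraus, n):
    """Average conservation: N is a fixed point of Phi^*."""
    return np.allclose(dual_channel(kraus, n), n)

def conserves_fully(kraus, n):
    """Full conservation, tested on the first two moments of N."""
    return conserves_on_average(kraus, n) and np.allclose(dual_channel(kraus, n @ n), n @ n)

def ketbra(i, j, d=3):
    """Matrix unit |i><j| in dimension d."""
    m = np.zeros((d, d))
    m[i, j] = 1.0
    return m

n = np.diag([-1.0, 0.0, 1.0])  # basis index 0, 1, 2 carries N = -1, 0, +1

# Hypothetical non-unitary channel: the N = 0 level jumps to N = -1 or N = +1.
jump_kraus = [ketbra(0, 0), ketbra(2, 2),
              ketbra(0, 1) / np.sqrt(2), ketbra(2, 1) / np.sqrt(2)]

# A unitary commuting with N, for comparison.
phase_kraus = [np.diag(np.exp(1j * np.array([0.3, 1.1, -0.7])))]

print(conserves_on_average(jump_kraus, n), conserves_fully(jump_kraus, n))
print(conserves_on_average(phase_kraus, n), conserves_fully(phase_kraus, n))
```

The jump channel leaves the mean of N unchanged in every state but spreads its distribution, so the second moment is not preserved; the unitary channel commuting with N conserves both.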
Note that conservation of an additive N by the interaction channel E does not generally imply conservation of N_S by the channel I_X, since E may allow for an "exchange" of the conserved quantity between system and apparatus; specifically, average conservation of N by E implies that
$$ \mathrm{tr}[N_S\, I_X(\rho)] - \mathrm{tr}[N_S\, \rho] = \mathrm{tr}[N_A\, \xi] - \mathrm{tr}[N_A\, \Lambda(\rho)] $$
holds for all ρ ∈ S(H_S), where Λ is the conjugate channel of I_X defined in Eq. (5), so that Λ(ρ) is the state of the apparatus after the measurement interaction. We can see that it is possible for the expected value of N_S to increase (decrease), provided that the expected value of N_A decreases (increases) by an equal amount. Indeed, such a "compensation" by the measuring apparatus is in general necessary for the instrument I to accurately measure some observable E not commuting with the conserved quantity: if E is a sharp observable and the E-channel I_X conserves N_S on average, by item (i) of Lemma C.1 it holds that E must commute with N_S. Additionally, if I_X fully conserves N_S then by item (iii) of Lemma C.1 E must commute with N_S, even when E is unsharp.
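The exchange of the conserved quantity between system and apparatus can be checked in a two-qubit toy model. The sketch below is illustrative and rests on assumptions not found in the paper: N_S = N_A = σ_z, and a partial-swap unitary exp(−iθ·SWAP), which commutes with N = N_S ⊗ 1_A + 1_S ⊗ N_A. It verifies that the gain in tr[N_S ·] on the system equals the loss in tr[N_A ·] on the apparatus.

```python
import numpy as np

def partial_trace_apparatus(m, d_s, d_a):
    """tr_A of an operator on H_S (x) H_A (system is the first factor)."""
    return np.einsum('iaja->ij', m.reshape(d_s, d_a, d_s, d_a))

def partial_trace_system(m, d_s, d_a):
    """tr_S of an operator on H_S (x) H_A."""
    return np.einsum('iaib->ab', m.reshape(d_s, d_a, d_s, d_a))

theta = 0.3
sigma_z = np.diag([1.0, -1.0]).astype(complex)
one = np.eye(2, dtype=complex)
swap = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)

# Partial swap exp(-i theta SWAP) = cos(theta) 1 - i sin(theta) SWAP, since SWAP^2 = 1.
u = np.cos(theta) * np.eye(4) - 1j * np.sin(theta) * swap
total_n = np.kron(sigma_z, one) + np.kron(one, sigma_z)
assert np.allclose(u @ total_n, total_n @ u)  # the interaction conserves N

rho = np.diag([1.0, 0.0]).astype(complex)  # system in |0>, N_S = +1
xi = np.diag([0.0, 1.0]).astype(complex)   # apparatus in |1>, N_A = -1

evolved = u @ np.kron(rho, xi) @ u.conj().T
system_out = partial_trace_apparatus(evolved, 2, 2)  # I_X(rho)
apparatus_out = partial_trace_system(evolved, 2, 2)  # Lambda(rho)

system_gain = (np.trace(sigma_z @ system_out) - np.trace(sigma_z @ rho)).real
apparatus_loss = (np.trace(sigma_z @ xi) - np.trace(sigma_z @ apparatus_out)).real
print(system_gain, apparatus_loss)
```

Here the system's expected value of σ_z drops by 1 − cos 2θ and the apparatus absorbs exactly that amount, so N_S alone is not conserved by I_X even though N is conserved by the interaction.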

Measurement error and disturbance under conservation laws
Consider the case where the system is measured in succession, but where the first measurement is constrained by a conservation law, as shown in Fig. 4. We now present our first main result, providing quantitative bounds for the error of the first measurement in achieving the desired target observable E, and for the disturbance caused by the first measurement on the second observable F. These bounds will be used to obtain our generalisation of the WAY theorem in the sequel. . Then for all x ∈ X it holds that where Γ E ξ is the channel defined in Eq. (3), and Λ is the conjugate channel to I X defined in Eq. (5). Let ∥δ(y)∥ be the disturbance of the effects of an observable F := {F(y) : y ∈ Y} caused by I, as defined in Eq. (8). Then for all y ∈ Y it holds that The proof for the error bound Eq. (11) is provided in Appendix (G), and the proof for the disturbance bound Eq. (12) is given in Appendix (H). In Appendix (D), we also provide similar bounds for disturbance that are independent of conservation, but take into account the commutation between F and the observable that is measured by I. Note that while the upper bounds of both inequalities above (the terms on the right hand side) are structurally the same, the lower bounds (the terms on the left hand side) are not. Specifically, while the constraints on measurement error depend on the choice of pointer observable, the constraints on disturbance are independent of this. It follows that the implications of these inequalities differ markedly. We shall illustrate this by considering when the inequalities impose no constraints, i.e., when the lower bounds vanish.
In such a case, Λ * (A) = A for all A, and so by choosing the pointer observable so that Z = E, we obtain ϵ(x) = Λ * (E(x)) − E(x) = O, and so all target observables are measurable. This is perfectly consistent with Eq. (11), since in such a case Λ * ([E(x), N S ]) = [E(x), N S ], and so the lower bound vanishes. But note that if M is trivial, then the instrument that it implements is also trivial, i.e., it will hold that I * X (·) = tr[·ξ]1 S . In such a case, we have δ(y) = tr[F(y)ξ]1 S − F(y), so all non-trivial observables will be disturbed. Let us now consider Eq. (12). This will not impose any constraints on non-disturbance for F, i.e., δ = 0, if F commutes with the conserved quantity. This is because by complete positivity, it holds that , so that the lower bound of Eq. (12) will vanish and non-disturbance will not be ruled out for F, whether it commutes with N S or not. But by item (iii) of Lemma C.1 such an instrument I will accurately measure E only if [E, N S ] = O. Indeed, as a result of the above arguments, and as shown in Corollary H.2, if the measurement scheme M implements the Lüders instrument I L compatible with an observable E commuting with N S , then non-disturbance will not be ruled out for any observable F that commutes with E. Notwithstanding the special cases discussed above, when E and F do not commute with the conserved quantity, the lower bounds in Theorem 3.1 will not vanish in general, in which case the upper bounds must also not vanish. It follows that a large value of ∥Γ E ξ (N 2 ) − Γ E ξ (N ) 2 ∥ is a necessary condition for achieving an arbitrarily small measurement error for E when [E(x), and an arbitrarily small disturbance for F when [F(y), . If the error and disturbance are to be exactly zero, then E and F must also be unsharp. 
The term ∥Γ^E_ξ(N²) − Γ^E_ξ(N)²∥ is clearly dependent on the choice of apparatus preparation ξ and, as we show below, under the stronger constraint of a full conservation law this quantity obtains a clearer interpretation as the uncertainty of N_A in the apparatus preparation, as quantified by the variance.
Lemma 3.1. If the channel E fully conserves an additive quantity N = N_S ⊗ 1_A + 1_S ⊗ N_A, then
$$ \| \Gamma^E_\xi(N^2) - \Gamma^E_\xi(N)^2 \| = \mathrm{Var}(N_A, \xi) := \mathrm{tr}[N_A^2\, \xi] - \mathrm{tr}[N_A\, \xi]^2. $$
Proof. If N is fully conserved by E, then by Definition 2 we have E*(N^k) = N^k for k = 1, 2. It follows that Γ^E_ξ(N^k) = Γ_ξ ∘ E*(N^k) = Γ_ξ(N^k) for k = 1, 2. Recall that the restriction map satisfies Γ_ξ(A ⊗ B) = tr[Bξ]A for all A ∈ L(H_S) and B ∈ L(H_A). It follows that Γ_ξ(N) = N_S + tr[N_A ξ]1_S and Γ_ξ(N²) = N_S² + 2tr[N_A ξ]N_S + tr[N_A² ξ]1_S, so that Γ_ξ(N²) − Γ_ξ(N)² = Var(N_A, ξ)1_S.
If ξ is a pure state, then a large variance implies a large coherence. This is because Var(N_A, ξ) = 0 ⇐⇒ [N_A, ξ] = O for pure states. Of course, if ξ is a mixed state then it may still be the case that Var(N_A, ξ) is large even if ξ commutes with N_A, and hence has zero coherence in the conserved quantity. A quantifier of coherence (or asymmetry) for general states is given by the quantum Fisher information [70][71][72][73], which is equal to four times the convex roof of the variance [74,75]. Let {q_i, ϕ_i} be an arbitrary ensemble of (not necessarily orthogonal) unit vectors ϕ_i ∈ H_A, with {q_i} a probability distribution. The quantum Fisher information of N_A in ξ can be written as
$$ Q(N_A, \xi) := 4 \min_{\{q_i, \phi_i\}} \Big\{ \sum_i q_i\, \mathrm{Var}(N_A, \phi_i) : \sum_i q_i P_{\phi_i} = \xi \Big\}. $$
Here, P_ψ ≡ |ψ⟩⟨ψ| denotes the projection on ψ, and we use the short-hand notation Var(N_A, ψ) ≡ Var(N_A, P_ψ). The following demonstrates that a large coherence of the conserved quantity in the initial state of the apparatus, when such a state may be mixed, is a necessary condition for accurate and non-disturbing measurements in the presence of a full conservation law.
Proposition 3.1. Consider again the setup of Theorem 3.1, and assume that the interaction channel E also fully conserves N = N_S ⊗ 1_A + 1_S ⊗ N_A. Then for all x ∈ X it also holds that and for all y ∈ Y it also holds that The proof for Eq.
, respectively, independently of the coherence in the apparatus preparation. This is because when the observables are sharp and there is zero error for E or zero disturbance for F, while the upper bounds in Proposition 3.1 may be large, the upper bounds in Theorem 3.1 vanish. Notwithstanding, we see that so long as the apparatus preparation has a large coherence, approximately accurate measurements for E and approximate non-disturbance for F will not be ruled out, even when these observables are sharp.
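The distinction drawn above between a large variance and a large coherence can be checked numerically. The following is a minimal sketch, assuming a hypothetical qubit apparatus with N_A = diag(0, 1); the three preparations and all helper names are illustrative choices rather than anything taken from the text. It compares a pure eigenstate, a pure superposition, and the maximally mixed state.

```python
# Pure-Python 2x2 helpers, so the sketch has no third-party dependencies.
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def variance(n_a, xi):
    """Var(N_A, xi) = tr[N_A^2 xi] - tr[N_A xi]^2."""
    mean = trace(matmul(n_a, xi))
    second_moment = trace(matmul(matmul(n_a, n_a), xi))
    return second_moment - mean * mean

def commutator_size(n_a, xi):
    """Largest absolute entry of [N_A, xi]; zero iff the two commute."""
    ab, ba = matmul(n_a, xi), matmul(xi, n_a)
    n = len(n_a)
    return max(abs(ab[i][j] - ba[i][j]) for i in range(n) for j in range(n))

# Hypothetical apparatus quantity with eigenvalues 0 and 1 (an assumption).
N_A = [[0.0, 0.0], [0.0, 1.0]]

eigenstate = [[1.0, 0.0], [0.0, 0.0]]      # pure eigenstate of N_A
superposition = [[0.5, 0.5], [0.5, 0.5]]   # pure state (|0> + |1>)/sqrt(2)
mixed = [[0.5, 0.0], [0.0, 0.5]]           # maximally mixed state

for name, xi in [("eigenstate", eigenstate),
                 ("superposition", superposition),
                 ("mixed", mixed)]:
    print(name, variance(N_A, xi), commutator_size(N_A, xi))
```

In this toy case the superposition and the maximally mixed state have the same variance, but only the superposition fails to commute with N_A. This is why a coherence quantifier such as the quantum Fisher information, rather than the variance alone, is needed for mixed preparations.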

The generalised Wigner-Araki-Yanase theorem
We are now ready to give two formulations of the WAY theorem which go beyond existing work in several respects. The first formulation is a direct consequence of the quantitative bounds given above.
where Γ E ξ is the channel defined in Eq. (3). If E also fully conserves N , and if either I is repeatable or the Yanase condition is satisfied, then for all x ∈ X it also holds that where Proof. We first prove Eq. (16). By Theorem 3.1, and setting ϵ = 0, if M is a measurement scheme for E then it must hold that If the Yanase condition is satisfied, then Λ * ([Z(x), N A ]) = Λ * (O) = O, and so we obtain Eq. (16). Now let Z be an arbitrary pointer observable, but assume that I is a repeatable E-instrument. Recall that repeatability implies first-kindness, which is a specific instance of non-disturbance. Then by Theorem 3.1, identifying F with E, and setting δ = 0, it must hold that By item (vi) of Proposition E.1, if I is a repeatable measurement of E, then I * X (E(x)A) = I * X (AE(x)) for all A ∈ L(H S ). It follows that I * X ([E(x), N S ]) = O, and so once again we obtain Eq. (16). Indeed, let us note that if the measurement is repeatable, then for any choice of pointer observable Z it will also hold that Λ * (Z(x)B) = Λ * (BZ(x)) for all B ∈ L(H A ), and so Λ * ([Z(x), N A ]) = O even if Z violates the Yanase condition. It follows that we can obtain Eq. (16) under the repeatability assumption directly from the measurability bound. By the same arguments as above, Eq. (17) is obtained from Proposition 3.1.
This theorem goes beyond the original WAY theorem (and its descendants) in the setting of bounded conserved quantities in the following respects: it holds for general interaction channels, unsharp target observables, unsharp pointer observables, and mixed apparatus states. It also provides an operationally motivated quantitative bound from which the original theorem can be obtained as a special case: if E is a sharp observable, the upper bound of Eq. (16) vanishes, in which case an additive conservation law together with either repeatability or the Yanase condition necessitates commutation of E with the system part of the conserved quantity. Note that the impossibility statement of the original WAY theorem holds even under the weaker notion of average conservation. This shows that the impossibility of perfect measurements, for sharp observables not commuting with the conserved quantity, holds in much broader contexts than previously assumed. Indeed, such a constraint holds even when the measurement is not constrained by "symmetry"; recall that while full conservation of N by the interaction channel E implies that this channel is covariant with respect to unitary evolution generated by N (in fact, it is also invariant), it may be the case that the channel conserves N on average without being covariant. Theorem 3.2 does not rule out accurate (under the Yanase condition) or repeatable measurements for unsharp observables not commuting with the conserved quantity, given an appropriate apparatus preparation: in the special case of a full conservation law, the apparatus preparation must have a large coherence in the conserved quantity. However, this does not imply that coherence allows for accurate or repeatable measurements of all unsharp observables. We now present a further generalisation of the WAY theorem, providing additional necessary conditions for perfect measurements that are independent of the apparatus preparation.

then for any effect E(x) that has both eigenvalue 1 and 0, it holds that
where P := P_0(x) + P_1(x), with P_0(x) and P_1(x) the orthogonal projections onto the eigenvalue-0 and eigenvalue-1 eigenspaces of E(x), respectively. For a proof, see Appendix (I). The first equality in Eq. (18) follows from the fact that E(x)P = PE(x) = P_1(x), so that E(x) commutes with P, and the second equality states that while the commutator [E(x), N_S] may not vanish entirely, it does vanish when projected onto the subspace P H_S. The above theorem is an even stronger extension of the original WAY theorem, as it relaxes the repeatability condition to that of first-kindness; recall that while a measurement that is repeatable is also of the first kind, repeatability and first-kindness coincide only for sharp observables, and a measurement of an unsharp observable may be of the first kind but not repeatable. Moreover, note that an observable admits a repeatable measurement only if all effects have eigenvalue 1 which, by normalisation, implies that all effects have both eigenvalue 1 and 0. In such a case, Eq. (18) applies to every effect, and P may be interpreted as the projection onto the union of eigenvalue-1 eigenspaces of all the effects of E, i.e., P = Σ_{x∈X} P_1(x). Finally, note that if E is sharp, then P = 1_S, in which case the original WAY theorem is once again recovered. The condition of an effect E(x) having both eigenvalue 1 and 0 implies that the effect is definite, or admits definite values. Specifically, such a condition implies that there exist states ρ for which outcome x can be predicted to obtain with probabilistic certainty, i.e., tr[E(x)ρ] = 1, and that there exist states σ for which outcome x can be predicted to not obtain with probabilistic certainty, i.e., tr[E(x)σ] = 0. Therefore, a measurement of such an E allows for perfect distinguishability of states ρ and σ; if outcome x is observed, we know with probabilistic certainty that the system was not prepared in state σ.
Conversely, if any outcome y ≠ x is observed, we know with probabilistic certainty that the system was not prepared in state ρ. Theorem 3.3 therefore demonstrates that the impossibility part of the WAY theorem, originally pertaining to sharpness, is more properly understood as concerning observables with definite values, even if unsharp. That is to say, if we wish to achieve perfectly accurate (under the Yanase condition) or first-kind measurements of an observable that truly does not commute with the conserved quantity, i.e., such that [E(x), N_S] does not vanish even when projected onto a subspace of H_S, then not only must such an observable be unsharp, but it must also not admit definiteness. Theorem 3.3 imposes stronger constraints than Theorem 3.2 and demonstrates that in general, and irrespective of the apparatus preparation, there exist unsharp observables not commuting with the conserved quantity that do not admit a repeatable or first-kind measurement, and which cannot be accurately measured if the Yanase condition holds. To illustrate this, let us introduce the following model. Consider a system H_S ≃ C^3 with the orthonormal basis {|−1⟩, |0⟩, |1⟩}, and the conserved quantity N_S = Σ_n n|n⟩⟨n| ≡ |1⟩⟨1| − |−1⟩⟨−1|. Consider also the class of binary observables E_λ := {E_λ(+), E_λ(−)} with effects E_λ(±) = λ|±⟩⟨±| + (1 − λ)|∓⟩⟨∓| + (1/2)|0⟩⟨0|, (19) where 1/2 < λ ⩽ 1 and |±⟩ := (1/√2)(|1⟩ ± |−1⟩). In the absence of any constraints, all observables in this class admit first-kind measurements. For example, since E_λ is commutative, the corresponding Lüders instrument is a first-kind (but not repeatable) measurement of E_λ. On the other hand, E_λ admits a repeatable measurement if and only if λ = 1, in which case an instrument with operations I_±(·) = tr[E_λ(±)·]|±⟩⟨±| is a repeatable measurement of E_λ. Since E_λ is unsharp, even when λ = 1, Theorem 3.2 does not rule out accurate or repeatable measurements for such an observable, provided an appropriate apparatus preparation.
But now note that when λ = 1, both effects have eigenvalue 1 and 0, and we have P_1(±) = |±⟩⟨±| and P_0(±) = |∓⟩⟨∓|, and so P = P_0(±) + P_1(±) = |1⟩⟨1| + |−1⟩⟨−1|. It is easily verified that in such a case, P[E_λ(±), N_S]P ≠ O. By Theorem 3.3, it follows that when λ = 1, a measurement of E_λ that is constrained by an average conservation law cannot be repeatable or even first-kind, and must violate the Yanase condition. However, the effects of E_λ when λ < 1 do not commute with the conserved quantity, and have neither eigenvalue 1 nor eigenvalue 0. In such a case, Theorem 3.3 does not rule out accurate or first-kind (but not repeatable) measurements.
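The projected commutator in this model can be evaluated directly. The sketch below assumes that the effects take the form E_λ(±) = λ|±⟩⟨±| + (1 − λ)|∓⟩⟨∓| + (1/2)|0⟩⟨0|, which is consistent with the properties stated for the model; the basis ordering (|−1⟩, |0⟩, |1⟩) and all function names are hypothetical choices made for illustration.

```python
import math

def outer(u, v):
    """|u><v| for real vectors u and v."""
    return [[a * b for b in v] for a in u]

def add(*mats):
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

def max_abs(m):
    return max(abs(x) for row in m for x in row)

S = 1.0 / math.sqrt(2.0)
KET = {"+": [S, 0.0, S], "-": [-S, 0.0, S]}   # |+-> = (|1> +- |-1>)/sqrt(2)
ZERO = [0.0, 1.0, 0.0]                         # |0>
N_S = [[-1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
P = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # |1><1| + |-1><-1|

def effect(lam, sign):
    """Assumed form of E_lambda(sign); see the lead-in."""
    other = "-" if sign == "+" else "+"
    return add(scale(lam, outer(KET[sign], KET[sign])),
               scale(1.0 - lam, outer(KET[other], KET[other])),
               scale(0.5, outer(ZERO, ZERO)))

def projected_commutator_size(lam, sign="+"):
    """Largest absolute entry of P [E_lambda(sign), N_S] P."""
    c = commutator(effect(lam, sign), N_S)
    return max_abs(matmul(matmul(P, c), P))

print(projected_commutator_size(1.0))    # non-zero: Theorem 3.3 applies
print(projected_commutator_size(0.75))   # also non-zero, but see the note below
```

Under the assumed form, the projected commutator has size |2λ − 1|, so it is non-zero for every λ > 1/2. Theorem 3.3 only constrains the case λ = 1, since for λ < 1 the effects have neither eigenvalue 1 nor eigenvalue 0.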

The Wigner-Araki-Yanase theorem without the Yanase condition
Traditionally, the Yanase condition is justified by applying the repeatability part of the WAY theorem to the pointer observable; if the pointer observable is sharp, and we consider its measurement as being implemented by a conservative interaction between one measuring apparatus and another, then the pointer observable will admit a repeatable measurement only if it commutes with the conserved quantity. Repeatability of the measurement of the pointer observable is deemed a natural requirement for the possibility of measurement, since an experimenter should be able to confirm the measurement outcome by repeated observations of the apparatus: there must be a stable record of the measurement outcomes. However, such an argument suffers from two drawbacks. Firstly, it applies only to sharp pointer observables. Secondly, it runs into the problem of infinite regress, since we have now shifted the role of the ultimate pointer observable from the first apparatus to the second; repeatability of the first pointer observable can be abandoned if the second admits a repeatable measurement, in which case the experimenter may continue to verify the measurement outcomes. In Appendix (J), we show that the measurability part of the WAY theorem (Theorem 3.2 and Theorem 3.3) can be justified without an appeal to the Yanase condition, but rather by imposing a conservation law on the total measurement process, i.e., including pointer objectification. Such conservation is shown to give rise to the so-called "weak" Yanase condition, which is stated in terms of the "Heisenberg-evolved" pointer observable [32]. Subsequently, it is shown that if the weak Yanase condition is satisfied, then Eq. (17), and Eq. (18) will hold.

Fixed points and non-disturbing measurements in the presence of conservation laws
In Sec. 3 we provided general quantitative bounds for measurement disturbance, when observables E and F are measured in succession and when the first measurement is subject to a conservation law. As we saw, these bounds generally do not prohibit non-disturbance for an unsharp F that does not commute with the conserved quantity. In this section, we provide tighter restrictions on the possibility of non-disturbance that depend on the structure of the fixed-point set F(I * X ), which intimately depends on the properties of observable E and the states that are left invariant by the E-channel I X . We first consider the case where F(I * X ) is a von Neumann algebra, which is guaranteed to be the case when F(I X ) contains a faithful state. Next, we relax the faithfulness condition on the states in F(I X ), and obtain similar restrictions for non-disturbance in the finite-dimensional setting. Finally, we show that in the finite-dimensional case, the first-kindness statement of our generalisation of WAY in Theorem 3.3 can be extended to a quantitative bound.

Non-disturbance and von Neumann algebras
By Theorem 3.1, when a sharp observable F is not disturbed, then the upper bound of Eq. (12) vanishes, implying that non-disturbance is possible only if [F(y), N S ] ∈ F(I * X ). But in Appendix (H) we provide a tighter upper bound than that of Eq. (12), which vanishes if both F ⊂ F(I * X ) and F 2 := {F(y) 2 : y ∈ Y} ⊂ F(I * X ) hold, in which case non-disturbance will be possible only if [F(y), N S ] ∈ F(I * X ). While non-disturbance of F trivially implies that F 2 ⊂ F(I * X ) when F is sharp, the implication F ⊂ F(I * X ) =⇒ F 2 ⊂ F(I * X ) holds for all observables whenever F(I * X ) is a von Neumann algebra, which is guaranteed to be the case whenever F(I X ) contains at least one faithful state. We now show that in the presence of conservation laws, if the fixed-point set of the measurement channel I * X is a von Neumann algebra, there are strong constraints imposed on the possibility of non-disturbance.
is a von Neumann algebra, then the following hold: We note that the condition [E, F] = O in item (i) is independent of conservation, and was already shown in Ref. [4]. The proof of the above theorem is given in Appendix (L) (Theorem L.1), and here is a rough sketch for But if the measurement obeys a conservation law, and F ⊂ F(I * X ), then additionally it holds that [F(y), N S ] ∈ F(I * X ), which by the multiplicability theorem implies that [F(y), ) ⊂ E ′ when the fixed-point set of the measurement channel is an algebra, whenever E does not commute with N S then F must also commute with ∆N S ̸ = O, which is not in general guaranteed by commutation of F with N S . Of course, unless E = F, non-disturbance may be possible even if neither E nor F commute with N S . But by items (ii) and (iii) of the above theorem, when E = F, i.e., when I is a first-kind or repeatable measurement of E, then non-disturbance is possible only if E commutes with N S . Indeed, we see that when the fixed-point set of the measurement channel is an algebra, then the constraints on repeatability and first-kindness are much stronger than in the more general case as given by Theorem 3.2 and Theorem 3.3. We may therefore strengthen the necessary conditions for repeatability and first-kindness that are given by the WAY theorem with the following: in the presence of a conservation law, an unsharp and possibly non-commutative observable E not commuting with the conserved quantity admits a repeatable or first-kind measurement only if the E-channel I X perturbs all faithful states. Let us now consider some interesting consequences of the above theorem. As shown by Proposition 6 in Ref. [4], when the system is a qubit, i.e., dim(H S ) = 2, then F(I * X ) is an algebra for any instrument I. See also Corollary M.1 in Appendix (M.2). It follows that for qubits, the implications of Theorem 4.1 will hold in general. 
Now let us assume that E is a binary observable with the effects E(±) = (1/2)(1_S ± λσ_1), where λ ∈ (0, 1] and σ_1, σ_2, σ_3 are the Pauli operators. This observable is sharp when λ = 1, and is unsharp when λ < 1. Since binary observables are commutative, item (ii) of Theorem 4.1 will permit a first-kind measurement of E so long as the conserved quantity commutes with σ_1. On the other hand, by item (iii) repeatability will be allowed only if λ = 1 also holds. This is not so surprising since repeatability is permitted only when all effects have eigenvalue 1, with such a condition being satisfied for qubit observables only when the observable is sharp. Now let us assume that the conserved quantity is N_S = σ_3, so that it does not commute with E, which implies that repeatability and first-kindness will be ruled out. But can a measurement of E not disturb some other observable? Note that [E(±), σ_3] = ∓λiσ_2. By item (i), a non-trivial observable F will be non-disturbed only if it commutes with σ_1 and with σ_2, which is clearly impossible. If E is commutative, then it holds that F(I^L*_X) = E′ is a von Neumann algebra (as the commutant of a self-adjoint subset of L(H_S)), even in infinite dimensions [76,77]. But recall that the Lüders instrument is a first-kind measurement if it is compatible with a commutative observable. By item (ii) of Theorem 4.1 it follows that in the presence of a conservation law, a commutative E admits a Lüders instrument only if E commutes with the system part of the conserved quantity. Now let us consider an observable E that may be non-commutative, but commutes with N_S. Note that in this case, unless dim(H_S) < ∞, then F(I^L*_X) is not necessarily an algebra, since in infinite dimensions there exist non-commutative observables E such that F(I L That is, the Lüders measurement of E commuting with N_S will fully conserve N_S.
Recall that in such a case, non-disturbance will not be ruled out for any observable F that commutes with E (see Corollary H.2). Indeed, it will hold that ∆N S = O, and so any F will trivially commute with ∆N S . But does an observable E commuting with N S always admit a Lüders measurement in the presence of a conservation law? We shall now show that in the presence of a full conservation law, and where the apparatus part of the conserved quantity is highly non-degenerate, such measurements will require a large coherence in the apparatus preparation. This surprising observation can be seen as a "converse" WAY theorem.
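Before moving on, the commutator used in the qubit example above can be verified numerically. This is a minimal sketch, assuming the effects have the form E(±) = (1/2)(1 ± λσ_1) and that N_S = σ_3; the value of λ and the helper names are arbitrary illustrative choices. The final check is only a spot check over the Pauli operators, not a proof that no non-trivial operator commutes with both σ_1 and σ_2.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def max_abs(m):
    return max(abs(x) for row in m for x in row)

IDENTITY = [[1, 0], [0, 1]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]
SIGMA_3 = [[1, 0], [0, -1]]
PAULIS = {"sigma_1": SIGMA_1, "sigma_2": SIGMA_2, "sigma_3": SIGMA_3}

def effect(lam, sign):
    """Assumed form E(sign) = (1 + sign * lam * sigma_1) / 2, sign = +1 or -1."""
    return [[0.5 * (IDENTITY[i][j] + sign * lam * SIGMA_1[i][j])
             for j in range(2)] for i in range(2)]

def commutator_matches(lam, sign):
    """Check [E(sign), sigma_3] == -sign * lam * i * sigma_2 entrywise."""
    lhs = commutator(effect(lam, sign), SIGMA_3)
    rhs = [[-sign * lam * 1j * SIGMA_2[i][j] for j in range(2)]
           for i in range(2)]
    diff = [[lhs[i][j] - rhs[i][j] for j in range(2)] for i in range(2)]
    return max_abs(diff) < 1e-12

def commutes(a, b):
    return max_abs(commutator(a, b)) < 1e-12

lam = 0.6  # arbitrary unsharpness parameter in (0, 1]
print(commutator_matches(lam, +1), commutator_matches(lam, -1))

# Spot check: which Pauli operators commute with both sigma_1 and sigma_2?
both = [name for name, s in PAULIS.items()
        if commutes(s, SIGMA_1) and commutes(s, SIGMA_2)]
print(both)
```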
and that I X fully conserves N S . Define the subspace of H A that is involved during the measurement process as where Λ is the conjugate channel to I X defined in Eq. (5).
Moreover, let us note that the Lüders E-instrument is extremal whenever the effects of E are linearly independent [80]. The proof of the above proposition is provided in Appendix (K), and here we present a rough sketch. In the case that the interaction channel E fully conserves N and I_X fully conserves N_S, both the expected value and the variance of N_A must not change as a result of the measurement interaction. That is, It follows that if ξ is an eigenstate of N_A, i.e., if ξ has support only in a single degenerate eigenspace of N_A, then Λ(ρ) must live in the same eigenspace for all ρ. That is, N_A must be "effectively" fully degenerate, in the sense that H_A(meas) must be contained within a single degenerate eigenspace of N_A. This generalises an observation made in Ref. [81], which held only in the case of Lüders measurements of sharp observables, implemented by normal measurement schemes satisfying the Yanase condition. But in many physically relevant situations N_A will not be (effectively) fully degenerate; for example, the apparatus may be a system with a conserved quantity N_A that is completely non-degenerate. In such cases, when the interaction between system and apparatus obeys a full conservation law, an instrument that fully conserves N_S can be implemented only if the apparatus preparation is not an eigenstate of N_A, which implies that Var(N_A, ξ) must be large. Finally, if the instrument is extremal, and N_A is not (effectively) fully degenerate, then for every pure state decomposition ξ = Σ_i q_i P_{ϕ_i}, the uncertainty of N_A in ϕ_i must be large, which implies that the apparatus preparation must have a large coherence as quantified by the quantum Fisher information.

Non-disturbance and operator spaces
Due to the Schauder-Tychonoff fixed point theorem [82], all E-channels I X have at least one fixed state. However, it may be that none of these are faithful. In such a case, the fixed-point set of the dual channel I * X is not necessarily a von Neumann algebra, but rather forms an operator space [83]. This setting has been much less investigated, and in Appendix (M.1) we provide some novel analysis of the structure of such fixed-point sets. While the discussion thus far has been applicable for infinite-dimensional systems-except in some examples-in this section we shall always assume that dim(H S ) < ∞. We define the minimal support projection P on the fixed-point set F(I X ) as In other words, for all projections Q and fixed states ρ ∈ F(I X ) such that ρ = QρQ, it holds that Q ⩾ P . Note that P = 1 S if and only if F(I X ) contains a faithful state, in which case F(I * X ) is an algebra, so that we recover the results of Theorem 4.1. We now provide a generalisation of this result which accounts for situations where P may be smaller than the identity, i.e., where the E-channel I X may perturb all faithful states. Here, we define P EP := {P E(x)P : x ∈ X } and P FP := {P F(y)P : y ∈ Y} as restrictions of observables E and F in P H S . Theorem 4.2. Let E := {E(x) : x ∈ X } and F := {F(y) : y ∈ Y} be observables acting in H S . Let M := (H A , ξ, E, Z) be a measurement scheme for an E-instrument I, and assume that E conserves an additive quantity If P is the minimal support projection on F(I X ), then the following hold: The proof is provided in Appendix (M.2) (Theorem M.1) and it follows from similar arguments as those used in Theorem 4.1. That is, by noting that there exists a faithful fixed state in the subspace P H S , we observe that the projection of the fixed-point set F(I * X ) onto the subspace P H S is a von Neumann algebra. As a simple example, let us consider the case where E is measured by a nuclear instrument. 
The operations of a nuclear instrument I are written as I x (·) = tr[E(x)·]σ x , where {σ x } is a family of states. It is simple to verify that in such a case, P is the minimal projection on ∪ x supp(σ x ). Additionally, if E is a norm-1 observable, and for every x the support of σ x is contained within the eigenvalue-1 eigenspace of E(x), then such an instrument will be a repeatable measurement of E. Every observable admits a nuclear instrument and, as shown in Corollary 1 of Ref. [4], every instrument compatible with a rank-1 observable is nuclear. Now assume that I does not disturb some observable F. Since the dual E-channel may be written as I * That is, non-disturbance is possible only if F is a classical post-processing of E. But note that unless E is commutative, this does not generally imply that F must commute with E. However, item (i) of Theorem 4.2 states that P FP must commute with P EP and with P ∆N S P . Given that P it follows that only observables F that commute with both the measured observable E, and with the system part of the conserved quantity N S , will be non-disturbed. While the implications of the above theorem depend on the minimal support projection P on the fixed states of the measurement channel I X , and hence on the specific measurement implementation, we may use the structure of the fixed-point set to obtain necessary conditions for first-kindness that depend only on the measured observable. In Appendix (M.3), we provide some necessary conditions for non-disturbance that are independent of conservation laws, showing that non-disturbance is intimately related to distinguishability. In particular, we show that if I is a first-kind measurement of E, then this observable must be a classical post-processing of a norm-1 observable G ⊂ F(I * X ), and that there exists a family of states {ρ z } that are perfectly distinguishable by a measurement of G such that {I X (ρ z )} remain perfectly distinguishable. 
Next, we use these results to obtain a quantitative form of the WAY theorem for first-kindness, presented below. Theorem 4.3. Consider a measurement scheme M := (H A , ξ, E, Z) for a nontrivial observable E with the instrument I acting in H S . Assume that I is a measurement of the first kind, and that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ). For each outcome x associated with a non-trivial effect E(x), let K max (x) and K min (x) be subspaces of H S defined by K max (x) and K min (x) are orthogonal, and for all unit vectors ψ ∈ K max (x) and ϕ ∈ K min (x), it holds that  4). This raises an interesting question: will Eq. (21) also hold if we abandon the requirement of first-kindness, and instead assume that the measurement satisfies the Yanase condition? This question is beyond the scope of the present paper, but the answer may highlight to what extent the necessary conditions for measurability and non-disturbance will continue to satisfy the "symmetry" witnessed so far in WAY-type theorems. To demonstrate that Theorem 4.3 provides much stronger constraints than Theorem 3.3, let us consider again the simple model of a binary observable E λ := {E λ (+), E λ (−)} acting in H S ≃ C 3 introduced surrounding Eq. (19). Recall that Theorem 3.3 did not rule out first-kind measurements of E λ for any 1/2 < λ < 1. But now note that K max (±) = span{|±⟩}, K min (±) = span{|∓⟩}, ∥E λ (±)∥ = ∥1 S − E λ (±)∥ = λ, ∥N S ∥ = 1, and |⟨±|N S |∓⟩| = 1. By Theorem 4.3, it follows that such an observable admits a first-kind measurement only if which cannot be satisfied for any 1/2 < λ < 1; indeed, the above inequality is satisfied only if λ = 1/2, in which case E λ (±) = 1 S /2 are trivial effects.
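The quantities quoted for the C^3 model can be confirmed with a short computation. The sketch below assumes the effects have the form E_λ(±) = λ|±⟩⟨±| + (1 − λ)|∓⟩⟨∓| + (1/2)|0⟩⟨0| with N_S = |1⟩⟨1| − |−1⟩⟨−1|; the basis ordering and function names are hypothetical. It checks that |±⟩ and |∓⟩ are eigenvectors of E_λ(±) with eigenvalues λ and 1 − λ, and that |⟨±|N_S|∓⟩| = 1. It does not evaluate the inequality of Theorem 4.3 itself.

```python
import math

S = 1.0 / math.sqrt(2.0)
# Basis order (|-1>, |0>, |1>); |+-> = (|1> +- |-1>)/sqrt(2).
KET = {"+": [S, 0.0, S], "-": [-S, 0.0, S]}
N_S = [[-1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def effect(lam, sign):
    """Assumed form of E_lambda(sign); see the lead-in."""
    other = "-" if sign == "+" else "+"
    p, q = KET[sign], KET[other]
    e = [[lam * p[i] * p[j] + (1.0 - lam) * q[i] * q[j] for j in range(3)]
         for i in range(3)]
    e[1][1] += 0.5   # the (1/2)|0><0| term
    return e

def eigen_residual(m, v, value):
    """Largest entry of m v - value * v; zero iff v is an eigenvector."""
    return max(abs(x - value * y) for x, y in zip(apply(m, v), v))

def model_quantities(lam, sign):
    other = "-" if sign == "+" else "+"
    e = effect(lam, sign)
    return {
        "max_residual": eigen_residual(e, KET[sign], lam),
        "min_residual": eigen_residual(e, KET[other], 1.0 - lam),
        "coupling": abs(inner(KET[sign], apply(N_S, KET[other]))),
    }

for sign in ("+", "-"):
    print(sign, model_quantities(0.8, sign))
```

Since the remaining eigenvalue 1/2 lies between 1 − λ and λ whenever λ > 1/2, the vanishing residuals are consistent with ∥E_λ(±)∥ = ∥1_S − E_λ(±)∥ = λ under the assumed form.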

Conclusions
We have provided a number of general and operational bounds which capture measurement error and disturbance, with emphasis on the setting in which there is a conservation law, covering both "full" conservation and the weaker notion of "average" conservation. We obtained new, quantitative versions of the WAY theorem, which generalise previous work in several respects, going beyond normal measurement schemes, and not assuming that the observable to be measured is sharp. The WAY theorem was also studied in the novel setting of sequential measurements for general pairs of observables, and the quantitative bounds were further refined by the analysis of the fixed-point structure of the measurement channel in settings which have received scant attention. We saw that a large apparatus coherence plays a key role for measurability and non-disturbance in the presence of a full conservation law, pointing to the requirement of a "large" apparatus. This points further to possible deep connections between the WAY theorem and the rapidly developing theory of quantum reference frames, analysed so far only when the conserved quantity has a conjugate phase [41,42]. While necessary, however, a large apparatus coherence was shown not to be sufficient for good measurements; we saw that conservation laws impose strict constraints on the error or disturbance for unsharp observables that admit definite values. Our work is restricted to bounded conserved quantities, whereas many physically arising conserved quantities are unbounded. Very recently, the measurability part of the WAY theorem for sharp target observables was proven in the setting of unbounded conserved quantities, where the conservation law is stated as the invariance of the unitary group generated by the conserved quantity under the action of the measurement interaction [84].
The measurability question for unsharp target observables, as well as the question of disturbance, should also be systematically studied when the conserved quantity is unbounded. This is a technically challenging endeavour and we save it for future work.

A Properties of operations
Operations allow for the construction of an "operator-valued inner product", which will be frequently used in this paper. For an operation Φ * : L(K) → L(H), we define the sesquilinear mapping ⟨⟨·|·⟩⟩ : L(K) × L(K) → L(H) by to hold for all A, B ∈ L(K). The following lemma shows that such a map mimics several important properties of an inner product. Proof. (i) trivially follows from linearity of operations, while (ii) follows from the fact that an operation preserves the involution, i.e., Φ * (A) * = Φ * (A * ). (iii) follows from Kadison's inequality, or the two-positivity of CP maps [85,86]. To show this, note that by Stinespring's dilation theorem [87] we may write Φ * (22) we may therefore write where π = π * := That ⟨⟨A|A⟩⟩ ⩾ O trivially follows. Finally, we prove the Cauchy-Schwarz inequality which, for the case of channels, was proven by Janssens in Lemma 1 of Ref. [53]. The proof for the case of general operations is identical; by Eq. (23) we may write In the second line we have used the fact that for any self-adjoint operator A ∈ L s (H), it holds that B * AB ⩽ ∥A∥B * B for all B ∈ L(H), while in the third line we have used the C* identity ∥AA * ∥ = ∥A * A∥ for all A ∈ L(H).
Note that the sesquilinear mapping in Eq.
Inserting the above inequalities in Eq. (25) gives the bound in Eq. (24).
Finally, we present the following useful properties of operations: Lemma A.2. Let Φ* : L(K) → L(H) be an operation. For any effects A ∈ L_p(K) and B ∈ L_p(H), it holds that Proof. This inequality (for channels) was given as Eq. (4) in Ref. [88]; the proof below follows Theorem 2 of Ref. [89]. Let us first define C := Φ*(A) − B for notational simplicity. Now, given that and so we have the bound In the third line we use the fact that A and B are effects which, given that Φ Proof. First, let us note that for any B ∈ L_p(K), we have and so Φ*(ABA) = O. By the two-positivity of CP maps, it follows that for any B ∈ L(K) we have The claim immediately follows. If F(Φ*) is a von Neumann algebra, it holds that for any self-adjoint operator A ∈ F(Φ*), the spectral measure of A is also contained in F(Φ*). In the case that A has a discrete spectrum, i.e., A = Σ_n λ_n P_n, this implies that {P_n} ⊂ F(Φ*).

C Fixed points of instrument channels
Here we prove a useful result regarding the fixed-point structure of the E-channel I * X , describing a non-selective measurement of an observable E, which we shall use in several places in this paper. Lemma C.1. Let I be an instrument compatible with an observable E acting in H S . The following hold: Since Z(x) are effects, it follows that We thus obtain from Eq. (26) the bound Now we may prove (i). If E is sharp, then the upper bound of Eq. (27) vanishes and so for all A ∈ L(H S ), I * X (A) ∈ E ′ . As such, I * X (L(H S )) ⊂ E ′ . That F(I * X ) ⊂ I * X (L(H S )) is trivial. Now we prove (ii). Assume that A ∈ F(I * X ), which implies that A * ∈ F(I * X ). But if F(I * X ) is a von Neumann algebra, this implies that A * A, AA * ∈ F(I * X ), and so the upper bound of Eq. (27) vanishes. Consequently, we see that for all A ∈ L(H S ), A ∈ F(I * X ) =⇒ A ∈ E ′ , which implies that F(I * X ) ⊂ E ′ . Finally, let us prove (iii). Let A be a self-adjoint operator, and assume that I X fully conserves A. By Definition 2 it holds that I * X (A k ) = A k for k = 1, 2, and so once again the upper bound of Eq. (27) vanishes, implying that A ∈ E ′ .

D Disturbance, commutation, and compatibility
The pair of observables E := {E(x) : x ∈ X} and F := {F(y) : y ∈ Y} acting in H_S are compatible, or jointly measurable, if they admit a joint observable G := {G(x, y) : (x, y) ∈ X × Y} so that Σ_{y∈Y} G(x, y) = E(x) and Σ_{x∈X} G(x, y) = F(y) for all x ∈ X and y ∈ Y. (28) If E and F do not admit a joint observable, then they are incompatible [21]. Now let I be an E-compatible instrument, and assume that F ⊂ F(I*_X). In such a case, we may choose G as G(x, y) = I*_x(F(y)), which satisfies Eq. (28). It follows that non-disturbance implies compatibility, and so for two incompatible observables E and F, no E-instrument I exists that satisfies F ⊂ F(I*_X). Note that while non-disturbance requires compatibility, compatibility does not guarantee non-disturbance. For instance, while any observable is compatible with itself, for every informationally complete observable the fixed-point set of its compatible channel is trivial. Indeed, the size of the fixed-point set of an E-channel is strongly related to the amount of information given by E, as shown in Ref. [16]. Furthermore, as shown in Ref. [4], there exist pairs of compatible observables E and F where E admits an instrument that does not disturb F, but all possible F-instruments necessarily disturb E. This further demonstrates that, unlike compatibility, non-disturbance is not symmetric. As shown in Ref. [88], the pair of observables E and F are compatible only if Commutation is a sufficient condition for compatibility; if E commutes with F, then there is a joint observable G with effects G(x, y) = E(x)F(y) ≡ (√E(x) √F(y))* (√E(x) √F(y)). On the other hand, if either E or F is sharp, in which case the upper bound of Eq. (29) vanishes, then commutation is a necessary condition for compatibility [91]. For two non-commuting observables to be compatible, therefore, their effects must be sufficiently unsharp. We now provide a bound for the disturbance of F by an E-instrument I, in terms of the commutation between the effects of E and F.

Proposition D.1. Consider the observables E and F acting in H S , and let ∥δ(y)∥ be the disturbance of the effects of F caused by an E-instrument I. Then for all x ∈ X and y ∈ Y it holds that
If F is non-disturbed by I, that is, if δ = 0, then for all x ∈ X and y ∈ Y it holds that Proof. By Eq.
We see that when E commutes with F the lower bound of Eq. For all x ∈ X and y ∈ Y, it also holds that Proof. Since F(y) are effects and I * X is a channel, then by Lemma A.2 we have ∥I * X (F(y) 2 ) − I * X (F(y)) 2 ∥ ⩽ 2∥δ(y)∥ + ∥F(y) − F(y) 2 ∥. As such, Eq. (34) is obtained directly from Eq. (30).
Note that while Corollary D.1 provides a lower bound for the disturbance, which is strictly positive whenever either E or F is sharp and these observables do not commute, such a lower bound will differ depending on whether E or F is sharp; if E is sharp, we have δ ⩾ max x,y ∥[E(x), F(y)]∥, whereas if F is sharp but E is unsharp, the lower bound for the disturbance may be smaller. Let us illustrate this with the following example. for each a, which is smaller than λ 2 for 0 < λ < 1. Let us now consider the case of non-disturbance more carefully. First, let us note that when we set ∥δ(y)∥ = 0, Eq. (34) reduces to the compatibility bound of Eq. (29), and states that for non-disturbance to be possible when E and F do not commute, then both observables must be sufficiently unsharp so as to be compatible. To be sure, compatibility is a necessary condition for non-disturbance, and the fact that Eq. (34) does not contradict the compatibility bound is not surprising. On the other hand, in the case of non-disturbance this bound is also not very informative-it is possible for two observables to be compatible, while a measurement of one still disturbs the other. To gain a better understanding of non-disturbance, let us consider instead Eq. (31), the upper bound of which is smaller than the upper bound in Eq. (34) when we set ∥δ(y)∥ = 0, and vanishes if both F ⊂ F(I * X ) and F 2 := {F(y) 2 : y ∈ Y} ⊂ F(I * X ) hold. We immediately see that while unsharpness of both E and F is necessary for non-disturbance when E and F do not commute, it is not sufficient; as shown in Ref. [4] there are at least two classes of unsharp observables F where given any instrument I, i.e., including instruments that measure an unsharp observable E that does not commute with F but is still compatible with F, it holds that F ⊂ F(I * X ) guarantees F 2 ⊂ F(I * X ): if F is a rank-1 observable, or if F is an "informationally equivalent coarse-graining" of a sharp observable. 
Let us consider the first option. If F is a rank-1 observable, then all the effects of F may be written as F(y) = λ y P y , where P y is a rank-1 projection operator and λ y ∈ (0, 1]. As shown in [92], all observables E that are compatible with a rank-1 observable F are the post-processings of F, that is, the effects of E may be written as $E(x) = \sum_y p(x|y) F(y)$, where {p(x|y)} is a family of non-negative numbers satisfying $\sum_x p(x|y) = 1$ for all y. It follows that so long as F is a non-commutative rank-1 observable, then there exists an unsharp observable E that is compatible with F but does not commute with F. But note that I * X (F(y)) = F(y) if and only if I * X (P y ) = P y . As such, $\mathcal{I}_X^*(F(y)^2) = \lambda_y^2 \mathcal{I}_X^*(P_y) = \lambda_y^2 P_y = F(y)^2$. It follows that F will be non-disturbed by an E-compatible instrument I only if E commutes with F. Let us now consider the second option. We say that F is an informationally equivalent coarse-graining of a sharp observable G := {G(z) : z ∈ Z} if there exists an invertible stochastic matrix M such that $F(y) = \sum_z M_{y,z} G(z)$. F and G are informationally equivalent because a measurement of F produces different probability distributions for two states ρ 1 and ρ 2 if and only if these states produce different probability distributions given a measurement of G. Since G is sharp, then $F(y)^2 = \sum_z M_{y,z}^2 G(z)$. Now assume that F ⊂ F(I * X ). It is simple to verify that this implies G ⊂ F(I * X ). Therefore, we have $\mathcal{I}_X^*(F(y)^2) = \sum_z M_{y,z}^2 \mathcal{I}_X^*(G(z)) = \sum_z M_{y,z}^2 G(z) = F(y)^2$. Once again, F will be non-disturbed by an E-compatible instrument I only if E commutes with F. Both of the above examples offer a very simple interpretation in terms of compatibility. If F is a rank-1 observable, then non-disturbance of F implies non-disturbance of sharp rank-1 effects P y . Since non-disturbance requires compatibility, this implies that E must commute with all P y , and hence with F.
On the other hand, if F is a classical coarse-graining of a sharp observable G, then non-disturbance of F implies non-disturbance of G, and by compatibility E must commute with G. Since the effects of F are constructed as a mixture of the (projective) effects of G, this concludes that E must commute with F.
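The construction G(x, y) = I * x (F(y)) used at the start of this appendix can be checked numerically on a small example. The following is a minimal sketch, not part of the original analysis: it assumes a qubit, a sharp observable E in the computational basis, and its Lüders instrument; the observables `F_COMMUTING` and `F_NONCOMMUTING` and all helper names are illustrative choices.

```python
# 2x2 matrices are stored as nested lists; every name below is illustrative.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Assumed sharp observable E: projections onto |0> and |1>.
E = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]

def luders_dual(x, a):
    """Heisenberg-picture Lueders operation E(x) A E(x); sqrt(E(x)) = E(x) since E is sharp."""
    return matmul(matmul(E[x], a), E[x])

def channel_dual(a):
    """Non-selective E-channel: the sum of the operations over all outcomes."""
    return add(luders_dual(0, a), luders_dual(1, a))

def joint(f):
    """Candidate joint observable G[x][y] = I*_x(F(y))."""
    return [[luders_dual(x, f[y]) for y in range(2)] for x in range(2)]

# An unsharp observable commuting with E, and one that does not commute with E.
F_COMMUTING = [[[0.8, 0.0], [0.0, 0.3]], [[0.2, 0.0], [0.0, 0.7]]]
F_NONCOMMUTING = [[[0.5, 0.3], [0.3, 0.5]], [[0.5, -0.3], [-0.3, 0.5]]]

G = joint(F_COMMUTING)
first_marginal_ok = all(close(add(G[x][0], G[x][1]), E[x]) for x in range(2))
second_marginal_ok = all(close(add(G[0][y], G[1][y]), F_COMMUTING[y]) for y in range(2))
noncommuting_disturbed = not close(channel_dual(F_NONCOMMUTING[0]), F_NONCOMMUTING[0])
```

In this sketch the commuting observable lies in the fixed-point set of the channel, so both marginals of G are recovered. The non-commuting observable is disturbed, in line with the statement that commutation is necessary for compatibility when E is sharp.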

E Properties of repeatable instruments
In this section, we prove a series of useful results regarding the structure of repeatable instruments, and the measurement schemes that implement them. Proposition E.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an E-compatible instrument I acting in H S . If I is repeatable, then the following hold: (i) For all x ∈ X and n ∈ N, it holds that For all x ∈ X , it holds that E(x) and Z(x) have 1 as an eigenvalue, and so there exist projection operators P(x) ∈ L p (H S ) and Q(x) ∈ L p (H A ) which project onto the eigenvalue-1 eigenspaces of E(x) and Z(x), respectively. (iii) For all x, y ∈ X , it holds that P(x)E(y) = P(x)P(y) = δ x,y P(x) and Q(x)Z(y) = Q(x)Q(y) = δ x,y Q(x). Proof. (i): The repeatability condition implies that for all x ∈ X , it holds that ). It follows that for any state ρ ∈ S(H S ), we have Here, the second line follows from the Cauchy-Schwarz inequality, the third line follows from the fact that E(x) and Z(x) are effects and so E(x) 2 ⩽ E(x) and Z(x) 2 ⩽ Z(x), and the final line follows from the fact that repeatability implies first-kindness and that M is a measurement scheme for E. As the second inequality must be an equality, we thus have To show that the relations hold for all n ∈ N, it suffices to show that for all ρ, the Cauchy-Schwarz inequality and the above arguments implies The claims are thus obtained by induction. (ii): Note that for any operation Φ * : L(K) → L(H), it holds that ∥Φ * (A)∥ ⩽ ∥A∥ for all A ∈ L(K). As such, by is an effect it also holds that ∥E(x)∥ ⩾ ∥E(x)∥ 2 . It follows that ∥E(x)∥ is either zero or one. As we assume that E(x) is not vanishing, then ∥E(x)∥ = 1 follows. Similarly, we have 1 = ∥E(x)∥ = ∥Γ E ξ (1 S ⊗ Z(x))∥ ⩽ ∥Z(x)∥, and since Z(x) is an effect, then it must hold that ∥Z(x)∥ = 1. Now we shall show that E(x) has 1 as an eigenvalue, i.e., there exists a unit-vector ψ ∈ H S such that E(x)ψ = ψ. If this is not so, then we would have lim n→∞ E(x) n = O, which would contradict (i). 
Therefore, there exists a projection operator P(x) that projects onto the eigenvalue-1 eigenspace of E(x). Similar arguments hold for Z(x) and Q(x). (iii): For each x, define P c (x) := E(x) − P(x). Since E(x) is an effect and P(x) projects onto the eigenvalue-1 eigenspace of E(x), it trivially holds that ψ ∈ supp(P(x)) =⇒ ψ ∈ ker(P c (x)). Now, given that ψ ∈ supp(P(x)) implies that P(x)ψ = ψ, and denoting the null vector in H S as ∅, we have By positivity of E(y), the above equation implies that which can be satisfied only if $\sqrt{E(y)}\psi = ∅ \implies E(y)\psi = ∅$ for all y ≠ x. We thus have ψ ∈ supp(P(x)) =⇒ ψ ∈ ker(E(y)) ∀ y ≠ x, and so the support of P(x) must be orthogonal to the support of E(y) for all y ≠ x. That P(x) and P(y) for x ≠ y have orthogonal supports follows trivially. Similar arguments hold for Q(x), Z(y), and Q(y). (iv): It trivially holds that ∥P c (x)∥ < 1 and ∥Q c (x)∥ < 1. We thus have lim n→∞ P c (x) n = O and lim n→∞ Q c (x) n = O. As stated in (iii), the supports of P(x) and P c (x) are orthogonal, and so it holds that P(x)P c (x) = P c (x)P(x) = O. As such, for all n ∈ N we have E( In the final line, we have used the fact that E(x) = P(x) + P c (x) and Z(x) = Q(x) + Q c (x), together with (iv). (vi): We may write The first line follows from (v), and the third line follows from the definition Q(x) ⊥ := 1 A − Q(x). The final line is obtained by (v) and noting that ( The relation I * X (AE(x)) = I * X (P(x)AP(x)) holding for all A trivially follows from above and by observing that I * X (E(x)A * ) * = I * X (AE(x)) and I * X (P(x)A * P(x)) * = I * X (P(x)AP(x)). Similarly, we may write (vii): We may write In the first line we have used (v), in the third line we use Q(x) = Z(x) − Q c (x), and in the final line we use (iv).
Let us highlight one interesting property of repeatable instruments: if I is repeatable, then for all input states ρ, the output states will be perfectly distinguishable.
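As an illustration of this property, the following minimal sketch takes the projective (Lüders) instrument of a sharp qubit observable as an assumed example of a repeatable instrument. It checks that the unnormalised output states for different outcomes have orthogonal supports. The input state and all helper names are illustrative.

```python
# 2x2 matrices are stored as nested lists; every name below is illustrative.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]

# Assumed sharp observable: projections onto |0> and |1>.
P = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]

def operation(x, rho):
    """Schroedinger-picture operation I_x(rho) = P(x) rho P(x)."""
    return matmul(matmul(P[x], rho), P[x])

RHO = [[0.5, 0.5], [0.5, 0.5]]  # input state |+><+|
outputs = [operation(x, RHO) for x in range(2)]

# Repeatability: a second measurement returns y with weight delta_{x,y} tr[P(x) rho].
repeat_table = [[trace(matmul(P[y], outputs[x])) for y in range(2)] for x in range(2)]

# Perfect distinguishability: outputs for different outcomes multiply to zero.
overlap = matmul(outputs[0], outputs[1])
```

Here `repeat_table` is diagonal and `overlap` is the zero matrix, so a single measurement of the same observable identifies which outcome produced a given output state.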

F Conservation laws
Recall that by Definition 1 a channel Φ conserves N on average if Φ * (N ) = N , while by Definition 2 Φ fully conserves N if Φ * (N k ) = N k for all k ∈ N. We shall now show that full conservation is in fact equivalent to just the first two moments being conserved, and that it is also equivalent to "invariance" of the unitary group generated by N under the action of Φ * , i.e., that Φ * (e itN ) = e itN for all t ∈ R. Proposition F.1. Let Φ : T (H) → T (H) be a channel, and let N ∈ L s (H) be a self-adjoint operator. The following statements are equivalent: (i) Φ * (N k ) = N k for all k ∈ N; (ii) Φ * (N k ) = N k for k = 1, 2; (iii) Φ * (e itN ) = e itN for all t ∈ R. Proof. (ii) =⇒ (iii): Let f (t) := Φ * (e itN ). Since N is bounded, e itN is bounded and strongly continuous, and Φ * is a channel, then f (t) is infinitely differentiable. Now assume that Φ * (N k ) = N k for k = 1, 2. It follows that where the second equality follows from Corollary A.1. Indeed, by induction we obtain (d k /dt k )f (t) = i k N k f (t) for all k ∈ N. Since f (0) = 1, by Taylor expansion around t = 0 we observe that (iii) =⇒ (i): Assume that Φ * (e itN ) = e itN for all t ∈ R. It follows that holds for all k ∈ N and all t. Since e itN = 1 when t = 0, it follows that Φ * (N k ) = N k for all k.
A property that channels may enjoy is "covariance" under the action of a unitary group, i.e., that Φ * (e itN Ae −itN ) = e itN Φ * (A)e −itN holds for all t ∈ R and A ∈ L(H). While a channel may be covariant without being invariant (for example, Φ * (·) = tr[·ω]1 with [ω, N ] = O is covariant but not invariant), we now show that invariance implies covariance.
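The covariant-but-not-invariant example Φ * (·) = tr[·ω]1 can be verified directly in a small case. The sketch below assumes a qubit with N = diag(0, 1) and ω = diag(0.25, 0.75), so that [ω, N ] = O; these values, the test operator `A`, and the time `T` are illustrative choices rather than anything fixed by the text.

```python
import cmath

# 2x2 matrices are stored as nested lists; every name below is illustrative.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

OMEGA = [[0.25, 0.0], [0.0, 0.75]]  # assumed state, diagonal so it commutes with N

def unitary(t):
    """e^{itN} for the assumed N = diag(0, 1)."""
    return [[1.0, 0.0], [0.0, cmath.exp(1j * t)]]

def phi_star(a):
    """Heisenberg-picture channel A -> tr[A omega] 1."""
    weight = sum(matmul(a, OMEGA)[i][i] for i in range(2))
    return [[weight, 0.0], [0.0, weight]]

def rotate(t, a):
    """e^{itN} A e^{-itN}."""
    v = unitary(t)
    return matmul(matmul(v, a), dagger(v))

A = [[0.2, 0.4 - 0.1j], [0.4 + 0.1j, 0.8]]  # arbitrary test operator
T = 0.7

is_covariant = close(phi_star(rotate(T, A)), rotate(T, phi_star(A)))
is_invariant = close(phi_star(unitary(T)), unitary(T))
```

For these choices `is_covariant` is true while `is_invariant` is false: the channel commutes with the group action but maps e itN to a multiple of the identity.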
Proof. Let us define V (t) := e itN for notational simplicity. By Proposition F.1, full conservation of N by Φ implies that Φ * (V (t)) = V (t) for all t. In particular, noting that V (t) * = V (−t), this implies that We note that the condition Φ * (N k ) = N k for k = 1, 2 was taken as a potential definition of conservation simpliciter in Ref. [93]. However, the authors here conjectured that, in finite dimensions, the condition Φ * (N 2 ) = N 2 may be dropped, and that (in our formulation) both average and full conservation are equivalent.
We shall now address this issue: by a simple counter-example, we shall show that average and full conservation are in fact not equivalent for general channels, even in finite dimensions. Let us consider a system H ≃ C 3 with an orthonormal basis {| − 1⟩, |0⟩, |1⟩}. Now consider $N = \sum_n n |n\rangle\langle n| \equiv |1\rangle\langle 1| - |{-1}\rangle\langle {-1}|$, and a channel Φ * defined by to hold for all A ∈ L(H), where we define $|+\rangle := \frac{1}{\sqrt{2}}(|1\rangle + |{-1}\rangle)$. It is simple to verify that Φ * (N ) = N , that is, Φ conserves N on average. However, Φ * (N 2 ) = 1 ≠ N 2 , and so Φ does not fully conserve N . Since full conservation is equivalent to invariance, then it follows that Φ is also not invariant. Indeed, we can easily verify that Φ is not covariant either; for example, if we choose A = |+⟩⟨+|, then it holds that which coincide only when t is an integer multiple of π. While the above discussion shows that average conservation is in general a weaker condition than full conservation, we shall now show that in the special case of unitary channels, average and full conservation are equivalent:
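The properties claimed for the counter-example can be checked numerically. The explicit channel in the sketch below is an assumption: it is one channel on C 3, with Kraus operators |−1⟩⟨−1|, |1⟩⟨1| and |+⟩⟨0|, that reproduces the stated properties, and it is not necessarily the channel intended above. All function names are illustrative.

```python
import cmath

DIM = 3  # basis order: |-1>, |0>, |1>

def matmul(a, b):
    """Product of two DIM x DIM matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(DIM)] for i in range(DIM)]

def diag(values):
    """Diagonal matrix with the given entries."""
    return [[values[i] if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(DIM) for j in range(DIM))

N = diag([-1.0, 0.0, 1.0])
N_SQUARED = matmul(N, N)
IDENTITY = diag([1.0, 1.0, 1.0])
PLUS = [2 ** -0.5, 0.0, 2 ** -0.5]  # |+> = (|1> + |-1>)/sqrt(2)

def phi_star(a):
    """Assumed channel: keeps <-1|A|-1> and <1|A|1>, and places <+|A|+> on |0><0|."""
    plus_expectation = sum(PLUS[i] * a[i][j] * PLUS[j]
                           for i in range(DIM) for j in range(DIM))
    return diag([a[0][0], plus_expectation, a[2][2]])

def rotate(t, a):
    """e^{itN} A e^{-itN}."""
    v = diag([cmath.exp(-1j * t), 1.0, cmath.exp(1j * t)])
    return matmul(matmul(v, a), dagger(v))

A_PLUS = [[PLUS[i] * PLUS[j] for j in range(DIM)] for i in range(DIM)]  # |+><+|

def covariance_holds(t):
    """Compare phi_star(e^{itN} A e^{-itN}) with e^{itN} phi_star(A) e^{-itN} for A = |+><+|."""
    return close(phi_star(rotate(t, A_PLUS)), rotate(t, phi_star(A_PLUS)))
```

With this choice, `phi_star(N)` equals `N` while `phi_star(N_SQUARED)` equals the identity, and `covariance_holds(t)` is true at t = π but false at t = π/2, where the |0⟩⟨0| component of the two sides differs by cos²(t) versus 1.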

G Bounds for measurement error under conservation laws
Here we provide quantitative trade-off relations for measurement error under additive conservation laws, both average and full.
Theorem G.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an observable acting in H S , and assume that E conserves an additive quantity N = N S ⊗1 A +1 S ⊗N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ). Let ∥ϵ(x)∥ be the error in measuring the effects of the target observable E, as defined in Eq. (7). Then for all x ∈ X it holds that where Γ E ξ is the channel defined in Eq. (3), and Λ is the conjugate channel to I X defined in Eq. (5).
Proof. By Eq. (7), we have , and so we may write [Λ * (Z(x)), . We may therefore write By the sesquilinear mapping We thus obtain the bound given in Eq. (35).

Proposition G.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an observable acting in H S , and assume that E fully conserves an additive quantity
Let ∥ϵ(x)∥ be the error in measuring the effects of the target observable E, as defined in Eq. (7). Then for all x ∈ X it holds that where Q(N A , ξ) denotes the quantum Fisher information of N A in the state ξ. Additionally, if E is an extremal observable and M is a measurement scheme for E, then for all x ∈ X it holds that Proof. Let {q i , ϕ i } be an arbitrary ensemble of unit vectors that satisfies $\xi = \sum_i q_i P_{\phi_i}$. We may thus write . Given the additivity of N and the conservation law, we may rewrite Eq. (36) as

By the sesquilinear mappings ⟨⟨A|B⟩⟩
Since both Z(x) and Γ E ϕi (1 S ⊗ Z(x)) are effects, we obtain We thus arrive at the bound where the second line follows from the concavity of the square root. By choosing the ensemble {q i , ϕ i } that gives the quantum Fisher information as in Eq. (13), we arrive at the bound in Eq. (37). Now assume that E is an extremal observable [94]. This implies that for any pair of observables E (1) and E (2) , and any λ ∈ (0, 1), the effects of E can be decomposed as Once again choosing the ensemble that gives the quantum Fisher information, we arrive at Eq. (38).

H Bounds for measurement disturbance under conservation laws
Here we provide quantitative trade-off relations for measurement disturbance under additive conservation laws, both average and full. Note that here, the observable that may or may not be disturbed is not necessarily the same observable that is measured by the instrument I. Theorem H.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an instrument I acting in H S , and assume that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ). Let ∥δ(y)∥ be the disturbance of the effects of an observable F := {F(y) : y ∈ Y} caused by I, as defined in Eq. (8). Then for all y ∈ Y it holds that Proof. Since F(y) are effects and I * X is a channel, then by Lemma A.2 we have ∥I * X (F(y) 2 ) − I * X (F(y)) 2 ∥ ⩽ 2∥δ(y)∥ + ∥F(y) − F(y) 2 ∥. The claim immediately follows from Eq. (39).
where Q(N A , ξ) denotes the quantum Fisher information of N A in the state ξ. Additionally, if I is an extremal instrument, then for all y ∈ Y it holds that Proof. Let {q i , ϕ i } be an arbitrary ensemble of unit vectors that satisfies $\xi = \sum_i q_i P_{\phi_i}$. We may thus write . By the conservation law and additivity of N , we may therefore rewrite Eq. (41) as which, by the sesquilinear mappings ⟨⟨A|B⟩⟩ i : Since both F(y) and Γ E ϕi (F(y) ⊗ 1 A ) are effects, we have We thus arrive at the bound where the second line follows from the concavity of the square root. By choosing the ensemble {q i , ϕ i } that gives the quantum Fisher information as in Eq. (13), we arrive at Eq. (43). Now assume that I is an extremal instrument [59]. This implies that for any pair of instruments I (1) and I (2) , and any λ ∈ (0, 1), the operations of I can be decomposed as I x (·) = λ I (1) x (·) + (1 − λ) I (2) x (·) only if I = I (1) = I (2) . It holds that Γ E ϕi (· ⊗ 1 A ) = I * X (·) for all i, and so we obtain Once again, by choosing the ensemble that gives the quantum Fisher information, we arrive at Eq. (44).

I Proof for Generalised WAY theorem 2
Here, we shall provide a detailed proof for Theorem 3.3 presented in the main text. Theorem I.1 (Generalised WAY theorem 2). Let M := (H A , ξ, E, Z) be a measurement scheme for an E-instrument I acting in H S , and assume that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ). If either I is a measurement of the first kind, or the Yanase condition [Z, N A ] = O is satisfied, then for any effect E(x) that has both eigenvalue 1 and 0, it holds that where P := P 0 (x) + P 1 (x), with P 0 (x) and P 1 (x) orthogonal projections onto the eigenvalue-0 and eigenvalue-1 eigenspaces of E(x), respectively.
Proof. Let us first note that E(x)P = PE(x) = P 1 (x), and E(x) Indeed, P 0 (x) may equivalently be considered as the projection onto the eigenvalue-1 eigenspace of E(x) ⊥ . Now define the operation Γ E ξ,P : , it follows that ∥Z(x)∥ = 1. Therefore, by the two-positivity of CP maps, and the relation A * BA ⩽ ∥B∥A * A for self-adjoint B, we observe that and so Now note that by additivity of N , and the conservation law, it holds that Γ E ξ,P (N ) = PN S P + tr[N A ξ]P. If the Yanase condition holds, we may write The third line follows from Eq. (46) and the multiplicability theorem (Corollary A.1), and the final line follows from the Yanase condition. As such, we arrive at Eq. (45). Now let us abandon the Yanase condition, but instead assume that I is a first-kind measurement for E. This implies that Γ E ξ,P (E(x)⊗1 A ) = P 1 (x). Since ∥E(x)∥ = 1, then by the two-positivity of CP maps, and the relation A * BA ⩽ ∥B∥A * A for self-adjoint B, we obtain and so By the same arguments as in item (i) of Proposition E.1, one can show from Eq. (46) and Eq. (47) that Consequently, by the same arguments as in item (iv) of Proposition E.1, it follows that Γ E ξ, is the projection onto the eigenvalue-1 eigenspace of Z(x), and Γ E ξ,P ((E(x) − P 1 (x)) ⊗ 1 A ) = O. Moreover, by Eq. (46) , Eq. (47), the multiplicability theorem (Corollary A.1), and defining Z(x) ⊥ := 1 A − Z(x), it follows that and so by the same arguments as in items (v) and (vi) of Proposition E.1 it follows that for all A ∈ L(H S ). By additivity of N , and the conservation law, we may therefore write The third line follows from Eq. (47) and Corollary A.1, while the final line follows from Eq. (48). Once again we arrive at Eq. (45).

J The Weak Yanase condition from conservation laws
Thus far, we have only considered the case where the measurement interaction E between system and apparatus conserves an additive quantity N . However, pointer objectification will also result in state changes, and it may be the case that the expected value of N will change as a result. Now let us provide a generalised prescription of measurement schemes that captures also the state changes due to pointer objectification. Recall It is straightforward to show that this is satisfied if J is compatible with the "Heisenberg-evolved" pointer observable We say thatM obeys a full (average) conservation law if the channel J X fully (on average) conserves a quantity N . The operations J x can be constructed as a sequential application of the channel E followed by the operations of some Z-compatible instrument acting in H A , the latter of which provides a physical characterisation of the pointer objectification process. In such a case, a sufficient condition for conservation of N by J X is the conservation of N by both E and the Z-channel. But it may be the case that E fully conserves N while the Z-channel conserves N only on average, and vice versa. In such cases, the channel J X will conserve N only on average. By Lemma C.1, it holds that if J X conserves N on average, and if either Z τ is sharp or if J X also fully conserves N , then This commutation relation is known as the weak Yanase condition [32]. We note that if E conserves N on average and if either Z τ is sharp or if E also fully conserves N , then the Yanase condition implies the weak Yanase condition. First, let us assume that Z τ is sharp. Since Z(x) is an effect then by two-positivity of CP maps we have , and so we have On the other hand, if E fully conserves N then E * (N 2 ) = E * (N ) 2 = N 2 . In either case, by Corollary A.1 we have However, in general it may be the case that the weak Yanase condition is satisfied but the Yanase condition is violated.
The following proposition shows that if the weak Yanase condition is satisfied, then the measurability part of the WAY theorem will hold. Moreover, we see that there are cases where a large coherence of the conserved quantity in the apparatus is necessary for good measurements even without a full conservation law: for example, if either the interaction channel E or the Z-channel conserves N only on average, but Z τ is sharp, in which case the weak Yanase condition is guaranteed to hold.  (49) and . Then for all x ∈ X it holds that and where Q(N A , ξ) is the quantum Fisher information of N A in ξ as defined in Eq. (13). Additionally, if M is a measurement scheme for E, then for any effect E(x) that has both eigenvalue 1 and 0, it holds that where P = P 0 (x) + P 1 (x), with P 0 (x) and P 1 (x) orthogonal projections onto the eigenvalue-0 and eigenvalue-1 eigenspaces of E(x), respectively.

K Proof of the "converse" WAY theorem
Here, we provide a proof for Proposition 4.1 presented in the main text. Proposition K.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an instrument I acting in H S . Assume that E fully conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A , where N S ∈ L s (H S ) and N A ∈ L s (H A ), and that I X fully conserves N S . Define the eigenspace of H A that is involved during the measurement process as where Λ is the conjugate channel to I X defined in Eq. (5). Proof. Let us first observe that by Eq. (10), if E conserves N on average, then N S ∈ F(I * X ) implies that irrespective of the apparatus preparation ξ, it holds that tr[N A Λ(ρ)] = tr[N A ξ] for all ρ ∈ S(H S ) or, equivalently, that Λ * (N A ) = tr[N A ξ]1 S . That is, the expected value of the apparatus part of the conserved quantity does not change as a result of the measurement interaction. While average conservation does not imply that the variance must also stay the same, this implication can be shown to follow in the case of full conservation. By Definition 2, full conservation of N by E implies that Γ ξ (N 2 ) = Γ E ξ (N 2 ), where Γ E ξ is the channel defined in Eq. (3). Given that Γ E ξ (· ⊗ 1 A ) = I * X (·) and Γ E ξ (1 S ⊗ ·) = Λ * (·), we thus obtain In the second line, we have used the fact that full conservation of N S by I , and the multiplicability theorem (Corollary A.1). It follows that Λ * (N k A ) = tr[N k A ξ]1 S for k = 1, 2. As such, for any input state ρ of the system to be measured, it holds that Now assume that ξ is an eigenstate of N A , i.e., that there exists c ∈ R such that N A ξ = c ξ. Since such a condition is equivalent to a vanishing variance, then we see that Var (N A , Λ(ρ)) = Var (N A , ξ) = 0, and so Λ(ρ) must also be eigenstates of N A with the same eigenvalue c. In fact, in such a case it holds that = c k for all k ∈ N and ρ ∈ S(H S ). 
In other words, if ξ is an eigenstate of N A , then N A must be "effectively" fully degenerate, i.e., H A (meas) must be contained within a single degenerate eigenspace of N A . Therefore, if H A (meas) contains more than one degenerate eigenspace of N A , the apparatus must be prepared in a state with a large uncertainty in N A . Now assume that I is an extremal instrument, i.e., that for any λ ∈ (0, 1), the operations of I admit a decomposition I x (·) = λ I (1) x (·) + (1 − λ) I (2) x (·) only if I = I (1) = I (2) . In such a case, it follows that if M := (H A , ξ, E, Z) is a measurement scheme for I, then for any pure state decomposition $\xi = \sum_i q_i P_{\phi_i}$, it holds that (H A , ϕ i , E, Z) must also be measurement schemes for I. By the above arguments, it follows that unless N A is effectively degenerate, then each ϕ i must have a large uncertainty in N A , i.e., the quantum Fisher information Q(N A , ξ) must be large.

L Faithful fixed states and measurement disturbance
Recall from Proposition D.1 and Theorem H.1 that under a conservation law, an E-instrument I will not disturb an observable F only if In Appendix D we saw that the implication F ⊂ F(I * X ) =⇒ F 2 ⊂ F(I * X ) holds if either F is sharp, rank-1, or a coarse-graining of a sharp observable. We now show that if F(I * X ) is a von Neumann algebra, so that F ⊂ F(I * X ) =⇒ F 2 ⊂ F(I * X ), similar and stronger constraints will hold for all observables. To this end, let us first prove a useful lemma. Lemma L.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an E-instrument I acting in H S . Assume that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ), and that F(I * X ) is a von Neumann algebra. Then for all A ∈ L(H S ), the following implication holds: Proof. Recall that for all A ∈ L(H S ), we have I * where Γ E ξ is the channel defined in Eq. (3). Average conservation of N by E implies that Since F(I * X ) is a von Neumann algebra, it follows that for all A ∈ L(H S ), A ∈ F(I * X ) =⇒ A * A, AA * ∈ F(I * X ) which, by Corollary Consequently, we see that A ∈ F(I * X ) =⇒ [A, N S ] ∈ F(I * X ). But as shown in Lemma C.1, if F(I * X ) is a von Neumann algebra then F( We are now ready to prove the following:  Proof. Let us first consider (i). Define the complete mixture ω := 1 S / dim(H S ), which is faithful. It follows trivially that I L X (ω) = ω, and so F(I L X ) contains a faithful state ω.  [78,79]. However, it was shown in [63] that for binary observables, it always holds that F(I L X * ) = E ′ . Since binary observables are commutative, this led to the conjecture that the fixed-point set of the Lüders E-channel is the commutant of E for all commutative observables [78], making F(I L X * ) a von Neumann algebra, which was later proven to be the case [76,77].
Let us highlight an interesting consequence of the above lemma: Corollary L.1. Let M := (H A , ξ, E, Z) be a measurement scheme for an E-compatible Lüders instrument I L acting in H S . Assume that E is commutative, and that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average. It holds that E commutes with N S .
Proof. If E ⊂ E ′ , then F(I L X * ) = E ′ is a von Neumann algebra. Moreover, it holds that E ⊂ F(I L X * ), so that the Lüders instrument for a commutative observable is a measurement of the first kind. It follows from item (ii) of Theorem L.1 that E must commute with N S . Proof. Since G(z) are rank-1 effects, we may write G(z) = λ z P z , with λ z ∈ (0, 1] and P z a rank-1 projection. It follows that G ⊂ F(I * X ) =⇒ {P z } ⊂ F(I * X ). But we may write tr[P z I X (P z )] = tr[I * X (P z )P z ] = tr[P z P z ] = tr[P z ] = 1, and so G ⊂ F(I * X ) =⇒ {P z } ⊂ F(I X ). Consequently, we may construct the faithful state $\omega = \sum_z p_z P_z$ with $p_z > 0$ and $\sum_z p_z = 1$, so that ω ∈ F(I X ). By Lemma B.1, F(I * X ) is a von Neumann algebra.

M Non-faithful fixed states and measurement disturbance
In this section we analyse the structure of the fixed-point set of arbitrary channels, which need not contain a faithful state. From here, the results of the previous section are generalised. We then provide novel quantitative bounds for first-kind measurements which complement our generalisation of the WAY theorem given in Theorem 3.3. Due to the Schauder-Tychonoff fixed point theorem [82], all channels Φ : T (H S ) → T (H S ) have at least one fixed state. However, it may be that none of these are faithful. In such a case, the fixed-point set of the dual channel is not necessarily a von Neumann algebra, but rather forms an operator space [83]. This setting has been much less investigated, and its analysis forms the first part of this section. While the discussion thus far has been applicable to infinite-dimensional systems (except in some examples), in this section we shall always assume that d := dim(H S ) < ∞.
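A channel with no faithful fixed state can be illustrated with a short numerical sketch. The example assumes a qubit amplitude-damping channel with damping parameter `GAMMA`, and takes the long-run average of a channel to be the Cesàro mean of its iterates; both are assumptions made for illustration, and all names are hypothetical.

```python
# 2x2 real matrices are stored as nested lists; every name below is illustrative.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose (the Kraus operators below are real, so this is the adjoint)."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

GAMMA = 0.5  # assumed damping strength
K0 = [[1.0, 0.0], [0.0, (1 - GAMMA) ** 0.5]]
K1 = [[0.0, GAMMA ** 0.5], [0.0, 0.0]]

def channel(rho):
    """Schroedinger-picture amplitude damping: K0 rho K0^T + K1 rho K1^T."""
    return add(matmul(matmul(K0, rho), transpose(K0)),
               matmul(matmul(K1, rho), transpose(K1)))

def cesaro_average(rho, steps):
    """Average of the first `steps` iterates of the channel applied to rho."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    current = rho
    for _ in range(steps):
        current = channel(current)
        total = add(total, current)
    return [[entry / steps for entry in row] for row in total]

MAXIMALLY_MIXED = [[0.5, 0.0], [0.0, 0.5]]
GROUND = [[1.0, 0.0], [0.0, 0.0]]

rho_0 = cesaro_average(MAXIMALLY_MIXED, 1000)
ground_is_fixed = channel(GROUND) == GROUND
```

In this sketch `rho_0` approaches |0⟩⟨0| as the number of steps grows, so the averaged image of the maximally mixed state has rank one. Its support projection is |0⟩⟨0| rather than the identity, which is the situation in which the fixed-point set need not contain a faithful state.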

M.1 Fixed-point structure of arbitrary channels
Consider a channel Φ : T (H S ) → T (H S ), and its dual in the Heisenberg picture Φ * : L(H S ) → L(H S ). We may define the channels where Φ n denotes n consecutive applications of Φ. Note that these limits exist since d < ∞. According to the Jordan decomposition theorem, Φ * is represented as a summation of projections onto eigenspaces multiplied by the corresponding eigenvalues, and nilpotent operators whose eigenspaces are invariant subspaces; Φ * av corresponds to the projection onto the subspace with eigenvalue 1. The fixed-point set F(Φ * ) forms an operator space, i.e., a norm-closed vector subspace of the codomain of F(Φ * ), and Φ * av is a CP projection onto F(Φ * ).
The above results have the following useful consequence: Proposition M.1. Consider the operations Φ * av,P and Φ * P defined in Eq. (57). It holds that is a von Neumann algebra in L(P H S ).

M.2 Measurement disturbance revisited
We are now ready to address the question of measurement disturbance, generalising the observations of Theorem L.1. As before, let I := {I x : x ∈ X } be an E-compatible instrument, with I X (·) := x I x (·) the corresponding E-channel. By Eq. where as in Eq. (56), P is the minimal support projection of ρ 0 := I av ( 1 d 1 S ), which corresponds with the minimal projection on the support of F(I X ). By Proposition M.1, F(I * av,P ) = F(I * P ) = P F(I * av )P = P F(I * X )P is a von Neumann algebra in L(P H S ). We define by P EP := {P E(x)P : x ∈ X } the restriction of E to an observable in P H S , which satisfies x P E(x)P = P , and (P EP ) ′ := {A ∈ L(P H S ) : [P E(x)P, A] = O ∀x} denotes the commutant of P EP in L(P H S ). P FP is similarly defined. Before generalising Theorem L.1 for the case where F(I X ) may not contain any faithful states, and thus F(I * X ) may not necessarily be a von Neumann algebra, let us first prove a generalisation of Lemma L.1. Lemma M.4. Let M := (H A , ξ, E, Z) be a measurement scheme for an E-instrument I acting in H S , and let P be the minimal support projection on F(I X ). It holds that F(I * P ) ⊂ (P EP ) ′ . Additionally, if E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ), then for all A ∈ L(PH S ) the following implication holds: Proof. By Eq. (3), let us define the operation Γ E ξ,P : It is easily verified that I * P (A) = Γ E ξ,P (A ⊗ 1 A ) for all A ∈ L(P H S ), and P E(x)P = Γ E ξ,P (1 S ⊗ Z(x)). Note that by the same arguments as item (i) of Lemma M.3, it can easily be shown that Γ E ξ,P (B) = Γ E ξ,P ((P ⊗ 1 A )B(P ⊗ 1 A )) holds for all B ∈ L(H S ⊗ H A ). It follows that Γ E ξ,P is unital when restricted to L(P H S ⊗ H A ) → L(P H S ), and we may equivalently write P E(x)P = Γ E ξ,P (P ⊗ Z(x)). By Proposition M.1, F(I * P ) is a von Neumann algebra in L(P H S ), and so for all A ∈ L(P H S ), if A ∈ F(I * P ), then A * A, AA * ∈ F(I * P ). 
By Corollary A.1 it holds that for all A ∈ F(I * P ) and B ∈ L(P H S ⊗ H A ) we have ). It follows that for all A ∈ F(I * P ) and x ∈ X we have and so F(I * P ) ⊂ (P EP ) ′ . Now let us assume that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average. This implies that P Γ ξ (N )P = Γ E ξ,P (N ) = Γ E ξ,P ((P ⊗ 1 A )N (P ⊗ 1 A )), and so Since F(I * P ) is a von Neumann algebra in L(P H S ), then by Corollary A.1 and the arguments above, it follows that for all A ∈ F(I * P ) we have We thus have A ∈ F(I * P ) =⇒ [A, P N S P ] ∈ F(I * P ), and since F(I * P ) ⊂ (P EP ) ′ , it follows that [A, P N S P ] ∈ (P EP ) ′ .
We are now ready to generalise Theorem L.1. Proof. (i): By Proposition M.1, P F(I * X )P = F(I * P ), and so F ⊂ F(I * X ) implies that P FP ⊂ F(I * P ). By Lemma M.4, it holds that P FP must commute with P EP , and that [P F(y)P, P N S P ] = I * P ([P F(y)P, P N S P ]). Given that F(I * P ) is a von Neumann algebra, we have I * P ((P F(y)P ) 2 ) = I * P (P F(y)P ) 2 = (P F(y)P ) 2 . By Corollary A.1 it follows that [P F(y)P, P N S P ] = [P F(y)P, I * P (P N S P )]. By item (i) of Lemma M.3, it holds that I * P (P N S P ) = I * P (N S ) = P I * X (N S )P . Therefore, P FP must commute with P ∆N S P . Moreover, by Lemma M.4 non-disturbance implies that [P F(y)P, P N S P ] ∈ (P EP ) ′ , and since P FP commutes with P EP , this implies that [P F(y)P, [P E(x)P, P N S P ]] = O must hold. has at least one eigenvector with eigenvalue 1, and so ∥G(z)∥ = 1. Moreover, if F(I X ) contains a faithful state, then P = 1 S , and so we have G(z) = P G(z)P = R(z), implying that G ≡ R is sharp. Now let us note that the family of states {ρ z } are perfectly distinguishable given a measurement of G if and only if ρ z = P(z)ρ z P(z), where P(z) ⩾ R(z) projects onto the eigenvalue-1 eigenspace of G(z). In such a case it trivially holds that tr[G(y)ρ z ] = δ z,y . But Since G ⊂ F(I * X ), we also have tr[G(y)I X (ρ z )] = tr[I * X (G(y))ρ z ] = tr[G(y)ρ z ] = δ z,y , and so {I X (ρ z )} continues to be perfectly distinguishable by a G measurement.
In the special case where I is a measurement of the first kind, we may strengthen the above result as follows:

Theorem M.2. Let I be an instrument compatible with a non-trivial observable E acting in H S . If I is a measurement of the first kind, then E is described by a classical post-processing of a norm-1 observable G := {G(z) : z ∈ Z} with properties given in Proposition M.2, i.e.,

$$E(x) = \sum_{z} p(x|z)\, G(z), \tag{58}$$

where {p(x|z)} is a family of non-negative numbers that satisfy $\sum_x p(x|z) = 1$ for each z.
Proof. Assume that the E-instrument I is a measurement of the first kind, that is, E ⊂ F(I * X ). It follows that P EP ⊂ F(I * av,P ). In fact, we can show that P EP ⊂ F(I * av,P ) ∩ F(I * av,P ) ′ . First, recall from Proposition M.2 that if E is non-trivial then there exists a family of projections R := {R(z) : z ∈ Z} ⊂ F(I * av,P ), satisfying R(z)R(y) = δ z,y R(z) and z R(z) = P . If P EP ̸ ⊂ F(I * av,P ) ∩ F(I * av,P ) ′ , then R can be chosen so that [P E(x)P, R(z)] ̸ = O for some x and z. But note that R ⊂ F(I * av,P ) = P F(I * X )P implies that the observable {P I * x (R(z))P, P I * x (1 S − R(z))P : x ∈ X } is a joint measurement for P EP and the sharp observable {R(z), P − R(z)}. By compatibility, it follows that [P E(x)P, R(z)] = O must hold for all x and z, and so P EP must be contained in the Abelian algebra F(I * av,P ) ∩ F(I * av,P ) ′ . Consequently, R ⊂ F(I * av,P ) ∩ F(I * av,P ) ′ can be chosen so as to simultaneously diagonalise all P E(x)P , that is, we may write P E(x)P = z p(x|z)R(z). Recalling that E(x) = I * av (E(x)) = I * av (P E(x)P ), then defining the observable G by G(z) = I * av (R(z)) gives us Eq. (58). As in Proposition M.2 it holds that G ⊂ F(I * X ), G is a norm-1 observable, if F(I X ) contains a faithful state then G is also sharp, and if {ρ z } are perfectly distinguishable by a G measurement then so are {I X (ρ z )}.
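The post-processing structure of Eq. (58) can be illustrated numerically. The following is a minimal sketch and is not taken from the paper: the qutrit projections `G`, the stochastic matrix `p`, and the helper `post_process` are hypothetical choices made only for illustration. It builds E(x) = Σ_z p(x|z) G(z) from a sharp (hence norm-1) observable and checks that the result is a valid observable whose effects commute with every G(z).

```python
import numpy as np

# Hypothetical sharp (hence norm-1) observable G on a qutrit, one projection per outcome z.
G = [np.diag([1.0, 0.0, 0.0]), np.diag([0.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])]

# Hypothetical post-processing matrix p[x, z] = p(x|z); every column sums to 1.
p = np.array([[0.9, 0.9, 0.2],
              [0.1, 0.1, 0.8]])

def post_process(p, G):
    """Return the effects E(x) = sum_z p(x|z) G(z)."""
    n_x, n_z = p.shape
    return [sum(p[x, z] * G[z] for z in range(n_z)) for x in range(n_x)]

E = post_process(p, G)

# Normalisation: the effects sum to the identity because the columns of p sum to 1.
is_normalised = np.allclose(sum(E), np.eye(3))
# Positivity: every effect has non-negative eigenvalues.
is_positive = all(np.linalg.eigvalsh(e).min() >= -1e-12 for e in E)
# Each E(x) commutes with each G(z), as expected for a post-processing of a sharp G.
commutes = all(np.allclose(e @ g, g @ e) for e in E for g in G)
# The operator norm of E(x) equals the largest weight max_z p(x|z).
norms = [np.linalg.norm(e, 2) for e in E]
```

In this toy example the two outcomes z = 0 and z = 1 carry identical weights, so E cannot distinguish them even though G can; this is the sense in which E is a coarser, noisier version of G.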
Finally, we present the following implication of the above theorem:

Corollary M.2. Let I be an instrument compatible with a non-trivial observable E acting in H S , and assume that I is a measurement of the first kind. For any outcome x associated with a non-trivial effect E(x), and for any pair of unit vectors ψ, ϕ ∈ H S satisfying E(x)ψ = ∥E(x)∥ψ and (1 S − E(x))ϕ = ∥1 S − E(x)∥ϕ, respectively, it holds that ψ and ϕ are orthogonal, and that F (I X (P ψ ), I X (P ϕ )) = 0, where $F(\rho, \sigma) := \mathrm{tr}\big[\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\big]$ is the fidelity between states ρ and σ.
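Proof. For each outcome x, we may coarse-grain E into a binary observable $\{E(x), 1_S - E(x)\}$. By Theorem M.2, we may write $E(x) = \sum_z p(x|z)\, G(z)$ and $1_S - E(x) = \sum_z (1 - p(x|z))\, G(z)$. Let us define $p_{\max}(x) := \max_z p(x|z)$ and $p_{\min}(x) := \min_z p(x|z)$, together with the sets $Z_{\max} := \{z : p(x|z) = p_{\max}(x)\}$ and $Z_{\min} := \{z : p(x|z) = p_{\min}(x)\}$. Using such sets, we may define $G(Z_{\max}) := \sum_{z \in Z_{\max}} G(z)$ and $G(Z_{\min}) := \sum_{z \in Z_{\min}} G(z)$. Since G is norm-1, we may also define $P(Z_{\max}) := \sum_{z \in Z_{\max}} P(z)$ and $P(Z_{\min}) := \sum_{z \in Z_{\min}} P(z)$, where P(z) is the projection onto the eigenvalue-1 eigenspace of G(z). Since E(x) is assumed to be non-trivial, it must hold that $Z_{\max} \cap Z_{\min} = \emptyset$. If this were not so, all p(x|z) would be the same, in which case $E(x) \propto 1_S$. Consequently, $P(Z_{\max}) P(Z_{\min}) = O$. Now let us note that $\|E(x)\| = \sup_{\|\psi\|=1} \langle\psi|E(x)\psi\rangle = \sup_{\|\psi\|=1} \sum_z p(x|z) \langle\psi|G(z)\psi\rangle$.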
We may now show that a unit vector ψ satisfies $E(x)\psi = \|E(x)\|\psi$ if and only if $P(Z_{\max})\psi = \psi$. Let us first prove the only if statement. For any unit vector ψ, it holds that $\langle\psi|E(x)\psi\rangle \leqslant p_{\max}(x)$, which follows from the fact that the p(x|z) are non-negative numbers and that $\{\langle\psi|G(z)\psi\rangle\}$ is a probability distribution, with the upper bound being saturated precisely when $\langle\psi|G(Z_{\max})\psi\rangle = 1$. But this in turn is satisfied only if $\langle\psi|P(Z_{\max})\psi\rangle = 1$, in which case $P(Z_{\max})\psi = \psi$. As such, it follows that $\|E(x)\| = p_{\max}(x)$, and the unit vector ψ satisfies $E(x)\psi = \|E(x)\|\psi$ only if $P(Z_{\max})\psi = \psi$. The if statement is trivial. By similar arguments, we may show that $\|1_S - E(x)\| = 1 - p_{\min}(x)$, and that the unit vector ϕ satisfies $(1_S - E(x))\phi = \|1_S - E(x)\|\phi$ if and only if $P(Z_{\min})\phi = \phi$. Since E(x) is non-trivial, then as argued above ψ and ϕ are orthogonal, and perfectly distinguishable by a G measurement. By Proposition M.2 it holds that I X (P ψ ) and I X (P ϕ ) are also perfectly distinguishable by a G measurement, that is, F (I X (P ψ ), I X (P ϕ )) = 0.
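The conclusion of Corollary M.2 can be checked on a small example. The sketch below is illustrative only and rests on assumptions that are not in the text: the system is a qutrit, G is a hypothetical sharp two-outcome observable, and the first-kind instrument is taken to be the Lüders-type instrument I_x(ρ) = Σ_z p(x|z) G(z) ρ G(z), whose channel is I_X(ρ) = Σ_z G(z) ρ G(z). The helper names `psd_sqrt`, `fidelity`, `projector`, and `luders_channel` are likewise hypothetical.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # discard tiny negative eigenvalues caused by round-off
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = tr sqrt( sqrt(rho) sigma sqrt(rho) )."""
    s = psd_sqrt(rho)
    return float(np.real(np.trace(psd_sqrt(s @ sigma @ s))))

def projector(v):
    """Rank-1 projection onto the (normalised) vector v."""
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Hypothetical sharp observable G on a qutrit and post-processing weights p[x, z].
G = [np.diag([1.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])]
p = np.array([[0.9, 0.2],
              [0.1, 0.8]])
E0 = p[0, 0] * G[0] + p[0, 1] * G[1]  # E(x=0) = diag(0.9, 0.9, 0.2)

def luders_channel(rho):
    """Assumed channel I_X(rho) = sum_z G(z) rho G(z); it is self-dual for sharp G."""
    return sum(g @ rho @ g for g in G)

# First-kind property: E(0) is a fixed point of the (self-dual) channel.
is_first_kind = np.allclose(luders_channel(E0), E0)

psi = np.array([1.0, 1.0, 0.0])  # E(0) psi = ||E(0)|| psi, with ||E(0)|| = 0.9
phi = np.array([0.0, 0.0, 1.0])  # (1 - E(0)) phi = ||1 - E(0)|| phi, with norm 0.8

overlap = abs(np.vdot(psi, phi))
F_out = fidelity(luders_channel(projector(psi)), luders_channel(projector(phi)))
```

Both `overlap` and `F_out` vanish here (up to round-off), in line with the corollary: the extremal eigenvectors are orthogonal and remain perfectly distinguishable after the measurement channel acts.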
M.4 Measurements of the first kind, distinguishability, and the Wigner-Araki-Yanase theorem

We shall now use the results in the preceding section to obtain quantitative bounds for first-kind measurements in the presence of a conservation law that complement our generalisation of the WAY theorem given in Theorem 3.3. To this end, let us first provide a generalisation of Theorem 2 in Ref. [27], which we shall use in the sequel:

Lemma M.5. Let M := (H A , ξ, E, Z) be a measurement scheme for an E-instrument I acting in H S , and assume that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ). For any pair of orthogonal unit vectors ψ, ϕ ∈ H S , the following will hold: |⟨ψ|N S ϕ⟩| ⩽ ∥N A ∥F (I X (P ψ ), I X (P ϕ )) + ∥N S ∥F (Λ(P ψ ), Λ(P ϕ )), where Λ is the conjugate channel to I X defined in Eq. (5), and F (ρ, σ) is the fidelity between states ρ and σ.
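The inequality of Lemma M.5 can be checked numerically on a toy model. The sketch below is not from the paper and makes several labelled assumptions: system and apparatus are both qubits, N S = N A = Z/2, the interaction is a partial swap (which commutes with N, so N is fully conserved and therefore conserved on average), the apparatus state `xi` is an arbitrary mixed state, and the conjugate channel is taken to be Λ(ρ) = tr_S[U(ρ ⊗ ξ)U*]. No pointer observable is specified, since the bound only involves the two channels.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # discard tiny negative eigenvalues caused by round-off
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = tr sqrt( sqrt(rho) sigma sqrt(rho) )."""
    s = psd_sqrt(rho)
    return float(np.real(np.trace(psd_sqrt(s @ sigma @ s))))

# Assumed conserved quantity N = N_S (x) 1 + 1 (x) N_A with N_S = N_A = Z/2.
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)
N_S, N_A = Z / 2, Z / 2
N = np.kron(N_S, I2) + np.kron(I2, N_A)

# Assumed interaction: partial swap U = exp(-i theta SWAP) = cos(theta) 1 - i sin(theta) SWAP.
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)
theta = np.pi / 5
U = np.cos(theta) * np.eye(4) - 1j * np.sin(theta) * SWAP
conserves_N = np.allclose(U @ N, N @ U)

xi = np.diag([0.8, 0.2]).astype(complex)  # hypothetical apparatus state

def evolve(rho):
    """Joint state U (rho (x) xi) U*, reshaped to indices (s, a, s', a')."""
    return (U @ np.kron(rho, xi) @ U.conj().T).reshape(2, 2, 2, 2)

def channel_I(rho):
    """System channel I_X(rho): trace out the apparatus."""
    return np.einsum('iaja->ij', evolve(rho))

def channel_Lambda(rho):
    """Assumed conjugate channel Lambda(rho): trace out the system."""
    return np.einsum('iaib->ab', evolve(rho))

# Orthogonal unit vectors that are not eigenvectors of N_S.
psi = np.array([1.0, 1.0]) / np.sqrt(2)
phi = np.array([1.0, -1.0]) / np.sqrt(2)
P_psi, P_phi = np.outer(psi, psi), np.outer(phi, phi)

lhs = abs(psi @ N_S @ phi)
rhs = (np.linalg.norm(N_A, 2) * fidelity(channel_I(P_psi), channel_I(P_phi))
       + np.linalg.norm(N_S, 2) * fidelity(channel_Lambda(P_psi), channel_Lambda(P_phi)))
```

Here `lhs` equals 1/2, so the bound forces the two fidelities to sum to at least 1: the more distinguishable the system outputs are, the less distinguishable the apparatus outputs must be, and vice versa.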
Proof. Let us consider the augmented Hilbert space H A ⊗ K so that ξ ∈ S(H A ) admits the purification ξ = tr K [P φ ], with the unit vector φ ∈ H A ⊗ K. Moreover, if K is sufficiently large, then by Stinespring's dilation theorem the channel E * can be expressed as E * (A) = V * (A ⊗ 1 K )V for all A ∈ L(H S ⊗ H A ), where V : H S ⊗ H A → H S ⊗ H A ⊗ K is an isometry. By additivity of N , and orthogonality of ψ, ϕ, we have ⟨ψ ⊗ φ|N ϕ ⊗ φ⟩ = ⟨ψ|N S ϕ⟩.
On the other hand, average conservation of N by E implies that We therefore have For any observable G := {G(z)} acting in H A , we may write In the third line we have used the Cauchy-Schwarz inequality, in the fourth line we used Stinespring's dilation theorem together with the fact that φ is a purification of ξ, and in the final line we use the definitions of the partial trace and the conjugate channel Λ. Now, note that the fidelity satisfies $F(\rho, \sigma) = \min_{G} \sum_z \mathrm{tr}[G(z)\rho]^{1/2}\, \mathrm{tr}[G(z)\sigma]^{1/2}$ [95,96], where the minimum is taken over all observables G. Therefore, choosing G so as to obtain the fidelity, we have |⟨ψ ⊗ φ|V * (N S ⊗ 1 A ⊗ 1 K )V ϕ ⊗ φ⟩| ⩽ ∥N S ∥F (Λ(P ψ ), Λ(P ϕ )).
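The variational characterisation of the fidelity used in the last step can be illustrated numerically. The sketch below is a hedged illustration rather than part of the proof: the two commuting qubit states and the helper names `classical_overlap` and `random_projective_measurement` are hypothetical. It checks that the classical overlap Σ_z tr[G(z)ρ]^{1/2} tr[G(z)σ]^{1/2} is never smaller than F(ρ, σ) for a sample of projective measurements, and that the common eigenbasis attains the minimum for this commuting pair.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # discard tiny negative eigenvalues caused by round-off
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = tr sqrt( sqrt(rho) sigma sqrt(rho) )."""
    s = psd_sqrt(rho)
    return float(np.real(np.trace(psd_sqrt(s @ sigma @ s))))

def classical_overlap(povm, rho, sigma):
    """sum_z sqrt(tr[G(z) rho]) * sqrt(tr[G(z) sigma]) for the observable G = povm."""
    total = 0.0
    for g in povm:
        pz = max(np.real(np.trace(g @ rho)), 0.0)
        qz = max(np.real(np.trace(g @ sigma)), 0.0)
        total += np.sqrt(pz * qz)
    return total

def random_projective_measurement(dim, rng):
    """Rank-1 projections onto the columns of a random unitary (from a QR decomposition)."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return [np.outer(q[:, k], q[:, k].conj()) for k in range(dim)]

# Hypothetical commuting states, so the computational basis is optimal.
rho = np.diag([0.7, 0.3]).astype(complex)
sigma = np.diag([0.4, 0.6]).astype(complex)

F = fidelity(rho, sigma)
rng = np.random.default_rng(0)
overlaps = [classical_overlap(random_projective_measurement(2, rng), rho, sigma)
            for _ in range(200)]
computational = classical_overlap([np.diag([1.0, 0.0]), np.diag([0.0, 1.0])], rho, sigma)
```

Every sampled overlap upper-bounds `F`, which is why the proof is free to pick the minimising observable G after applying the Cauchy-Schwarz inequality outcome by outcome.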
We are now ready to prove our main result in this section: Theorem M.3. Consider a measurement scheme M := (H A , ξ, E, Z) for a nontrivial observable E with the instrument I acting in H S . Assume that I is a measurement of the first kind, and that E conserves an additive quantity N = N S ⊗ 1 A + 1 S ⊗ N A on average, where N S ∈ L s (H S ) and N A ∈ L s (H A ). For each outcome x associated with a non-trivial effect E(x), let K max (x) and K min (x) be subspaces of H S defined by K max (x) := {ψ ∈ H S : E(x)ψ = ∥E(x)∥ψ}, K min (x) := {ϕ ∈ H S : (1 S − E(x))ϕ = ∥1 S − E(x)∥ϕ}.
K max (x) and K min (x) are orthogonal, and for all unit vectors ψ ∈ K max (x) and ϕ ∈ K min (x), it holds that

$$|\langle\psi|N_S\phi\rangle| \leqslant \|N_S\|\, F(\Lambda(P_\psi), \Lambda(P_\phi)).$$

Proof. For each outcome x associated with a non-trivial effect E(x), we may coarse-grain E into a binary observable {E(x), 1 S − E(x)}. By Corollary M.2, it holds that for any unit vectors ψ ∈ K max (x) and ϕ ∈ K min (x), ψ and ϕ are orthogonal (implying that K max (x) and K min (x) are orthogonal subspaces) and F (I X (P ψ ), I X (P ϕ )) = 0. As such, given the average conservation of N by the interaction channel E, the first term on the right-hand side of the bound in Lemma M.5 vanishes, and so the following inequality must hold: $|\langle\psi|N_S\phi\rangle| \leqslant \|N_S\|\, F(\Lambda(P_\psi), \Lambda(P_\phi))$, where P = P 0 (x) + P 1 (x), with P 0 (x) and P 1 (x) orthogonal projections onto the eigenvalue-0 and eigenvalue-1 eigenspaces of E(x), respectively.