Recovery With Incomplete Knowledge: Fundamental Bounds on Real-Time Quantum Memories

The recovery of fragile quantum states from decoherence is the basis of building a quantum memory, with applications ranging from quantum communications to quantum computing. Many recovery techniques, such as quantum error correction, rely on a priori knowledge of the environment noise parameters to achieve their best performance. However, such parameters are likely to drift in time in the context of implementing long-time quantum memories. This necessitates using a "spectator" system, which estimates the noise parameter in real time, then feeds the outcome forward to the recovery protocol as classical side information. The memory qubits and the spectator system hence comprise the building blocks for a real-time (i.e. drift-adapting) quantum memory. In this article, I consider spectator-based (incomplete knowledge) recovery protocols as a real-time parameter estimation problem (generally with nuisance parameters present), followed by the application of the "best-guess" recovery map to the memory qubits, as informed by the estimation outcome. I present information-theoretic and metrological bounds on the performance of this protocol, quantified by the diamond distance between the "best-guess" recovery and optimal recovery outcomes, thereby identifying the cost of adaptation in real-time quantum memories. Finally, I provide fundamental bounds for multi-cycle recovery in the form of recurrence inequalities. The latter suggest that incomplete knowledge of the noise could be an advantage, as errors from various cycles can cohere. These results are illustrated for the approximate [4,1] code of the amplitude-damping channel and relations to various fields are discussed.

Quantum memories comprise an important component of current and future quantum technologies. Their use ranges from quantum communications and networks [1][2][3] to sensing [4], and even computation. This wide range of relevance stems from the fact that a quantum memory protects a quantum system's (often fragile) state, which encodes the desired quantum information, from decoherence.
Depending on the physical implementation of the quantum memory, current coherence times range from milliseconds to minutes [5]. However, various quantum technologies may require even longer coherence times [6]. Two of the most common techniques implemented in a quantum memory are quantum error correction (QEC) [7][8][9] and dynamical decoupling [10,11]. These techniques generally benefit from a priori knowledge of the noise surrounding the system of interest. For example, channel-adaptation techniques in QEC [12] have been shown to outperform general QEC codes, as they are given additional knowledge of the environment noise [13,14]. Such techniques rely on some physical model of the (noisy) implementation medium of the quantum memory. The noise model is partially built upon physical assumptions (e.g. in the choice of the Hamiltonians) and partially upon phenomenology. Hence, the former gives a physically motivated family of quantum dynamics $\{\mathcal{N}_\theta\}_{\theta\in\Theta}$ for the quantum state of the memory qubits [15], and the latter determines the value of the noise parameter $\theta$ such that the dynamics $\mathcal{N}_\theta$ best fits the observed data.
Although very powerful, a shortcoming of this approach is that the environment noise parameter is generally time-varying. This has been studied most extensively for superconducting qubits [16][17][18][19][20]. Hence, real-time techniques to track the change (drift) of the noise are necessary, assuming we want to operate quantum memories for times longer than the characteristic times of the drift. Indeed, efforts have been made towards designing "spectator" systems that aid in detecting and tracking such changes [21][22][23][24][25][26][27][28][29][30]. Being subject to the same physical environment, the goal of the spectator system is to perform real-time quantum sensing of the noise parameter. The estimate is then used as classical side information in various recovery protocols. The physical requirements of spectator systems are twofold: (1) proximity to the memory qubits, such that the spatial dependence of the noise parameter can be neglected, and (2) exhibiting faster dynamics than the memory qubits. The latter is necessary if the feedback information is to be useful for recovery. We showcase the functionality of the spectator system within a quantum memory using Figs. 1 and 2. Note that, since the memory and spectator systems are generally different physical systems with different couplings to the same environment, their dynamics will generally be described by different quantum channels with the same noise parameter, i.e. $\mathcal{N}_\theta$ and $\mathcal{M}_\theta$, respectively. More precisely, a spectator-based recovery protocol comprises the following characteristic stages, as shown in Fig. 3:
1. Individual state preparation of the quantum memory and the spectator system.
2. Free evolution of the joint memory-spectator system under the action of the shared environment, with generally unknown noise parameters.
3. Quantum parameter estimation of the environment noise parameters, using the spectator system as a real-time quantum sensor (i.e. a probe).
4. Post-processing of the measurement outcomes to extract the value of the locally unbiased estimator, and use it to construct the best-guess recovery map.
5. Recovery of the original state of the quantum memory by applying the best-guess recovery map.
6. Recycling of spectator state, which prepares it for the next recovery cycle.
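The six stages above can be sketched as a simple control loop. The following is only a toy illustration, not the protocol analyzed in this article: the spectator is modeled as a classical coin that "clicks" with probability equal to the noise parameter, the estimator is the sample mean, and the recovery stage is reduced to recording the estimate. The function names `spectator_cycle` and `run_memory`, the drift trajectory, and the shot count are all hypothetical.

```python
import random

def spectator_cycle(theta_true, shots, rng):
    """Stages 1-4 for a toy spectator: returns the estimate of theta."""
    # Stages 1-2: each spectator copy is prepared and exposed to the noise;
    # in this toy model it "clicks" with probability theta_true.
    clicks = sum(rng.random() < theta_true for _ in range(shots))
    # Stages 3-4: measurement and post-processing (sample-mean estimator).
    return clicks / shots

def run_memory(theta_trajectory, shots, seed=0):
    """Run one recovery cycle per entry of a drifting noise trajectory."""
    rng = random.Random(seed)
    log = []
    for theta_true in theta_trajectory:
        theta_hat = spectator_cycle(theta_true, shots, rng)
        # Stage 5: a best-guess recovery map would be built from theta_hat
        # and applied to the memory; here we only record the estimate.
        log.append((theta_true, theta_hat))
        # Stage 6: the spectator is reset; fresh samples are drawn next cycle.
    return log

# Hypothetical slow linear drift of the noise parameter over 5 cycles.
trajectory = [0.10 + 0.02 * k for k in range(5)]
for theta_true, theta_hat in run_memory(trajectory, shots=2000):
    print(f"true={theta_true:.3f}  estimate={theta_hat:.3f}")
```

The point of the sketch is structural: the estimate is refreshed once per cycle, so it can follow a drift that is slow on the scale of a single cycle.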
There exist systems that satisfy the physical properties of a spectator system. For example, nitrogen-vacancy (NV) centers in diamond, which were used to demonstrate the first loophole-free Bell inequality violation [31], provide both a spectator qutrit and a memory qubit. Namely, the nuclear spin degree of freedom of its $^{14}$N or $^{15}$N atom comprises the memory qubit, whereas a nearly-closed three-level $\Lambda$ system [32,33], optically selected out of the electronic degrees of freedom of the NV center, comprises the spectator system [27,34,35]. The separation between the pure dephasing times of the memory qubit ($T^{\mathrm{memo}}_{\phi} \sim 100\,\mu\mathrm{s}$) and the spectator qutrit ($T^{\mathrm{spec}}_{\phi} \sim 100\,\mathrm{ns}$ [36]) is necessary to simultaneously yield (i) a metrologically useful spectator state $\mathcal{M}_\theta(\psi)$ for parameter estimation, and (ii) a relatively small noise parameter value of the memory dynamics $\mathcal{N}_\theta$ (and hence a generally higher recovery fidelity), for relevant times $t$ of the spectator dynamics. The latter is seen from the implicit dependence of the noise parameter $\theta$ on time [37].

Figure 2: Recovery with limited knowledge (time flows from left to right). The quantum memory (second register) and spectator (first register) systems are prepared in the quantum states $\rho$ and $\psi$, respectively. The recovery channel $\mathcal{R}_{\hat\theta}$ is implemented based on the spectator's best estimate $\hat\theta$ of the noise parameter $\theta \in \Theta$ of the environment noise $\mathcal{N}_\theta$. The estimate is informed by the measurement outcome $x$ of the spectator observable $X$, following the spectator dynamics $\mathcal{M}_\theta$.

Although spectator systems are a promising building block for real-time (i.e. drift-adapting) quantum memories, we expect fundamental limitations to manifest nonetheless. This is based on the following physical intuition: In real time, the spectator system's goal is to perform a quantum estimation of the environment noise parameter $\theta$. However, due to the quantum Cramér-Rao bound (QCRB) [38,39], any locally unbiased estimate $\hat\theta$ of the noise parameter $\theta$ will have a non-zero variance. Namely, $\mathrm{Var}(\hat\theta) \geq 1/I_{QF}(\mathcal{M}_\theta(\psi))$, where $I_{QF}(\mathcal{M}_\theta(\psi))$ is the quantum Fisher information (QFI) of the family of parametric states $\{\mathcal{M}_\theta(\psi)\}_{\theta\in\Theta}$ describing the spectator dynamics (see Fig. 2). For a given setup, this fundamental uncertainty in the estimate $\hat\theta$ will propagate within the overall protocol and manifest itself as a fundamental limitation of the specific recovery technique.
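To make the Cramér-Rao statement concrete, here is a minimal numerical sketch under a strong simplifying assumption: the spectator state is taken to be diagonal, $\rho_\theta = \theta|0\rangle\langle 0| + (1-\theta)|1\rangle\langle 1|$, so its QFI reduces to the classical Fisher information $1/[\theta(1-\theta)]$. This toy state, the use of independent copies, and the function names are illustrative assumptions and are not taken from the article.

```python
import random
import statistics

def qfi_diagonal_probe(theta):
    """QFI of rho_theta = theta|0><0| + (1-theta)|1><1| (toy spectator state)."""
    return 1.0 / (theta * (1.0 - theta))

def qcrb(theta, copies):
    """Cramer-Rao lower bound on Var(theta_hat) for `copies` independent probes."""
    return 1.0 / (copies * qfi_diagonal_probe(theta))

def empirical_variance(theta, copies, trials, seed=0):
    """Variance of the sample-mean estimator over many simulated experiments."""
    rng = random.Random(seed)
    estimates = [
        sum(rng.random() < theta for _ in range(copies)) / copies
        for _ in range(trials)
    ]
    return statistics.pvariance(estimates)

theta, copies = 0.2, 100
print("QCRB              :", qcrb(theta, copies))
print("empirical variance:", empirical_variance(theta, copies, trials=2000))
```

For this particular toy model the sample mean is unbiased and its variance $\theta(1-\theta)/n$ coincides with the bound, so the two printed numbers should agree up to sampling fluctuations; for general spectator states the bound need not be attained by a simple estimator.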
[Figure: The letters "M" and "S" stand for quantum memory and spectator system, respectively. The black dots represent the environment spins that contribute to the noise. This protocol combines two important disciplines of quantum information theory: quantum parameter estimation and recovery of quantum information.]

In this article, I formalize the above intuition by proving information-theoretic and metrological bounds on the performance of spectator-based recovery protocols for adaptive quantum memories, considering QEC as an example. The main results of the article are summarized below:
1. Derivation of a lower bound for the diamond distance between any two quantum channels (Lemma 2). This generalizes the lower bound for the diamond distance between a quantum channel and the identity channel in [40], and between a qudit depolarizing channel and the identity channel in [41], as well as the analytic formula in [42] for the diamond distance between two qubit depolarizing channels. A similar lower bound for generalized distinguishability measures, such as entropy- or fidelity-based distinguishability measures (e.g. quantum relative entropy or Bures distance, respectively), is found in Theorem 4 of Appendix A.
2. Formulation of the spectator-based recovery protocol as two consecutive, but complementary, tasks in a real-time quantum memory (Section 4.2): (i) a multi-parameter quantum estimation in the presence of nuisance parameters, and (ii) recovery of quantum information using the "best-guess" recovery map, as informed by the estimation outcome.
3. Derivation of information-theoretic costs of adaptation in spectator-based recovery protocols. This is shown both for finite (Theorem 1) and small (Theorem 2) estimation errors of the noise parameters. The latter yields a metrological lower bound, in terms of the quantum Fisher information of the spectator dynamics. The adaptation cost is illustrated for the [4,1] code of the amplitude-damping channel (Figs. 6 and 7), by comparing the performance of the spectator-based recovery protocol with the corresponding optimal recovery protocol [12,14] when no adaptation is required.
4. Reformulation of an upper bound for the entanglement fidelity of concatenated quantum channels (building upon a theorem of [43]) in the form of recurrence inequalities for multi-cycle recovery protocols (Lemma 4). These bounds are growing in relevance, as multiple QEC rounds have been demonstrated in practice [44]. It is shown that spectator-based recovery protocols, under conditions of varying noise, could outperform optimal recovery protocols with constant noise (Theorem 3). This is exclusively a multi-cycle phenomenon, where errors from different cycles can cohere [43], including errors from incomplete knowledge of the noise parameter. Finally, this is illustrated for the [4,1] code of the amplitude-damping channel (Fig. 9). Similar error coherence effects have been seen recently in the context of quantum state transfer in quantum networks, between gate and readout errors [45].

Quantum States and Channels
Let $\mathcal{H}$ denote a Hilbert space, and $\mathcal{L}(\mathcal{H})$ be the set of bounded linear operators acting on $\mathcal{H}$. Denote by $\mathcal{L}_+(\mathcal{H})$ the subset of positive semi-definite operators of $\mathcal{L}(\mathcal{H})$. We define the Hilbert-Schmidt inner product between two linear operators $A, B \in \mathcal{L}(\mathcal{H})$ to be $\langle A, B\rangle := \mathrm{Tr}[A^\dagger B]$. The state of a physical system is described by a density matrix $\rho \in \mathcal{D}(\mathcal{H})$, where $\mathcal{D}(\mathcal{H})$ is the subset of positive semi-definite linear operators in $\mathcal{L}_+(\mathcal{H})$ that have unit trace. We denote the dimension of a Hilbert space $\mathcal{H}$ by $d := \dim\mathcal{H}$.
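A linear map $\mathcal{Q}_{A\to B}: \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B)$ is called positive if it maps $\mathcal{L}_+(\mathcal{H}_A)$ into $\mathcal{L}_+(\mathcal{H}_B)$. A positive linear map $\mathcal{Q}_{A\to B}$ is called completely positive (CP) if for every Hilbert space $\mathcal{H}_R$, the map $\mathrm{id}_R \otimes \mathcal{Q}_{A\to B}$ is positive, where $\mathrm{id}_R$ is the identity map acting on $\mathcal{L}(\mathcal{H}_R)$. A CP map that is also trace preserving (CPTP) is called a quantum channel. For any linear maps $\mathcal{Q}_{A\to B}$ and $\mathcal{S}_{B\to C}$, their composition is defined to be $(\mathcal{S}\circ\mathcal{Q})_{A\to C}(L_A) := \mathcal{S}(\mathcal{Q}(L_A))$, for all $L_A \in \mathcal{L}(\mathcal{H}_A)$. We define the adjoint map $\mathcal{Q}^\dagger_{A\to B}$ of a linear map $\mathcal{Q}_{A\to B}$ with respect to the Hilbert-Schmidt inner product as

$$\langle L_B, \mathcal{Q}(L_A)\rangle = \langle \mathcal{Q}^\dagger(L_B), L_A\rangle \quad \text{for all } L_A \in \mathcal{L}(\mathcal{H}_A),\ L_B \in \mathcal{L}(\mathcal{H}_B).$$

If $\{Q_i\}_{i=1}^{r}$ are the Kraus operators of the CP map $\mathcal{Q}_{A\to B}$ (see Eq. (2)), then the Kraus operators of the adjoint map $\mathcal{Q}^\dagger_{A\to B}$ are given by $\{Q_i^\dagger\}_{i=1}^{r}$. In what follows, we suppress the system subscript and/or superscript if it does not lead to ambiguities. Every CP map $\mathcal{Q}_{A\to B}$ admits a Kraus decomposition in terms of Kraus operators $\{Q_i\}_{i=1}^{r}$:

$$\mathcal{Q}_{A\to B}(L_A) = \sum_{i=1}^{r} Q_i L_A Q_i^\dagger. \tag{2}$$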

Diamond Distance Between Quantum Channels
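The trace norm of any linear operator $L \in \mathcal{L}(\mathcal{H})$ is given as $\|L\|_1 := \mathrm{Tr}(|L|)$, where we have denoted $|L| := \sqrt{L^\dagger L}$. Accordingly, we define the trace distance between any two quantum states $\rho, \sigma \in \mathcal{D}(\mathcal{H})$ to be $\frac{1}{2}\|\rho - \sigma\|_1$. More generally, we define the diamond distance $\frac{1}{2}\|\mathcal{Q}_A - \mathcal{S}_A\|_\diamond$ between two quantum channels $\mathcal{Q}_A$ and $\mathcal{S}_A$ as follows

$$\frac{1}{2}\|\mathcal{Q}_A - \mathcal{S}_A\|_\diamond := \sup_{\rho_{RA}} \frac{1}{2}\left\|(\mathrm{id}_R \otimes \mathcal{Q}_A)(\rho_{RA}) - (\mathrm{id}_R \otimes \mathcal{S}_A)(\rho_{RA})\right\|_1, \tag{3}$$

where the supremum is taken over all bipartite states $\rho_{RA} \in \mathcal{D}(\mathcal{H}_R \otimes \mathcal{H}_A)$, with a reference system $R$ of arbitrary dimension.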
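As a small illustration of these definitions, the trace norm can be evaluated numerically as the sum of the singular values of an operator. The sketch below assumes NumPy is available; the function names and the example states are illustrative choices, not part of the article. It covers only the trace distance between states; the diamond distance additionally requires an optimization over input states and is not attempted here.

```python
import numpy as np

def trace_norm(op):
    """||L||_1 = Tr sqrt(L^dagger L), i.e. the sum of singular values of L."""
    return float(np.linalg.svd(op, compute_uv=False).sum())

def trace_distance(rho, sigma):
    """Trace distance (1/2)||rho - sigma||_1 between two density matrices."""
    return 0.5 * trace_norm(rho - sigma)

# Example qubit states: |0><0|, |1><1|, |+><+| and the maximally mixed state.
zero = np.array([[1.0, 0.0], [0.0, 0.0]])
one = np.array([[0.0, 0.0], [0.0, 1.0]])
plus = np.array([[0.5, 0.5], [0.5, 0.5]])
mixed = np.eye(2) / 2

print(trace_distance(zero, one))    # orthogonal states
print(trace_distance(zero, mixed))
print(trace_distance(zero, plus))
```

For pure states the result can be cross-checked against $\sqrt{1 - |\langle\psi|\phi\rangle|^2}$, which equals $1/\sqrt{2}$ for $|0\rangle$ and $|+\rangle$.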

Quantum Fidelities
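Given any two quantum states $\rho, \sigma \in \mathcal{D}(\mathcal{H})$ with $\dim\mathcal{H} \equiv d$, we define the fidelity function as follows

$$F(\rho, \sigma) := \left\|\sqrt{\rho}\sqrt{\sigma}\right\|_1^2 = \left(\mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\right)^2.$$

If one of the two states, say $\sigma \equiv |\psi\rangle\langle\psi|$, is pure, then we have $F(\rho, \psi) = \langle\psi|\rho|\psi\rangle$. Based on this definition, various fidelities that are relevant in QEC and other areas of quantum information have been defined. One such quantity is the entanglement fidelity $F_e$ of a channel $\mathcal{Q}_A \equiv \mathcal{Q}_{A\to A}$ with respect to a state $\rho \in \mathcal{D}(\mathcal{H}_A)$ [7,8], which is given by

$$F_e(\rho, \mathcal{Q}) := \langle\psi|(\mathrm{id}_R \otimes \mathcal{Q}_A)(|\psi\rangle\langle\psi|_{RA})|\psi\rangle,$$

where $|\psi\rangle_{RA}$ is a purification of $\rho_A$. It can be shown that the entanglement fidelity is independent of the particular choice of the purification, following from the fact that the former can be expressed in terms of the Kraus operators $\{Q_i\}$ of $\mathcal{Q}_A$ as [7,8]

$$F_e(\rho, \mathcal{Q}) = \sum_i \left|\mathrm{Tr}[\rho\, Q_i]\right|^2.$$

The entanglement fidelity of a quantum channel $\mathcal{Q}_A$ is defined to be the entanglement fidelity of $\mathcal{Q}_A$ with respect to the maximally mixed state $\rho_A = I_A/d$ [46] (which is purified by the maximally entangled state $|\Phi\rangle_{RA}$). This can also be written in terms of the Choi state $(\mathrm{id}_R \otimes \mathcal{Q}_A)(|\Phi\rangle\langle\Phi|_{RA})$ of $\mathcal{Q}_A$, as follows

$$F_e(\mathcal{Q}) = \langle\Phi|(\mathrm{id}_R \otimes \mathcal{Q}_A)(|\Phi\rangle\langle\Phi|_{RA})|\Phi\rangle = \frac{1}{d^2}\sum_i \left|\mathrm{Tr}[Q_i]\right|^2.$$

Another important fidelity measure of the form $F(\rho, \psi) = \langle\psi|\rho|\psi\rangle$ is the average channel (gate) fidelity $F_{\mathrm{avg}}(\mathcal{Q})$, defined for any $|\psi\rangle \in \mathcal{H}_A$ and CPTP map $\mathcal{Q}_A$ as

$$F_{\mathrm{avg}}(\mathcal{Q}) := \int d\psi\, \langle\psi|\mathcal{Q}(|\psi\rangle\langle\psi|)|\psi\rangle,$$

where the integral is taken over the uniform (Haar) measure on pure states, and the discrete version has appeared in [7,8]. In [47], the authors have shown that the average and entanglement fidelities are related by

$$F_{\mathrm{avg}}(\mathcal{Q}) = \frac{d\,F_e(\mathcal{Q}) + 1}{d + 1}.$$

Finally, in what follows, we also use the simplifying notation which can be interpreted in the $\chi$-matrix representation of quantum channels (see Appendix I) as the "error angle" by which the Kraus operators of the noisy channel $\mathcal{Q}$ deviate from the desired "no error" normalized basis element $B_0 = I/\sqrt{d}$ (where $\langle B_0, B_0\rangle = 1$) of the vector space $\mathcal{L}(\mathcal{H})$. It turns out that the error angle notation is very convenient when expressing the average fidelity of composite channels in terms of the individual average channel fidelities [43].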
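The Kraus-operator form of the entanglement fidelity lends itself to a short numerical check. The sketch below assumes NumPy and uses the standard single-qubit amplitude-damping Kraus operators as an example; the function names are illustrative, and the conversion to the average fidelity uses the relation $F_{\mathrm{avg}} = (d F_e + 1)/(d+1)$.

```python
import numpy as np

def amplitude_damping_kraus(theta):
    """Kraus operators of the qubit amplitude-damping channel with parameter theta."""
    k0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - theta)]])
    k1 = np.array([[0.0, np.sqrt(theta)], [0.0, 0.0]])
    return [k0, k1]

def entanglement_fidelity(kraus, rho=None):
    """F_e(rho, Q) = sum_i |Tr[rho Q_i]|^2; rho defaults to the maximally mixed state."""
    d = kraus[0].shape[0]
    if rho is None:
        rho = np.eye(d) / d
    return float(sum(abs(np.trace(rho @ k)) ** 2 for k in kraus))

def average_fidelity(kraus):
    """F_avg obtained from F_e via (d * F_e + 1) / (d + 1)."""
    d = kraus[0].shape[0]
    return (d * entanglement_fidelity(kraus) + 1.0) / (d + 1.0)

for theta in (0.0, 0.36, 1.0):
    kraus = amplitude_damping_kraus(theta)
    print(theta, entanglement_fidelity(kraus), average_fidelity(kraus))
```

For this channel the entanglement fidelity has the closed form $\left(\frac{1+\sqrt{1-\theta}}{2}\right)^2$, which gives a quick sanity check of the numerical output.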

Lower-Bounding Diamond Distance Using Entanglement Fidelity
We now consider the diamond distance between any two quantum channels and show that it is lower bounded by the difference between their entanglement fidelities.This is generalized in Theorem 4 of Appendix A, where we show that the lower bound for any generalized distinguishability measure between the two channels is still fully determined by their entanglement fidelities.Besides the diamond distance considered here, fidelity and entropy-based distinguishability measures, such as Bures distance and quantum relative entropy, are also used to quantify the performance of recovery protocols, e.g. in Refs.[48,49] and Refs.[50,51], respectively.Therefore, the results of this section (as well as the following sections) could be generalized for other distinguishability measures, in the light of Appendix A.
The diamond distance is especially relevant for two reasons: (1) it has a clear operational meaning in terms of the maximum probability of distinguishing between two channels in a quantum channel discrimination task [52], and (2) it satisfies the triangle inequality and hence also the chaining property, which is useful for bounding errors in fault-tolerant quantum computing (see Appendix B).
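We start by proving the following:

Lemma 1. For any two depolarizing channels $\mathcal{Q}_{A\to A}$ and $\mathcal{S}_{A\to A}$, the diamond distance between them is equal to the difference between their entanglement fidelities, namely

$$\frac{1}{2}\|\mathcal{Q} - \mathcal{S}\|_\diamond = \left|F_e(\mathcal{Q}) - F_e(\mathcal{S})\right|.$$

Proof. Assume that $\mathcal{Q}$ and $\mathcal{S}$ are depolarizing channels with depolarizing parameters $p_Q$ and $p_S$, respectively. Namely,

$$\mathcal{Q}(\rho) = (1 - p_Q)\rho + p_Q \mathrm{Tr}[\rho]\frac{I}{d}, \qquad \mathcal{S}(\rho) = (1 - p_S)\rho + p_S \mathrm{Tr}[\rho]\frac{I}{d},$$

so that $\mathcal{Q} - \mathcal{S}$ is proportional to the difference between the identity channel and the replacement channel $\rho \mapsto \mathrm{Tr}[\rho]\, I/d$, with proportionality constant $p_S - p_Q$. Using Eq. (13), the diamond distance between $\mathcal{Q}$ and $\mathcal{S}$ is therefore fixed by the difference of their entanglement fidelities and a factor $\kappa(d)$ that depends only on the dimension, and $\kappa$ can be rewritten using the diamond distance between the identity and the replacement channel. Next, we use the semi-definite program for the normalized diamond norm [53] (Eq. (27)), where the supremum is taken over all positive matrices $0 \leq \sigma_{RA} \leq \rho_R \otimes I_A$ with $\rho_R \in \mathcal{D}(\mathcal{H}_R)$, and where the objective involves the operator $\Gamma_{RA} - I_{RA}/d$. As we can see, only one of the eigenvalues of $\Gamma_{RA} - I_{RA}/d$ is positive. Therefore, we write the spectral decomposition of the matrix $\Gamma_{RA} - I_{RA}/d$ as follows

$$\Gamma_{RA} - \frac{I_{RA}}{d} = \left(d - \frac{1}{d}\right)|\gamma_0\rangle\langle\gamma_0| - \frac{1}{d}\sum_{i=1}^{d^2-1}|\gamma_i\rangle\langle\gamma_i|,$$

where $\{|\gamma_i\rangle\}_{i=0}^{d^2-1}$ is its orthonormal eigenbasis. It follows that, to maximize the argument of Eq. (27), we need to consider the support of the positive semi-definite operator $\sigma_{RA}$ to be in the (one-dimensional) support of $\Gamma_{RA}$ (which is orthogonal to $\ker(\Gamma_{RA})$). Consequently, if there exists $\rho_R \in \mathcal{D}(\mathcal{H}_R)$ such that $\sigma_{RA}$ is fully in the support of $\Gamma_{RA}$, i.e. $\sigma_{RA} = z|\gamma_0\rangle\langle\gamma_0|_{RA}$ (where $\Gamma|\gamma_0\rangle = d|\gamma_0\rangle$), then it must be the case that the normalization satisfies $z \leq 1/d$. The resulting maximization in Eq. (27) is thus attained at $z = 1/d$, where the objective $\mathrm{Tr}[\sigma_{RA}(\Gamma_{RA} - I_{RA}/d)]$ equals $1 - 1/d^2$, which fixes $\kappa(d)$. Since $F_e(\mathcal{Q}) = 1 - p_Q(1 - 1/d^2)$ for a depolarizing channel (and similarly for $\mathcal{S}$), this gives the claimed equality.

In the above analysis, we presumed the existence of a density matrix $\rho_R$ for which $\sigma_{RA}$ is fully in the support of $\Gamma_{RA}$. Any other choice of $\sigma_{RA}$ (see Eq. (30)) still yields $z \leq 1/d$, while simultaneously leading to a sub-optimal outcome in Eq. (34) due to the contribution of the negative eigenvalues of $\Gamma_{RA} - I_{RA}/d$. Taking $\sigma_{RA}$ to be off-diagonal in the basis $\{|\gamma_i\rangle\}_{i=0}^{d^2-1}$ does not change this argument.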
It is important to note that this lemma has been known previously for special cases, e.g. in [42] between qubit depolarizing maps ($d = 2$) and in [41] between a qudit depolarizing map and the identity map ($p_S = 0$). However, Pirandola et al. in [41] used a different technique to compute essentially the same quantity $\kappa(d)$ appearing in our derivation of Lemma 1, which crucially does not depend on the depolarizing parameters $p_Q$ and $p_S$.
We now prove a lower bound for the diamond distance between any two quantum channels with the same input and output spaces. We frame this as follows:

Lemma 2. For any two CPTP maps $\mathcal{Q}_{A\to A}$ and $\mathcal{S}_{A\to A}$, the diamond distance between them is lower bounded by the difference in their entanglement fidelities, namely

$$\frac{1}{2}\|\mathcal{Q} - \mathcal{S}\|_\diamond \geq \left|F_e(\mathcal{Q}) - F_e(\mathcal{S})\right|.$$

Proof. It is known that any quantum supermap (a linear map from one quantum channel to another) that is a convex combination of Pauli unitary supermaps (also known as "twirling") renders any input channel $\mathcal{Q}$ into a depolarizing channel $\tilde{\mathcal{Q}}$ [54], with the same entanglement fidelity. The Lemma is then a direct consequence of applying the data-processing inequality to the diamond distance $\|\mathcal{Q} - \mathcal{S}\|_\diamond$ with respect to the Pauli twirling supermap [55], which yields

$$\|\mathcal{Q} - \mathcal{S}\|_\diamond \geq \|\tilde{\mathcal{Q}} - \tilde{\mathcal{S}}\|_\diamond,$$

where $\tilde{\mathcal{Q}}$ and $\tilde{\mathcal{S}}$ are the resulting depolarizing channels [47] (also see Lemma 6 in Appendix A). Then, we note that the right-hand side is found from Lemma 1.
Finally, the proof is completed by the fact that any random unitary supermap preserves the entanglement fidelity [47], hence $F_e(\tilde{\mathcal{Q}}) = F_e(\mathcal{Q})$ and $F_e(\tilde{\mathcal{S}}) = F_e(\mathcal{S})$.
Remark 1.If one of the quantum channels is the identity, then a much simpler derivation could be found in the supplementary material of [40] using the Fuchs-van de Graaf inequality for quantum channels, which yields a two-sided bound on the diamond distance.
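The statements of Lemmas 1 and 2 can be probed numerically for depolarizing channels without solving a semi-definite program. The sketch below assumes NumPy and relies on one generic fact: feeding the maximally entangled state into both channels gives a valid (generally non-optimal) strategy, so the trace distance between the normalized Choi states never exceeds the diamond distance. The helper names are illustrative, and this check is not the proof technique used in the article.

```python
import numpy as np

def max_entangled(d):
    """Vector form of the maximally entangled state |Phi> on C^d x C^d."""
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def depolarizing_choi(p, d):
    """Normalized Choi state of Q(rho) = (1 - p) rho + p Tr[rho] I/d."""
    phi = max_entangled(d)
    return (1.0 - p) * np.outer(phi, phi) + p * np.eye(d * d) / d**2

def fe_from_choi(choi, d):
    """Entanglement fidelity <Phi| Choi |Phi>."""
    phi = max_entangled(d)
    return float(phi @ choi @ phi)

def choi_trace_distance(choi_q, choi_s):
    """(1/2)||Choi_Q - Choi_S||_1, a lower bound on the diamond distance."""
    return 0.5 * float(np.abs(np.linalg.eigvalsh(choi_q - choi_s)).sum())

d, p_q, p_s = 3, 0.1, 0.4
cq, cs = depolarizing_choi(p_q, d), depolarizing_choi(p_s, d)
print("|F_e(Q) - F_e(S)|  :", abs(fe_from_choi(cq, d) - fe_from_choi(cs, d)))
print("Choi trace distance:", choi_trace_distance(cq, cs))
```

For depolarizing channels the two printed numbers coincide at $|p_Q - p_S|(1 - 1/d^2)$, which is consistent with the equality in Lemma 1. For generic channels the Choi-state trace distance only sandwiches between the fidelity difference and the diamond distance.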

Fundamental Bounds on Recovery with Incomplete Knowledge
Here, we are interested in applying the lower bound derived in Lemma 2 of the previous section to the spectator-based recovery setting, succinctly described in Fig. 2. We will assume that we are given a parametric family of quantum channels {N θ } θ∈Θ that is motivated from certain physical assumptions about the memory-environment interaction, where Θ is the allowed range of values for the noise parameter θ.

The Regime of Validity
Let us now identify four different noise instability regimes and then expand upon the relevant regime for this article. The four cases are described as follows:
1. When neither the noise family $\{\mathcal{N}_\theta\}_{\theta\in\Theta}$ nor the true noise parameter $\theta$ changes in time. This case is best described by the "perfect" knowledge scenario in Fig. 1, and is the most common in the literature.
2. When the noise family $\{\mathcal{N}_\theta\}_{\theta\in\Theta}$ remains valid, but the true noise parameter $\theta$ varies stroboscopically. Namely, the characteristic timescale $\tau_\theta$ for the variations of the true value of $\theta$ is large compared to the duration $\Delta t_R$ of a single recovery cycle. Therefore, variations in the noise parameter are on the timescale of multiple recovery cycles (see e.g. [16] for superconducting qubits). In this regime, performing the real-time quantum estimation of the new value of $\theta$ for every recovery cycle becomes useful, hence the need for a spectator system. This regime is best described by the "incomplete" knowledge scenario in Fig. 2.
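3. When the noise family $\{\mathcal{N}_\theta\}_{\theta\in\Theta}$ remains valid, but the true noise parameter $\theta$ varies within a single recovery cycle, i.e. the timescale $\tau_\theta$ is comparable to or shorter than $\Delta t_R$. In this case, the usefulness of the classical side information in the spectator-based recovery in Fig. 2 is no longer clear. Instead, this noise regime might benefit from continuously applied recovery, e.g. [56]. Alternatively, a robustness approach (as opposed to adaptation) might also be suitable (see below).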
4. When the noise family {N θ } θ∈Θ is changing within a single recovery cycle.This noise regime will mainly benefit from the design of robust recovery protocols, e.g. in [57][58][59][60], rather than the spectator-based recovery presented in Fig. 2.
In the rest of the article, we focus on the stroboscopic noise regime, and consider the advantages and limitations of using the spectator-based recovery protocol in Fig. 2.
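To measure the success of the recovery protocol, we recall that the goal is to achieve a complete (or, at least, an approximate) recovery $\mathcal{R}\circ\mathcal{N}_\theta(\rho) = \rho$ of the noisy channel $\mathcal{N}_\theta$ for a subset of states $\rho \in \mathcal{D}(\mathcal{C}) \subset \mathcal{D}(\mathcal{H})$ in the codespace $\mathcal{C}$. The "optimality" of the recovery map $\mathcal{R}$ for a given noisy channel $\mathcal{N}_\theta$ could be quantified in various ways. Motivated by Lemma 2 for the diamond distance (and more generally Theorem 4 for all distinguishability measures), we choose the entanglement fidelity to be the quantifier of the optimal recovery, i.e.

$$\mathcal{R}_\theta := \arg\max_{\mathcal{R}} F_e(\mathcal{R}\circ\mathcal{N}_\theta), \tag{40}$$

where the maximization is over all CPTP recovery maps $\mathcal{R}$.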
Indeed, entanglement fidelity has been used as a figure of merit for QEC in e.g. [12,14,46,57,58,61]. We note that the concatenated form $\mathcal{R}\circ\mathcal{N}_\theta$ of the memory dynamics presupposes that the recovery map $\mathcal{R}$ is applied much faster than the noisy dynamics. In what follows, we shall compare two different scenarios:
• Optimal recovery scenario (Fig. 1), which corresponds to the optimal choice of the recovery map $\mathcal{R}_\theta$ (as defined in Eq. (40)) for the noisy channel $\mathcal{N}_\theta$, where the value of $\theta \in \Theta$ is completely known.
• Best-guess recovery scenario (Fig. 2), which corresponds to the optimal recovery choice $\mathcal{R}_{\hat\theta}$ (as defined in Eq. (40)) for the estimated noisy channel $\mathcal{N}_{\hat\theta}$, where $\hat\theta$ is the best estimate of $\theta$. The latter is defined to be the minimum variance unbiased estimator (MVUE).
Remark 2. Not all QEC codes require a recovery channel $\mathcal{R}_\theta$ that depends on the noise parameter $\theta$. This is the case for Pauli channels, where the recovery operation is fully determined by a subset of Pauli operators from the general Pauli group, as is known in the stabilizer formalism of QEC [37,62]. Therefore, in what follows, we consider non-Pauli channels, of which the generalized amplitude damping channel is a prime example [37].

A General Framework For Spectator-Based Recovery Protocols
In this section, we formulate the spectator-based recovery protocol (Fig. 2) as the combination of two consecutive processes: (i) parameter estimation of the memory noise, using the spectator system, and (ii) application of the corresponding "best-guess" recovery map (informed by the estimated value of the noise parameter(s), rather than the true value) to the quantum memory.It is important to emphasize that the parameter estimation stage of spectator-based recovery is more resource efficient than a direct process tomography of the noise.This is due to the prior knowledge of the parametric family of noisy quantum channels, either from physical consideration, or even initial process tomography.This efficiency is especially important for operating a real-time quantum memory, which requires such resources to be replenished after each recovery cycle.
To set up a general framework for spectator-based recovery protocols, we start by considering the origins of the reduced memory ($M$) and spectator ($S$) dynamics in the general multi-parameter regime. In what follows, $MS$ denotes the mother system for both the quantum memory $M$ and the spectator system $S$, and its Hilbert space has the tensor product structure $\mathcal{H}_{MS} = \mathcal{H}_M \otimes \mathcal{H}_S$. Since both systems are subject to the same local environment, we assume that there is a mother channel $\mathcal{Z}^{MS}_{\theta}$ (with a noise parameter vector $\theta \in \Theta^p$, where $\Theta^p$ is a $p$-dimensional parameter space) that acts on the memory and spectator systems collectively, as shown in Fig. 4. Without loss of generality, we assume that the components of the multi-parameter vector $(\theta_1, \cdots, \theta_p) \in \Theta^p$ are treated as independent parameters, namely that none of the components can be expressed as a function of the rest.
Otherwise, for every such constraint, the number of independent parameters is reduced by one. The mother channel yields the reduced dynamics

$$\rho_{MS} \mapsto \mathrm{Tr}_S\left[\mathcal{Z}^{MS}_{\theta}(\rho_{MS})\right], \tag{41}$$

$$\rho_{MS} \mapsto \mathrm{Tr}_M\left[\mathcal{Z}^{MS}_{\theta}(\rho_{MS})\right], \tag{42}$$

where the partial tracing with respect to $S$ in the first equation defines the partition of the global set of noise parameters $\theta$ into a subset of relevant parameters $\theta_M \in \Theta^{p_M}$ for the reduced dynamics of $M$ (where in general $p_M \leq p$) and a complementary set of parameters $\theta_{M^\perp} \in \Theta^{p - p_M}$ that is irrelevant for the reduced dynamics. A similar partition of $\theta = \{\theta_S, \theta_{S^\perp}\}$ into relevant and irrelevant parameters for the spectator dynamics occurs when applying the partial trace with respect to $M$ to the mother channel (please see Appendix D for details on when this is possible). With these two natural partitions $\{\theta_M, \theta_{M^\perp}\}$ and $\{\theta_S, \theta_{S^\perp}\}$ of the parameter vector $\theta$, we can also decompose the latter via a joint partition, as follows

$$\theta = \{\theta_S \cap \theta_M,\ \theta_S \cap \theta_{M^\perp},\ \theta_{S^\perp} \cap \theta_M,\ \theta_{S^\perp} \cap \theta_{M^\perp}\}.$$

From this joint partition, it is clear that the spectator system $S$ can only help estimate the subset of memory noise parameters $\theta_{S_I} \equiv \theta_S \cap \theta_M \subseteq \theta_M$ (where the subscript "I" stands for "interest" parameters, as opposed to the "nuisance" parameters $\theta_{S_N} \equiv \theta_S \cap \theta_{M^\perp} \not\subseteq \theta_M$ of the spectator dynamics [63], which are irrelevant for the memory dynamics). Therefore, we see that it is necessary to have $\Theta^{p_M} \subseteq \Theta^{p_S}$ for the quantum estimation task via the spectator system to be useful for identifying the relevant noise parameters $\theta_M$ which affect the quantum memory. If this is not the case, then one would require multiple spectators $S_1, S_2, \cdots, S_l$ (from potentially different physical systems) such that the parameter space $\Theta^{p_M}$ is contained in the combined parameter subspaces $\cup_{i=1}^{l}\Theta^{p_{S_i}}$. For simplicity, we assume in the rest of the article that $p_{S_I} = p_M$.

In general, the reduced states of $S$ and $M$ following the application of Eqs. (41) and (42) will depend on the global input state of $MS$. This observation still holds even if the input state of $MS$ is of product form $\rho_{MS} = \rho_M \otimes \psi_S$. Therefore, to arrive at local noise channels $\mathcal{N}^{M}_{\theta_M}$ and $\mathcal{M}^{S}_{\theta_S}$ (as shown in Fig. 2) that are independent of the input states of $S$ and $M$, respectively, the mother channel itself has to be separable. Namely,

$$\mathcal{Z}^{MS}_{\theta} = \mathcal{N}^{M}_{\theta_M} \otimes \mathcal{M}^{S}_{\theta_S},$$

similar to the independent noise approximation that is commonly used in the QEC literature. The input state $\rho_{MS} = \rho_M \otimes \psi_S$ is selected such that $\rho_M$ is the state of the quantum memory that we would like to protect from the noise, whereas $\psi_S$ is the probe state of the spectator system, which we are technically free to choose to achieve the optimal precision in parameter estimation.
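The joint partition of the noise parameters into "interest" and "nuisance" subsets can be mimicked with plain set operations. The sketch below is purely illustrative: the parameter labels (`"relaxation_rate"`, `"dephasing_rate"`, `"laser_detuning"`) and the function name are hypothetical and do not correspond to a specific physical system in the article.

```python
def partition_parameters(theta_memory, theta_spectator):
    """Split the spectator-visible parameters into interest and nuisance sets."""
    theta_memory = set(theta_memory)
    theta_spectator = set(theta_spectator)
    interest = theta_spectator & theta_memory    # theta_S intersect theta_M
    nuisance = theta_spectator - theta_memory    # theta_S intersect theta_M-perp
    unobserved = theta_memory - theta_spectator  # memory parameters the spectator cannot see
    return interest, nuisance, unobserved

# Hypothetical example: the spectator sees every memory parameter plus one extra.
theta_m = {"relaxation_rate", "dephasing_rate"}
theta_s = {"relaxation_rate", "dephasing_rate", "laser_detuning"}
interest, nuisance, unobserved = partition_parameters(theta_m, theta_s)
print("interest  :", sorted(interest))
print("nuisance  :", sorted(nuisance))
print("unobserved:", sorted(unobserved))
```

An empty `unobserved` set corresponds to the condition that every memory parameter is accessible to the spectator; a non-empty one signals that additional spectators would be needed.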
In Appendix E, we review relevant aspects of multiparameter quantum estimation theory.It is known that when there are no nuisance parameters present (p S I = p S , p S N = 0), the quantum estimation limit of the parameters θ S I ≡ θ M is given by the partial symmetric logarithmic derivative (SLD) quantum Fisher information matrix (QFIM), which has dimensions of p S × p S .However, when p S N = p S − p S I > 0 nuisance parameters are present, then the quantum estimation limit will be given by the p S I × p S I partial SLD QFIM, which comprises a tighter lower bound on the estimation variance Var( θS I ) than the p S I × p S I standard SLD QFIM of the parameters θ S I .It is known that these two quantities are equal only if the nuisance parameters are informationally orthogonal to the parameters of interest [64], namely when the SLD QFIM of θ S block diagonalizes, with blocks of dimensions p S I and p S N = p S − p S I , respectively.Furthermore, it has been shown in [63] that, in the single parameter regime p M = 1 (p S I = 1, p S N = p S − 1), the lower bound in the variance of any locally unbiased estimator θS I is achievable via an optimal measurement constructed from the eigenprojectors of the SLD operators of {M θ S (ψ)} θ S .A similar optimal measurement construction achieving the QCRB is not known for p M > 1 in the presence of nuisance parameters.However, when the nuisance parameters are absent, the above optimal measurement saturates the QCRB for any p M ≥ 1 if and only if the SLD operators of different parameters commute [65].In the rest of the article, we consider the single parameter case p M = 1.
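The effect of nuisance parameters on the attainable precision can be illustrated with a small numerical sketch. It assumes NumPy and uses the Schur-complement form of the partial Fisher information, $J_{I|N} = J_{II} - J_{IN} J_{NN}^{-1} J_{NI}$, which is the expression commonly used in the nuisance-parameter literature; whether it matches the exact convention of Appendix E is an assumption. The QFIM entries below are made-up numbers.

```python
import numpy as np

def partial_qfim(qfim, n_interest):
    """Schur complement of the nuisance block of a Fisher information matrix.

    The first `n_interest` rows/columns are the parameters of interest and
    the remaining ones are nuisance parameters.
    """
    j = np.asarray(qfim, dtype=float)
    j_ii = j[:n_interest, :n_interest]
    j_in = j[:n_interest, n_interest:]
    j_nn = j[n_interest:, n_interest:]
    if j_nn.size == 0:  # no nuisance parameters: nothing to remove
        return j_ii
    return j_ii - j_in @ np.linalg.solve(j_nn, j_in.T)

correlated = [[4.0, 1.0], [1.0, 2.0]]   # interest and nuisance parameters overlap
orthogonal = [[4.0, 0.0], [0.0, 2.0]]   # informationally orthogonal case

print(partial_qfim(correlated, 1))   # smaller than the (1,1) entry
print(partial_qfim(orthogonal, 1))   # equal to the (1,1) entry
```

In the correlated example the partial information is smaller than the corresponding diagonal entry, so the resulting bound on the variance is larger (tighter); the two agree only when the off-diagonal block vanishes, in line with the informational-orthogonality condition discussed above.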
Finally, we would like to point out that the main theorems of this manuscript are generalizable to the multiparameter setting p M > 1, however, a full consideration of all its nuances are left for future work.This includes the incompatibility of different parameters [66][67][68], which is an exclusive problem to the multiparameter regime.Further, it is known that the QCRB is a less tight version of the Holevo Cramér-Rao bound (HCRB) in multi-parameter quantum estimation theory.The latter is known to be efficiently computable [4,69], but it is generally saturated for collective measurements over different probes.Instead, if one is interested in local measurements (which is more practical for parameter estimation), then the Nagaoka-Hayashi (NH) bound [70] is the relevant bound, which is also efficiently computable [71].A recent work [72] shows how these Cramér-Rao type bounds are unified under the umbrella of conic linear programming.

Spectator Dynamics With No Nuisance Parameters
In the introduction, as well as Figs. 2 and 6, we have emphasized the fact that the spectator system need not be the same physical system as the computational or memory system. Consequently, the dynamics of the spectator system $\mathcal{M}_\theta$ is generally different from the dynamics $\mathcal{N}_\theta = \bigotimes_i \mathcal{N}^{(i)}_\theta$ of the memory, where $\mathcal{N}^{(i)}_\theta$ is a quantum channel acting on the $i$-th subsystem. Note that we assumed negligible spatial variability of the noise parameter $\theta$. The noise separability assumption need not mean that the qubit noises are uncorrelated, as classical correlation between the noise parameters experienced by different qubits is still possible in principle. The above separability assumption only means that the noise correlations are classical and hence non-entangling.

Spectator Qubits As Memory Qubits With Controllable Environment Coupling
To perform recovery with incomplete knowledge, we need to hypothesize a relation between the spectator and memory qubit dynamics, i.e. $M^{(1)}_\theta$ and $N^{(1)}_\theta$. Since both types of qubits are subject to the same noisy environment with potentially different coupling strengths, we hypothesize the relation in Eq. (46), where $f(\theta) \in [0, 1]$ is a monotone increasing function of its argument. To justify this choice, consider the case where $\theta$ has the form $\theta = 1 - \exp(-t/T_1)$, where $T_1$ is the spin relaxation time [15,18]. This is the case e.g. for the qubit amplitude-damping channel, which we consider both in Section 4 and Section 6. Then, by expressing $t/T_1$ in terms of $\theta$, we arrive at Eq. (48), which fixes $f$ in terms of a single parameter $\gamma$. The requirement that the spectator qubits should exhibit faster dynamics than the memory qubits translates to $\gamma > 1$. Eq. (46) could also be viewed from the point of view of recent progress in quantum control, e.g. via Hamiltonian amplification in bosonic systems [73] or decoherence control in NV centers [74], which yields controllable qubit coupling strengths. If such qubits are used to build a quantum memory, then a portion of these qubits could be reserved as spectators, and the environment coupling could be adjusted to maximize the sensitivity of the spectator qubits (see below).
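The relation between the memory and spectator parameters can be sketched numerically. The snippet below is only an illustration under stated assumptions: it takes $\theta = 1 - \exp(-t/T_1)$ for both qubit types and assumes that the spectator simply relaxes $\gamma$ times faster, which gives the effective parameter $1 - (1-\theta)^\gamma$. The function names and the numerical values of the relaxation times are hypothetical.

```python
import math

def theta_from_time(t, t1):
    """Amplitude-damping parameter theta = 1 - exp(-t / T1)."""
    return 1.0 - math.exp(-t / t1)

def f_gamma(theta, gamma):
    """Effective spectator parameter, ASSUMING theta_spec = 1 - (1 - theta)**gamma."""
    return 1.0 - (1.0 - theta) ** gamma

# Hypothetical numbers: memory T1 = 100, spectator T1 = 25, elapsed time 5
# (all in the same arbitrary time unit).
T1_MEMO, T1_SPEC, ELAPSED = 100.0, 25.0, 5.0
GAMMA = T1_MEMO / T1_SPEC  # gamma > 1: the spectator relaxes faster

theta_memo = theta_from_time(ELAPSED, T1_MEMO)
theta_spec = theta_from_time(ELAPSED, T1_SPEC)

# The spectator parameter computed directly agrees with f_gamma(theta_memo).
print(theta_memo, theta_spec, f_gamma(theta_memo, GAMMA))
```

Under these assumptions the map is monotone increasing on $[0, 1]$ and fixes the endpoints $0$ and $1$, in line with the properties required of $f$ in the text.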

Quantifying The Physical Choice of Spectator Systems
Given that a particular physical medium (e.g. a spin lattice with multiple spin species A, B, C, ...) is populated with memory qubits (say, spin species A), a natural question is whether the spectator qubits should be chosen from the same or a different spin species. This question is especially relevant for QEC in hybrid spin registers, e.g. in diamond [75,76]. To answer this question, we recall that in quantum estimation theory, the sensitivity of the spectator qubit dynamics $\psi \to M_\theta(\psi)$ to the parameter $\theta$ is quantified by the QFI $I_{QF}(M_\theta(\psi))$. The sensing advantage of using a spectator qubit of a different species than the memory qubits is then determined by the ratio SM of the spectator QFI $I_{QF}(M_\theta(\psi))$ to the QFI of the memory-qubit dynamics (which we call the spectator multiplier). Following the QCRB (please see Appendix E for a self-contained review of QFI and QCRB), $n' \equiv \mathrm{SM} \times m$ indicates the equivalent number of memory-species qubits used for sensing. In the case where our hypothesis in Eq. (46) holds, we can use the change-of-parameter property of the QFI [65], which yields $\mathrm{SM} = 1/(df/d\theta)^2$. Therefore, the optimal spectator, in this case, is the one that experiences an effective parameter $\theta_{\mathrm{eff}} \equiv f(\theta)$ with $f'(\theta) = 0$ at the actual value of the parameter (hence effectively yielding an asymptotic estimation regime $n' \to \infty$ for a finite $m$). Realistically, since we do not have prior knowledge of the noise parameter $\theta$, the ideal $f(\theta)$ will be mostly constant, i.e. $f'(\theta) = 0$ (at least in the relevant variability range of $\theta$), with $f(0) = 0$ and $f(1) = 1$.
Finally, we note that the ratio SM quantifies the relative "speed" of the dynamics between the spectator and the memory qubits, following the relation between QFI and the Bures distance between two consecutive "instances" of a channel [77]. This is important for the spectator-based recovery protocol, as the timely feedforward control of the memory qubits based on the classical side information from the spectator is crucial.

Information-Theoretic Bounds
Here we consider the metrological bounds associated with the spectator-based recovery protocol, following Lemma 2 of the previous section.

Fundamental Limitations For All Recovery Protocols
Assume that we are given a noise channel $N^{A \to B}$. The resulting bound is the lower half of a two-sided bound on the diamond distance between a quantum channel and the identity channel, recently derived in [40]. This lower bound holds for both the perfect and the incomplete knowledge recovery scenarios in Figs. 1 and 2, respectively. Next, we consider the incomplete knowledge scenario and analyze the contribution of the spectator system to the lower bound.

Metrological Cost of Spectator-Based Recovery Protocols
Consider the scenario described in Fig. 2, which is what we expect for real-time quantum memories. The best estimate $\hat\theta$ of the unknown $\theta \in \Theta$ is found by the spectator system for each time interval over which the value of the stroboscopic (slowly varying) variable $\theta$ is approximately constant. The characteristic timescale of the stroboscopic noise parameter $\theta$ should be larger than the combined characteristic times of the noise $N_\theta$ and the best-guess recovery $R_{\hat\theta}$ dynamics. Consequently, the relevant total dynamics of the encoded system is given by $R_{\hat\theta} \circ N_\theta$. Compared to the ideal case where the noise parameter is known perfectly, the metrological cost associated with the adaptation of spectator-based quantum memories is given by $\frac{1}{2}\lVert R_{\hat\theta} - R_\theta \rVert_\diamond$, which follows from its definition.
We start by providing a lower bound to the right-hand side of Eq. (54) for arbitrary finite estimation errors $\hat\theta - \theta$, as follows.
Theorem 1. Consider a parameterized noise channel $N^{A \to B}_\theta$, and two arbitrary recovery maps $R^{B \to A}$ and $\tilde{R}^{B \to A}$. Then, the difference between the corresponding entanglement fidelities of recovery is lower bounded in terms of the Choi states of the individual quantum channels $N_\theta$, $R$, and $\tilde{R}$, where $\alpha \in [0, 1)$ and $1/\alpha + 1/\beta = 1$ defines the Hölder dual $\beta < 0$ of $\alpha$.
Proof. Please see Appendix F.3.
Applying this theorem to the optimal $R_\theta$ and best-guess $R_{\hat\theta}$ recovery maps, defined via the optimization in Eq. (40), yields a lower bound to the right-hand side of Eq. (54) for arbitrary estimation errors $\hat\theta - \theta$.
In this article, we will mainly be interested in the case of small estimation error $\hat\theta - \theta$. Again, this is motivated by physical considerations. For example, temporal variations of the $T_1$ and $T_2$ times in superconducting qubits have been studied extensively, e.g. in [16,17]. It was observed that such non-negligible temporal variations occur on timescales much longer ($\sim 1$ second) than the timescale of a single QEC cycle. Therefore, it is sensible to consider the case where the noise parameter does not vary considerably within a single QEC cycle. We hence expect $\hat\theta - \theta$ to also be small for each QEC cycle, with its effect accumulating only on the timescales of multiple QEC cycles, as observed e.g. in [44]. Hence, in the rest of the manuscript, we provide lower bounds for the spectator-based recovery protocol in this limit, unless stated otherwise.
We now present the main result of the article, which describes a metrological lower bound to the performance of the spectator-based recovery protocol in Fig. 2, compared to the perfect knowledge case in Fig. 1. For simplicity, this result applies when no nuisance parameters are present. A discussion of how nuisance parameters impact the lower bound is given later in Section 5.4.

Theorem 2. Consider a parameterized noise channel $N^{A \to B}_\theta$, and the corresponding optimal $R^{B \to A}_\theta$ and best-guess $R^{B \to A}_{\hat\theta}$ recovery maps. For small deviations $\hat\theta - \theta$ of the locally unbiased estimate $\hat\theta$ from the true value $\theta$, the difference between the corresponding entanglement fidelities of recovery is lower bounded as in Eq. (56), where $g(\theta)$ is defined in Eq. (57). Further, $I_{QF}(M_\theta(\psi))$ denotes the QFI of the spectator dynamics $M_\theta(\psi)$ (initialized in state $\psi$) and $E(\cdot)_{p(x|\theta)}$ denotes the expectation with respect to the spectator's measurement statistics $p_X(x|\theta)$, corresponding to the measurement outcomes $x \in X$ of the spectator observable.
Proof. The first part of the proof follows directly from Taylor expanding the entanglement fidelity $F_e(R_{\theta+\nu} \circ N_\theta)$ with respect to the difference $\nu \equiv \hat\theta - \theta$ to second order, and using the Lagrange form for the remainder, where $\nu_0 \in [0, \nu]$ is a constant. Taking the expectation $E[\cdot]_{p(x|\theta)}$ of both sides with respect to the spectator's measurement statistics $p_X(x|\theta)$ of the observable $X$, and recalling that $\hat\theta$ is a locally unbiased estimate of $\theta$ (and hence the QCRB applies), we arrive at Eq. (61). This yields Eq. (56) of our theorem. To prove Eq. (57) of this theorem, we first show in Lemma 8 of Appendix F that the entanglement fidelity of the composite dynamics $R_{\theta+\nu} \circ N_\theta$ of the memory qubit is given by the individual Choi states of the noise and recovery maps. Then, Eq. (57) of our theorem is a direct consequence of differentiating this entanglement fidelity formula twice with respect to $\nu$ (assuming the Choi state of the optimal recovery map $R_\theta$ is twice differentiable) for a fixed $\theta$.
We note that the saturation of this lower bound is based on the saturation of the QCRB, and is discussed in Appendix E. Moreover, although Theorem 2 is true for any input state $\psi$ of the spectator system, we would like to use the optimal probe state for the corresponding dynamics $M_\theta$. Furthermore, a sufficient condition for the remainder term in Eq. (56) to be negligible is presented in Appendix G. Finally, we note that this theorem could also be interpreted from the information-geometric perspective, see e.g. [65,78].
Remark 3. Due to the tensor product property of the QFI, $I_{QF}(\sigma_\theta^{\otimes m}) = m\, I_{QF}(\sigma_\theta)$, the QCRB yields zero variance only in the asymptotic limit. However, in realistic spectator-based recovery protocols (described by Fig. 2), the asymptotic limit (i.e. implementing $m \to \infty$ spectator qubits) will necessarily mean that spatial variations of the noise parameter $\theta$ affecting the spectator qubits cannot be neglected. Therefore, we limit ourselves to the non-zero QCRB variance (finite sample case) and instead attempt to saturate this bound via optimal measurements and an optimal initial spectator state (see Appendix E.2).
Finally, note that the spectator's contribution separates into the product of two functions: the first, $g(\theta)$, depends on the full dynamics of the memory system, and the second, $\mathrm{Var}(\hat\theta)$, depends on the full dynamics of the spectator system. The function $g(\theta)$ can be computed analytically for simple single-qubit channels, such as the amplitude-damping channel [79].
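The variance factor in this product can be illustrated with a small exact calculation. The sketch below rests on assumptions that are not taken from the theorem itself: $m$ spectator qubits are prepared in $|1\rangle$, undergo amplitude damping with parameter $\theta$ (i.e. $\gamma = 1$), and are measured in the computational basis, so that the number $k$ of decayed qubits is binomial and $\hat\theta = k/m$. The quadratic fidelity model and the `curvature` constant are purely hypothetical stand-ins for the memory-dependent factor.

```python
import math

def estimator_moments(theta, m):
    """Exact mean and variance of theta_hat = k/m with k ~ Binomial(m, theta)."""
    mean = 0.0
    second = 0.0
    for k in range(m + 1):
        prob = math.comb(m, k) * theta**k * (1.0 - theta) ** (m - k)
        estimate = k / m
        mean += prob * estimate
        second += prob * estimate * estimate
    return mean, second - mean * mean

def qcrb(theta, m):
    """1 / (m * I), ASSUMING the single-qubit QFI I = 1 / (theta * (1 - theta))."""
    return theta * (1.0 - theta) / m

def expected_fidelity_loss(theta, m, curvature):
    """Toy model F(nu) = F(0) - curvature * nu**2, so the mean loss is curvature * Var."""
    _, variance = estimator_moments(theta, m)
    return curvature * variance

mean, variance = estimator_moments(0.2, 5)
print(mean, variance, qcrb(0.2, 5))
```

In this toy setting the estimator is unbiased and its variance coincides with the assumed Cramér-Rao value for every finite $m$, so the expected loss shrinks as $1/m$ while remaining non-zero, in the spirit of Remark 3.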

Comparison With The Non-Adaptive Case
As discussed previously in Section 4.1, the relative advantage of implementing a spectator-based recovery protocol depends on the characteristics of the noise. For example, if the noise is completely static, then implementing a spectator-based recovery will always be worse than simply characterizing the noise before the experiment (e.g. via process tomography), as there is no advantage to real-time quantum sensing of the noise, where only sparse data is available. Here, we provide a sufficient condition for a spectator-based adaptive protocol to outperform other (non-adaptive) recovery protocols for arbitrary finite estimation error $\hat\theta - \theta$.
Consider some parameter variation $\theta_{n-1} \to \theta_n$ from the $(n-1)$-th to the $n$-th QEC cycle. This corresponds to the change in the noise $N_{\theta_{n-1}} \to N_{\theta_n}$. In a spectator-based recovery protocol, this change is tracked via the spectator system, and the best-guess recovery is updated accordingly, $R_{\hat\theta_{n-1}} \to R_{\hat\theta_n}$. The performance of this protocol is hence quantified by the entanglement fidelity $F_e(R_{\hat\theta_n} \circ N_{\theta_n})$. On the other hand, a non-adaptive protocol applies a recovery map $R_{\theta_{n-1}}$ that is (in the ideal case) optimal for the previous noise channel, i.e. $N_{\theta_{n-1}}$. The performance of an ideal non-adaptive protocol is hence quantified by the entanglement fidelity $F_e(R_{\theta_{n-1}} \circ N_{\theta_n})$. Therefore, to find a sufficient condition for adaptation to yield an advantage, we need to find a non-negative lower bound to the difference between these two entanglement fidelities. We accomplish this by using Theorem 1 and Eq. (54). The result is a sufficient (and initial-state-independent) condition for the spectator-based recovery protocol to be advantageous for an arbitrary estimation error $\hat\theta - \theta$, compared to an ideal non-adaptive protocol, which involves the constant $c \equiv \lVert \Phi^{N_{\theta_n}} \rVert_\beta$. It is easy to see that if there is no change, i.e. $\theta_n = \theta_{n-1}$, then this condition is not satisfied. On the other hand, for a general (Markovian) stochastic jump model, with a $\theta_{n-1} \to \theta_n$ transition probability of $p(\theta_n|\theta_{n-1})$, there is a range of values $\Theta_{\mathrm{spec}} \subseteq \Theta$ of the noise parameter $\theta_{n-1} \in \Theta$ such that the spectator-based recovery exhibits an advantage. Furthermore, this advantage will accumulate over multiple QEC cycles. As an example, in Fig. 5, we plot the fidelity gain in Eq. (65) for the [4,1] amplitude-damping code (see the following section), showcasing the region of outperformance of the spectator-based recovery protocol over standard QEC [12] for a stroboscopically varying noise $\theta_{n-1} \to \theta_n$. For more details, please see Appendix H.
[Figure: ... Eqs. (71) and (72), where $H_2$ denotes the two-dimensional Hilbert space of a single-qubit system. The spectator system (first register) performs an unbiased estimate $\hat\theta(x)$ of the noise parameter $\theta$ using the POVM $\{\Pi_x\}$. The estimated value $\hat\theta$ is fed into the recovery operation described in detail in [12] that is adapted for the amplitude-damping channel.]

Application to The [4, 1] Code of The Amplitude-Damping Channel
In what follows, we derive the entanglement fidelity $F_e(R_{\hat\theta} \circ N_\theta)$ for the [4,1] code of the amplitude-damping (AD) channel analytically, following the approach developed in [80], and extending the derivation in [79] to the incomplete knowledge recovery scenario. It is worth noting that analytical approaches to the AD channel have also been taken previously, e.g. in [81,82].
Since the AD channel is covariant with respect to the group $\{I, Z\}$, the Eastin-Knill theorem [83] guarantees that no perfect QEC codes exist. However, approximate codes for the AD channel were developed in [13], and channel-adapted codes, where the recovery depends on the value of the noise parameter, were developed later in [12]. These techniques have also been extended beyond the [4,1] code towards more general $[2(k+1), k]$ codes [12,13] (where $k$ logical qubits are encoded into $n = 2(k+1)$ physical/memory qubits).

The Amplitude-Damping Channel
The single-qubit AD channel is defined as $N_\theta(\rho) = N_0 \rho N_0^\dagger + N_1 \rho N_1^\dagger$, where $N_0 = |0\rangle\langle 0| + \sqrt{1-\theta}\,|1\rangle\langle 1|$ and $N_1 = \sqrt{\theta}\,|0\rangle\langle 1|$. The Kraus operators $N_0$, $N_1$ are often called the "no-damping" and "damping" errors, respectively. Here, the noise parameter $\theta(t) = 1 - \exp(-t/T_1)$ depends on time $t$ and the relaxation time $T_1$ [18,37]. We follow the usual notation in quantum information, where the dependence of the noisy channel (and hence also the noise parameter) on time is suppressed.
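As a quick sanity check of this definition, the following sketch builds the two Kraus operators in their standard matrix form, verifies the completeness relation, and applies the channel to the excited state. It uses plain nested lists rather than a linear-algebra library; all helper names are hypothetical.

```python
import math

def ad_kraus(theta):
    """Standard Kraus operators of the single-qubit amplitude-damping channel."""
    n0 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - theta)]]  # "no-damping" error
    n1 = [[0.0, math.sqrt(theta)], [0.0, 0.0]]        # "damping" error
    return [n0, n1]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def completeness(kraus):
    """Sum of K^dagger K over all Kraus operators; equals the identity for a channel."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        total = add(total, matmul(dagger(k), k))
    return total

def apply_channel(kraus, rho):
    """rho -> sum_k K rho K^dagger."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        out = add(out, matmul(matmul(k, rho), dagger(k)))
    return out

excited = [[0.0, 0.0], [0.0, 1.0]]  # |1><1|
print(completeness(ad_kraus(0.3)))
print(apply_channel(ad_kraus(0.3), excited))
```

The excited state is mapped to a diagonal state with populations $\theta$ and $1-\theta$, which is why $\theta$ can be read as the decay probability.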

The Approximate [4,1] Code
Assuming an independent noise model, we recall the encoding $E : D(H) \to D(C)$ of the [4,1] code [13] from a 1-qubit physical state to a 4-qubit logical state, which maps $|0\rangle \to |0_L\rangle = (|0000\rangle + |1111\rangle)/\sqrt{2}$ and $|1\rangle \to |1_L\rangle = (|0011\rangle + |1100\rangle)/\sqrt{2}$. The encoded Pauli operators $\sigma_{\mathrm{enc}} = E(\sigma)$ for $\sigma \in \{I, X, Y, Z\}$ act, by definition, only on states in the codespace $C$. However, in the stabilizer formalism, the logical Pauli operators $I_L$, $X_L$, $Y_L$, and $Z_L$ are defined on the full 4-qubit Hilbert space. For example, the generators of the stabilizer set for the [4,1] code are given by $S = \{S_j\}_{j=1}^{3} = \{XXXX, ZZII, IIZZ\}$, along with the logical Pauli operators $X_L = XXII$, $Y_L = YXZI$, and $Z_L = ZIZI$. The link between the encoded and logical Pauli operators is found by restricting the action of the latter to the codespace. Namely, $\sigma_{\mathrm{enc}} = E(\sigma) = \Pi \sigma_L$, where $\Pi = \prod_{j=1}^{3} (I + S_j)/2$ is the projection onto the codespace $C$ corresponding to the set of stabilizers [80].
We define the noisy channel $N_\theta$ to be the physical noise experienced by the four physical qubits in the code. The recovery is taken from Table 1 of [12], which is the channel-adapted recovery of the AD channel (see Fig. 7(b) for the performance of this recovery).
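The stabilizer generators and logical operators quoted above can be checked mechanically. The sketch below assumes the standard codewords of the four-qubit code of [13], $|0_L\rangle \propto |0000\rangle + |1111\rangle$ and $|1_L\rangle \propto |0011\rangle + |1100\rangle$; the state representation (a dictionary from bit tuples to amplitudes) and all function names are illustrative choices.

```python
def commute(p, q):
    """Two Pauli strings commute iff they clash (differ, both non-identity)
    on an even number of positions."""
    clashes = sum(1 for a, b in zip(p, q) if "I" not in (a, b) and a != b)
    return clashes % 2 == 0

def apply_pauli(pauli, state):
    """Apply a Pauli string to a state stored as {bit tuple: amplitude}."""
    out = {}
    for bits, amp in state.items():
        phase, new_bits = 1, []
        for op, b in zip(pauli, bits):
            if op == "Z":
                phase *= -1 if b else 1
            elif op == "Y":
                phase *= 1j if b == 0 else -1j
            new_bits.append(1 - b if op in "XY" else b)
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + phase * amp
    return out

STABILIZERS = ["XXXX", "ZZII", "IIZZ"]
LOGICALS = {"X_L": "XXII", "Y_L": "YXZI", "Z_L": "ZIZI"}

A = 2 ** -0.5
ZERO_L = {(0, 0, 0, 0): A, (1, 1, 1, 1): A}  # assumed codeword |0_L>
ONE_L = {(0, 0, 1, 1): A, (1, 1, 0, 0): A}   # assumed codeword |1_L>

for name, op in LOGICALS.items():
    print(name, [commute(op, s) for s in STABILIZERS])
print(apply_pauli("XXII", ZERO_L) == ONE_L)
```

Under these assumptions, every generator leaves both codewords invariant, $X_L$ exchanges them, and $Z_L$ flips the sign of $|1_L\rangle$, which is the behavior expected of logical Pauli operators restricted to the codespace.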

Entanglement Fidelity
In Appendix H, we analytically calculate the numerator $g(\theta)$ in Theorem 2 for the [4,1] AD code; the result is Eq. (79). It has been shown that the RLD QFI for the AD channel diverges (see the example discussed in [84] for generalized AD channels); however, the SLD QFI is finite and known [85]. Consequently, a spectator qubit that satisfies the condition in Eq. (46) has the SLD QFI of Eq. (80), where $f_\gamma(\theta)$ is given by Eq. (48) for the AD code (or more generally, by Eq. (46)). If the spectator system is made out of $m$ qubits, then the QFI scales linearly with $m$ due to the additivity property $I_{QF}(\sigma_\theta^{\otimes m}) = m\, I_{QF}(\sigma_\theta)$, assuming an independent noise model (see Eq. (45)). Note that we can only realistically improve the QCRB to a certain degree by increasing $m$ without dropping the assumption of negligible spatial variability of the noise parameter $\theta$ [25].
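To make the scaling of the spectator QFI with $\gamma$ and $m$ concrete, the sketch below evaluates it under explicit assumptions that need not coincide with Eq. (80): the spectator is prepared in $|1\rangle$ and experiences amplitude damping with an effective parameter $f_\gamma(\theta) = 1 - (1-\theta)^\gamma$, so its output state is diagonal with populations $(f_\gamma, 1 - f_\gamma)$. For such a diagonal family the SLD QFI reduces to the classical Fisher information $(\partial_\theta p)^2 / (p(1-p))$. The closed form in `qfi_closed_form` is derived here from these assumptions only.

```python
def f_gamma(theta, gamma):
    """ASSUMED effective spectator parameter 1 - (1 - theta)**gamma."""
    return 1.0 - (1.0 - theta) ** gamma

def qfi_numeric(theta, gamma, step=1e-6):
    """QFI of diag(p, 1 - p) with p = f_gamma(theta), via a central difference."""
    p = f_gamma(theta, gamma)
    dp = (f_gamma(theta + step, gamma) - f_gamma(theta - step, gamma)) / (2.0 * step)
    return dp * dp / (p * (1.0 - p))

def qfi_closed_form(theta, gamma):
    """gamma^2 (1 - theta)^(gamma - 2) / (1 - (1 - theta)^gamma), from the same assumptions."""
    return gamma**2 * (1.0 - theta) ** (gamma - 2.0) / (1.0 - (1.0 - theta) ** gamma)

def qfi_m_qubits(theta, gamma, m):
    """Additivity of the QFI for m independent, identical spectator qubits."""
    return m * qfi_closed_form(theta, gamma)

for gamma in (1.0, 2.0, 3.0):
    print(gamma, qfi_numeric(0.2, gamma), qfi_closed_form(0.2, gamma))
```

For $\gamma = 1$ the closed form reduces to $1/(\theta(1-\theta))$, and multiplying by $m$ reproduces the linear scaling mentioned in the text.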
Combining Eqs. (79) and (80) for the [4,1] AD code with Theorem 2 yields Eq. (81) (for small $\hat\theta - \theta$). Therefore, the contribution of the spectator system to the entanglement fidelity of the [4,1] code of the amplitude-damping channel is determined by two parameters: the number of spectator qubits used ($m$) and their physical nature ($\gamma = T_1^{\mathrm{memo}}/T_1^{\mathrm{spec}}$). When the QCRB is saturated, the resulting entanglement fidelity is illustrated in Fig. 7 for various values of the spectator parameter $\gamma$.

The Effect of Nuisance Parameters
In this section, we discuss the effects of nuisance parameters on the lower bound in Theorem 2 for the AD code (i.e. Eq. (81)). We consider three different physical choices of a spectator qubit, which yield one of the following:
1. The presence of an additional constant magnetic field with noise parameter $\phi = \gamma_{\mathrm{spec}} B t$, where $\gamma_{\mathrm{spec}}$ here is the gyromagnetic ratio of the spectator qubit.
2. The presence of an additional pure dephasing noise with Kraus operators $P_1$ and $P_2$ and noise parameter $\lambda = 1 - \exp(-t/T_\varphi)$, which yields the dephasing time $T_2$ with an off-diagonal decay rate of $1/T_2 = 1/(2T_1) + 1/T_\varphi$ [86].
When no nuisance parameters are present, the quantum state of the spectator system is transformed to $M^{(1)}_{f(\theta)}(\psi)$, as given in Eq. (87). However, when a nuisance parameter is present, the output state of the spectator used for the quantum estimation limit of $\theta$ will be modified. In the above three cases, the spectator states carry an additional dependence on the nuisance parameter (e.g. $M_{\theta,q}(\psi) = q\,M^{(1)}\ldots$ in the third case). The quantum estimation limit of the parameter of interest $\theta$, in the presence of one of the nuisance parameters $\zeta \in \{\phi, \lambda, q\}$ above, is given by the partial QFI $I_{\theta|\zeta}$ via $\mathrm{Var}(\hat\theta) \geq 1/I_{\theta|\zeta}$, where $I_{\theta|\zeta} = I_{\theta,\theta} - I_{\theta,\zeta} I_{\zeta,\zeta}^{-1} I_{\zeta,\theta}$ and the terms on the right-hand side are the blocks of the QFIM (see Appendix E for more details). It is important to emphasize that the matrix element $I_{\theta,\theta}$ refers to the quantum estimation limit of $\theta$ when the nuisance parameter $\zeta$ is known. Therefore, we expect $I_{\theta,\theta}$ to coincide with the nuisance-free QFI of the spectator, which is easily verified numerically for the above three examples. However, a nuisance parameter $\zeta$, similar to the parameter of interest $\theta$, is by nature unknown. Hence, the quantum estimation limit would be reduced by $I_{\theta,\theta} - I_{\theta|\zeta}$ due to the presence of the unknown nuisance parameter $\zeta$, compared to when it is known.
Figure 7: The spectator system is taken to be a single qubit ($m = 1$) with varying values of the physical parameter $\gamma$ in Eq. (48). Both subfigures consider the channel-adapted approximate [4,1] code of the amplitude-damping channel [12]. (a) Entanglement fidelity difference between the cases of perfect and incomplete knowledge recovery protocols. (b) Comparison between the entanglement fidelities for perfect [12,79] and incomplete knowledge recovery protocols. In both figures, we assume the best-case scenario where the spectator system saturates the QCRB during parameter estimation.
This is a direct consequence of the fact that the optimal spectator input state $\psi = |1\rangle\langle 1|$ has no off-diagonal elements subject to dephasing.
3. For $\zeta = q$, the matrix element $I_{\theta,\theta}$ again takes its expected value. However, the partial QFI is computed to be $I_{\theta|q} = 0$.
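The effect of a single nuisance parameter on the estimation limit can be sketched with the standard Schur-complement form of the partial QFI, $I_{\theta|\zeta} = I_{\theta,\theta} - I_{\theta,\zeta}^2 / I_{\zeta,\zeta}$ for a $2 \times 2$ QFIM. The matrix entries used below are hypothetical numbers chosen only to show the three qualitative regimes; they are not computed from the spectator models of this section.

```python
def partial_qfi(i_tt, i_tz, i_zz):
    """Schur complement of a 2x2 QFIM [[i_tt, i_tz], [i_tz, i_zz]] with respect
    to the nuisance block i_zz (assumed strictly positive)."""
    if i_zz <= 0.0:
        raise ValueError("the nuisance block must be strictly positive")
    return i_tt - i_tz * i_tz / i_zz

# Hypothetical QFIM entries illustrating three regimes.
orthogonal = partial_qfi(4.0, 0.0, 2.0)   # informationally orthogonal nuisance
correlated = partial_qfi(4.0, 1.0, 2.0)   # partially correlated nuisance
degenerate = partial_qfi(4.0, 2.0, 1.0)   # theta and zeta indistinguishable

print(orthogonal, correlated, degenerate)
```

The first case leaves the estimation limit untouched, the second reduces it by $I_{\theta,\zeta}^2 / I_{\zeta,\zeta}$, and the third gives a vanishing partial QFI, which corresponds to a diverging bound on $\mathrm{Var}(\hat\theta)$.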
Recovery Bounds in The Multi-Cycle Scenario

The Multi-Cycle Case
In this article, we have considered single-cycle recovery, i.e. when the noisy channel $N_\theta$ is applied only once. However, extensions to the multi-cycle regime are also important for real-time applications. A thorough study of the multi-cycle case is beyond the scope of the current article. However, here we present some useful bounds to stimulate future discussions.
[Figure: ... protocol with incomplete knowledge. The input state of the quantum memory in both subfigures is given by $\rho$, whereas the input state of the spectator system in subfigure (b) is $\psi$. In the latter case, the state of the spectator is recycled back to $\psi$ after every recovery cycle, via a discarding and preparation channel. The final output states of the quantum memory follow the multi-cycle notation in Eq. (98).]
To start, consider a stroboscopically varying noise parameter $\theta$ that takes the value $\theta_n$ during the $n$-th recovery cycle of the multi-cycle protocol. Here, $\Delta t_R \ll \tau_\theta$ is the duration of a single recovery cycle, and $\tau_\theta$ is the characteristic time of the noise parameter $\theta$ (i.e. the expected time in which the value of $\theta$ will change appreciably, see Section 4.1). The corresponding real-time spectator estimates of $\theta$ for these $n$ cycles are $\hat\theta_1, \ldots, \hat\theta_n$. We introduce the shorthand notation of Eq. (98) for multi-cycle recovery protocols, which is an $n$-cycle concatenation of the noisy channel (with changing noise parameter values in each timestep) and the corresponding best-guess recovery.
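The bookkeeping of an $n$-cycle concatenation with a drifting parameter can be sketched on a single qubit. The example below is deliberately simplified and is an assumption-laden illustration: it composes only the amplitude-damping noise of each cycle (the recovery maps are replaced by the identity), and it evaluates the entanglement fidelity from the Kraus operators via $F_e = \sum_k |\mathrm{Tr}\,K_k|^2 / d^2$. All function names and parameter values are hypothetical.

```python
import math

def ad_kraus(theta):
    """Kraus operators of the single-qubit amplitude-damping channel."""
    return [
        [[1.0, 0.0], [0.0, math.sqrt(1.0 - theta)]],
        [[0.0, math.sqrt(theta)], [0.0, 0.0]],
    ]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def compose(later, earlier):
    """Kraus operators of the channel 'later after earlier'."""
    return [matmul(b, a) for b in later for a in earlier]

def entanglement_fidelity(kraus):
    """F_e = sum_k |Tr K_k|^2 / d^2 for a qubit channel (d = 2)."""
    return sum(abs(k[0][0] + k[1][1]) ** 2 for k in kraus) / 4.0

def multi_cycle_channel(thetas):
    """Concatenate one noise cycle per entry of thetas (identity recovery assumed)."""
    channel = [[[1.0, 0.0], [0.0, 1.0]]]
    for theta in thetas:
        channel = compose(ad_kraus(theta), channel)
    return channel

drift = [0.10, 0.20]  # hypothetical values theta_1, theta_2
print(entanglement_fidelity(multi_cycle_channel(drift)))
```

Without recovery, two damping cycles compose into a single damping channel with survival probability $(1-\theta_1)(1-\theta_2)$, so the printed value equals $(1 + \sqrt{(1-\theta_1)(1-\theta_2)})^2/4$. Inserting recovery maps between cycles would follow the same `compose` pattern, at the cost of a Kraus set that grows with $n$.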

Recurrence Inequalities for Composite Average Channel Fidelity
So far, we have found a lower bound on the desired distinguishability measure in terms of the composite channel entanglement fidelity for the single-cycle case. To extend to the multi-cycle scenario, one option is to consider the entanglement fidelity of the full $n$-cycle concatenation directly. However, a more insightful approach is to express this entanglement fidelity in terms of individual cycle fidelities. Specifically, this is accomplished by using the entanglement fidelities of the first $n-1$ cycles and of the $n$-th cycle. Bounding composite channel fidelities using individual channel fidelities has been studied previously, e.g. in [43]. The following lemma is largely taken from [43], using the $\chi$-matrix representation of quantum dynamics (see Appendix I for a self-contained review).

Lemma 3. ([43]) Given the $\chi$-matrix elements $\chi^{Q}_{00}$, $\chi^{S}_{00}$ of the channels $Q$, $S$, respectively, the composite channel $S \circ Q$ $\chi$-matrix element $\chi^{S \circ Q}_{00}$ is bounded from above (and hence the corresponding error angle $\delta_{S \circ Q}$ is bounded from below). The saturation condition of this inequality is stated in terms of $\phi^{Q}_i$ and $\phi^{S}_j$, which are defined in Appendix J in terms of the Kraus operators of $Q$, $S$ and the $d^2$ matrix basis elements of $L(H)$.
For completeness, the proof of this lemma is found in Appendix J.
Let us identify $Q$ with the first $n-1$ recovery cycles and $S$ with the $n$-th cycle. We further use the notation $\chi^{1\to n}_{00}$, $\chi^{1\to(n-1)}_{00}$, $\chi^{n}_{00}$, $\delta_{1\to n}$, $\delta_{1\to(n-1)}$, and $\delta_n$ to replace $\chi^{S\circ Q}_{00}$, $\chi^{Q}_{00}$, $\chi^{S}_{00}$, $\delta_{S\circ Q}$, $\delta_Q$, and $\delta_S$, respectively. We also use the definition in Eq. (14) to write similar notations for the entanglement fidelities $F_e^{1\to n}$, $F_e^{1\to(n-1)}$, and $F_e^{n}$, in terms of $\delta_{1\to n}$, $\delta_{1\to(n-1)}$, and $\delta_n$. Therefore, we can reframe Lemma 3 of [43] as a set of recurrence inequalities in the context of spectator-based recovery. The necessary and sufficient conditions for the saturation of these inequalities are identical to those of Lemma 3.

Remark 4.
As noted in [43], the entanglement fidelity of a composite channel exhibits "constructive" and "destructive interference" with respect to the individual channel entanglement fidelities. In our case, we view the $n$-cycle recovery as a composite channel, where the individual channels are the $(n-1)$-cycle recovery and the $n$-th timestep recovery. Therefore, the same phenomenon of constructive and destructive interference applies here. This is purely a multi-cycle recovery phenomenon that is not present in the single-cycle recovery case, which has been the main focus of modern literature in QEC.

Contribution of The Spectator System
To identify the contribution of the lack of complete knowledge of $\theta$ to the recurrence inequalities, let us consider the error angle $\delta_n$ of the $n$-th cycle.
Theorem 3. Given the multi-cycle entanglement fidelity $F_e^{1\to(n-1)}$ from the previous $n-1$ cycles, the contribution of the spectator system to the upper bound of the total $n$-cycle entanglement fidelity $F_e^{1\to n}$ is given by an expression whose proof is found in Appendix K.

Application to [4,1] Code of The Amplitude-Damping Channel
The contribution of the spectator system in multi-cycle bounds can also be computed explicitly for the [4,1] code of the AD channel. For a fixed value of the entanglement fidelity $F_e^{1\to(n-1)}$ (or equivalently, $\delta_{1\to(n-1)}$) at the $(n-1)$-th step, we can plot the total upper bound in the case of both complete and incomplete knowledge. The simplest case, where the spectator system's parameters are $\gamma = 1$ and $m = 1$, is shown in Fig. 9.
Note that, although we expect the incomplete knowledge about the noise parameter to deteriorate the allowed values of the entanglement fidelity (as we have shown for single-cycle QEC of the AD channel in Fig. 7(b)), in the multi-cycle scenario, this can play to our advantage due to the coherence between the accumulated error during the prior (n − 1) cycles and the error due to the limited knowledge about the noise parameter at the n-th cycle (see Remark 4).This observation further supports the potential superiority of spectator-based recovery techniques in maintaining real-time quantum memories.

Comparison With Previous Literature

Relation to Quantum Information-Theoretic Protocols
In this article, we focused on the diamond distance due to its operational meaning in terms of a quantum channel discrimination task. In this context, Lemma 2 could be interpreted as a fundamental bound on the success probability of such a task. A similar bound has already been derived by Pirandola et al. in [41] using port-based teleportation [87]. In fact, the bound in [41] is valid for general adaptive protocols.
Furthermore, as current techniques of quantum control have matured, the applicability of Lemma 4 is not confined to multiple recovery rounds, as demonstrated experimentally in e.g. [44]. It can also be applied in various quantum information-theoretic tasks where multiple calls to the noisy channel and adaptive feedback are allowed, such as quantum channel discrimination with adaptive feedback [41,88,89].

Relation to Robustness of Channel-Adapted QEC
Our approach to recovery with incomplete knowledge is closely related to the robustness of channel-adapted QEC codes studied previously in the literature [57][58][59][60]. To elaborate, since QEC codes are designed to correct the most likely errors, an important question to ask is: how resilient (robust) is the designed QEC code with respect to some arbitrary mixing with the next-most likely errors? The authors of [58] have framed the robustness problem such that it applies both to Pauli and non-Pauli channels, as follows. One first finds the optimum recovery $R$ of a channel $N$ (the most likely noise) by maximizing the entanglement fidelity of $R \circ N$, and then one mixes the original channel $N$ with some other channel $N'$ (the next-most likely noise) by taking their convex combination, i.e. $N_\mu := (1 - \mu) N + \mu N'$ for some mixing parameter $\mu \in [0, 1]$. Then, the robustness of the recovery $R$ with respect to $\mu$ is found by considering the entanglement fidelity of $R \circ N_\mu$ and observing whether it has major variations as a function of the mixing parameter $\mu$. This setup shares some similarities with our approach; however, it has a different quantity of interest, namely the sensitivity of the entanglement fidelity with respect to changes in the mixing parameter, quantified as the first derivative with respect to $\mu$ of an entanglement fidelity involving $R_\mu$, the optimum recovery of the mixture $N_\mu$ (see also Appendix F.2 for bounds on a similar quantity). This is to be contrasted with the quantity of interest in this article, written in the parameter notation $\mu$. Here, $\mu$ plays the role of the uncertainty $\nu \equiv \hat\theta - \theta$ in the noise parameter $\theta$, and therefore it has a different interpretation. Namely, there is no next-most likely noise in this description. Instead, $\mu$ is the random variable describing the uncertainty in the environment noise parameter and has a finite variance, by the QCRB. The possibility of including the channel uncertainty as a probability distribution $p(\mu)$ in the optimization problem of entanglement fidelity has been discussed by Fletcher in [90]. The question then, as mentioned in [58], is how to pick a physical probability distribution $p(\mu)$. In our picture (spectator-based QEC), this question has a relatively simple answer, as one should always pick the probability distribution that maximizes the Shannon entropy with a fixed expectation and variance (greater than or equal to the inverse of the quantum Fisher information). Such a probability distribution is called the "truncated normal distribution".
Figure 10: We consider the "worst case" spectator parameters ($\gamma = 1$, $m = 1$). Shown are the performances of the well-known approximate QEC code in Leung et al. [13], its channel-adapted version by Fletcher et al. [12], its SDP optimized version by Fletcher et al. [14], its stabilizer-based version [12,81], and the incomplete knowledge extension of the channel-adapted QEC in [12]. Here, the difference between the "channel-adapted" and "incomplete knowledge" entanglement fidelities showcases the fundamental metrological cost of operating a real-time quantum memory. All other ($\gamma \geq 1$, $m \geq 1$) spectator-based recoveries lie above the "incomplete knowledge" graph.

Relation to [4,1] AD Code Literature
As amplitude-damping (qubit decoherence) is one of the most common noises in quantum systems, developing QEC codes for this particular noise has been a major focus of QEC literature since its inception in 1995. The simplest of such QEC codes is the approximate [4,1] code [13]. Since then, QEC methods for the AD noise have been developing in sophistication by using various new techniques, such as channel adaptation [12], stabilizer formalism [81], and semidefinite programming [14]. These techniques have been steadily improving upon the entanglement fidelity of the original [4,1] code in [13]. However, if we want to implement these techniques for real-time quantum memories (where the decoherence parameter is slowly varying in time), how much of the improvement upon [13] obtained in the last two decades are we likely to retain? To answer this question, we compare the performance of the [4,1] code in the incomplete knowledge scenario with previous literature. As spectator systems are characterized by their physical nature $\gamma \geq 1$ and the number of independent subsystems $m \in \mathbb{N}^{+}$ (see Eq. (45)), the answer will vary from one physical implementation to another. Here, we consider the above question in the case ($\gamma = 1$, $m = 1$). The results of this comparison are summarized in Fig. 10 and the table below.

Previous literature          $F_e$ to $O(\theta^3)$ order
Leung et al. [13]            $1 - 2.75\theta^2$
Stabilizer-Based [81,90]     $1 - 2\theta^2$
Channel-Adapted [12,79]      1 ...

Table 1: Comparison between the entanglement fidelities of the [4,1] code for small noise parameter value $\theta$, for different recovery protocols (here SDP stands for "semi-definite programming"). Note that in the incomplete knowledge scenario, the leading error term in the entanglement fidelity of recovery is linear in $\theta$, as opposed to quadratic, which is the optimal result when the noise parameter $\theta$ is known a priori.
We observe that, due to incomplete knowledge of the noise parameter, the [4,1] code of the AD channel performs suboptimally compared to [13] for noise parameter values below a certain threshold, $\theta \leq 0.17$. However, beyond that point, the improvements introduced by channel-adapted recoveries and semi-definite programming techniques are preserved, as they still outperform [13], even in the presence of incomplete knowledge about $\theta$. Furthermore, the range of values of $\theta \in [0, 1]$ for which this outperformance is preserved grows with larger $\gamma$ and/or $m$ (see Fig. 7(a)).
Let us consider one final observation. We noted in Fig. 7 that different values of the spectator parameter $\gamma$ yield different regions of $\theta$ where channel-adapted and semi-definite programming techniques in QEC maintain their improvements upon the approximate [4,1] code [13], provided that a spectator system is implemented in the incomplete-knowledge recovery protocol. One might observe that, since the value of $\gamma$ in Eq. (48) generally depends on the couplings of the spectator and memory systems with the environment, the only way to change $\gamma$ is to change the physical implementation of at least one of these systems. However, recent quantum control techniques, such as Hamiltonian amplification [91], allow for the tuning of the coupling strengths between an environment and any continuous-variable quantum system with a quadratic coupling Hamiltonian. Therefore, provided that the implementation of either the spectator or memory system has continuous degrees of freedom [92], the Hamiltonian amplification technique yields a practical advantage for spectator-based recovery architectures, as the resulting entanglement fidelities can be manipulated in experiments for any desired region of the noise parameter $\theta$, as seen in Fig. 7(a).

Relation to Time-Dependent QEC
In [93], the author suggests that knowledge of the error rates for Pauli channels is not the most useful side information in QEC. Indeed, as mentioned previously (see Remark 2), the assumption that the optimal recovery channel (defined by Eq. (40)) depends on the environment noise parameter does not hold for Pauli channels. Nevertheless, it is important to note that optimization-based techniques of QEC for Pauli channels do generally benefit from the knowledge of the noise parameter. This is especially relevant when error identification from syndrome measurements is not unique (e.g. in surface codes [94]). In that case, one can only construct (suboptimal) decoders, rather than the optimal recovery map in Eq. (40). This generally yields decoders that depend on the noise parameters, even for Pauli channels. For example, various types of decoders exist for both repetition codes [95,96] and surface codes [97], where in the presence of a drifting noise parameter, one can design an adaptive decoder that tracks this drift without interrupting the QEC protocol. Therefore, the results of this article could be expanded to include adaptive decoders for repetition and surface codes, rather than the optimal recovery map defined in Eq. (40). Finally, it is worth mentioning that other approaches to adaptation have been pursued in the QEC literature, e.g. in Refs. [98,99].

Conclusion and Open Questions
In this article, I consider the problem of building a real-time (drift-adapting) quantum memory and present it as a spectator-based recovery protocol. To counter noise drift, the spectator system performs a real-time parameter estimation (generally in the presence of nuisance parameters) and feeds forward this classical side information to the "best-guess" recovery map. To quantify the single-cycle information-theoretic cost of adaptation in real-time quantum memories, I compute a lower bound for the diamond distance between the optimal (inaccessible) and best-guess (accessible) recovery protocols. This approach is generalized in Appendix A to other relevant distinguishability measures between two arbitrary quantum channels. For slowly drifting noise parameters, I show that a metrological bound exists, determined by the quantum Fisher information of the spectator dynamics. This bound is demonstrated for the [4,1] code of the amplitude-damping channel, and the effects of various physical choices of spectator qubits and nuisance parameters are discussed. Finally, for multi-cycle recovery, I recall a theorem in [43] and use it to derive an upper bound on the fidelity of multi-cycle recovery in terms of recurrence inequalities. The contribution of the lack of knowledge of the noise parameters (i.e. noise-drift adaptation) is also derived. This is also showcased for the [4,1] code of the amplitude-damping channel, and regions of outperformance in the spectator-based recovery protocols are highlighted. The advantage of spectator-based recovery over non-adaptive recovery protocols, even in the perfect knowledge scenario, is due to the coherence of errors from different cycles as well as the imperfect knowledge (noise estimation) error. This phenomenon is exclusive to multi-cycle QEC.
The results mentioned above are relevant for various research communities, such as quantum error correction, quantum communication, quantum information, quantum control, and quantum computing. To elaborate, the existence of lower bounds on any channel recovery (Eqs. (53) and (54), or more generally Theorem 4 in Appendix A) could be valuable in testing the performance of various optimization-based techniques in QEC to determine if optimal performance is reached. As discussed in Section 7, these bounds may also be of broader interest in various domains of quantum information, as they hold for any generalized distinguishability measure and between any two quantum channels (see Appendix A). Multi-cycle bounds (Lemma 4) might also be useful in adaptive quantum information-theoretic protocols, where many calls to the noisy channel are made. The analysis made for the [4,1] code of the amplitude-damping channel sheds light on what to expect when implementing such QEC codes in real-time quantum memories [23,24], while also providing an exciting avenue in terms of outperformance in the incomplete knowledge scenario for multi-cycle recovery, which is quickly starting to become a reality [44]. Finally, implementing novel quantum control techniques, such as Hamiltonian amplification for continuous quantum systems [91], might prove useful in controlling the coupling strength of the spectator system. Therefore, one can experimentally optimize over the selection of all possible spectator system parameters without physically changing the spectator system.
Many questions are left open:
• Extension of the information-theoretic and metrological lower bounds to Pauli channels with suboptimal decoders. This is relevant for current surface codes, as the optimal recovery map in Eq. (40) is inaccessible, due to the probabilistic nature of error identification from syndrome measurements for high error rates [94].
• Incorporation of important theoretical techniques of the maximum overlap problem in quantum information theory, e.g. the two-sided bounds by Tyson [100,101] using directional iterates. This approach has previously led to various important results [49,102]. An interesting proposal (potentially also in quantum metrology) would be to apply the directional iterate technique to the semi-inner product used to define the QFIM, given in Eq. (196) of Appendix E.
• A deeper analysis of the multi-cycle regime is required. This includes, but is not limited to, the study of optimal conditions for achieving the coherent error cancellation, as well as incorporating techniques from asymptotic quantum information theory to gain further insight into the multi-cycle case.
• Continuous (dynamical) recovery using Petz recovery maps [56] could also be considered, as well as other adaptive approaches implementing Petz recovery maps [103,104]. This can potentially extend the temporal range of applicability of spectator-based recovery protocols to faster-varying noise. Another interesting dynamical model of real-time quantum memory could be constructed from the open system theory of two subsystems (memory and spectator) with slow and fast dynamics, relative to the environment noise. This has been studied previously in the context of adiabatic elimination in bipartite open quantum systems [105].
• Considerations of spatial variability of the noise parameter are also needed for scalability of the spectator-based recovery protocols [25,106]. Generalization beyond the independent noise model in Eq. (45) is also of relevance.
• Finally, one would be remiss not to consider the large literature on approximate recoverability of quantum states, see e.g. [50,51,107-109] and references within. It would be interesting to see whether incomplete knowledge recovery protocols would benefit from a similar approach.
A Lower-Bounding Generalized Distinguishability Measures Using Entanglement Fidelity

A.1 Generalized Distinguishability and Distance Measures
To quantify the success of a recovery protocol (such as QEC), we need to introduce the concepts of generalized distinguishability and distance measures between two states as well as between two channels [110][111][112][113][114][115].
We say that D : D(H) × L_+(H) → R^1 is a generalized distinguishability measure between two states if it satisfies the data-processing inequality (DPI), i.e. for an arbitrary CPTP map Q and all ρ, σ ∈ D(H), we have
$$ D(\mathcal{Q}(\rho), \mathcal{Q}(\sigma)) \le D(\rho, \sigma) . $$
An important consequence of the DPI is the property of isometric invariance. Namely, for any isometry V, the following holds [115]
$$ D(\mathcal{V}(\rho), \mathcal{V}(\sigma)) = D(\rho, \sigma) , \tag{110} $$
where $\mathcal{V}(\cdot) := V(\cdot)V^\dagger$. Independently, we say that D : D(H) × D(H) → R^1_+ is a generalized distance measure between two states if it satisfies the following three properties for all ρ, σ, τ ∈ D(H):
1. Positivity and faithfulness: $D(\rho, \sigma) \ge 0$, where the equality holds iff ρ = σ.
2. Symmetry: $D(\rho, \sigma) = D(\sigma, \rho)$.
3. Triangle inequality: $D(\rho, \sigma) \le D(\rho, \tau) + D(\tau, \sigma)$.
1 Some previous papers have used the notation D(ρ∥σ), rather than D(ρ, σ), to indicate the generalized distinguishability measure between ρ and σ. Here, we use the latter notation to emphasize the role that D also plays as a distance measure in deriving standard upper bounds in the context of QEC. Please see Appendix C for more details.
A common requirement for fault-tolerant QEC and quantum computing is the so-called "chaining property" [110,112].However, this property of generalized distinguishability/distance measures is derivative from more elementary properties, such as DPI and the triangle inequality (see Appendix B for a short discussion).
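Finally, we say that the map D : D(H) × L_+(H) → R^1 satisfies the joint convexity property if for any two ensembles {p_X(x), ρ_x}_{x∈X} and {p_X(x), σ_x}_{x∈X}, where p_X is a probability distribution function of the random variable X over the set X, we have
$$ D\Big(\sum_{x\in\mathcal{X}} p_X(x)\,\rho_x , \sum_{x\in\mathcal{X}} p_X(x)\,\sigma_x\Big) \le \sum_{x\in\mathcal{X}} p_X(x)\, D(\rho_x, \sigma_x) . $$
For fidelity-based distinguishability measures, such as the Bures and Sine distances, this directly follows from the double concavity of the fidelity function (see e.g. [115]).
Alternatively, it is well-known that the joint convexity property can be derived from the DPI (with respect to the partial trace channel) if we further assume that D satisfies the direct sum property for classical-quantum states [115], i.e.
$$ D\Big(\sum_{x\in\mathcal{X}} p_X(x)\,|x\rangle\langle x| \otimes \rho_x , \sum_{x\in\mathcal{X}} p_X(x)\,|x\rangle\langle x| \otimes \sigma_x\Big) = \sum_{x\in\mathcal{X}} p_X(x)\, D(\rho_x, \sigma_x) . $$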
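For a summary of various distinguishability and/or distance measures, as well as which properties they satisfy, please see Table 2. All the above properties are satisfied by [112,115]
1. Trace Distance: D_Tr(ρ, σ) = (1/2)∥ρ − σ∥_1.
2. Bures Distance.
3. Sine Distance: $D_S(\rho, \sigma) = \sqrt{1 - F(\rho, \sigma)}$,
where $F(\rho, \sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$ is the fidelity function. Using the generalized distinguishability (distance) measures between two states, we define the generalized distinguishability (distance) measures between two channels Q_{A→B} and S_{A→B}, as follows
$$ D(\mathcal{Q}, \mathcal{S}) := \sup_{\rho_{RA}} D\big((\mathrm{id}_R \otimes \mathcal{Q}_{A\to B})(\rho_{RA}), (\mathrm{id}_R \otimes \mathcal{S}_{A\to B})(\rho_{RA})\big) , \tag{115} $$
where ρ ∈ D(H_A ⊗ H_R), for arbitrary Hilbert space dimensions of the reference system R. By using joint convexity and the Schmidt decomposition of pure states, it can be shown that the maximization need only be taken over pure states ψ_RA, with the reference system R having the same Hilbert space dimension as A [115], i.e.
$$ D(\mathcal{Q}, \mathcal{S}) = \max_{\psi_{RA}} D\big((\mathrm{id}_R \otimes \mathcal{Q}_{A\to B})(\psi_{RA}), (\mathrm{id}_R \otimes \mathcal{S}_{A\to B})(\psi_{RA})\big) . $$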
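Finally, it is important to note that the joint convexity property of generalized distinguishability measures for states implies the same property for channels. This is seen by considering the two channels $\mathcal{Q} = \sum_x p_X(x)\,\mathcal{Q}_x$ and $\mathcal{S} = \sum_x p_X(x)\,\mathcal{S}_x$, and then applying the joint convexity property for states, as follows
$$ D\big((\mathrm{id}_R \otimes \mathcal{Q})(\psi_{RA}), (\mathrm{id}_R \otimes \mathcal{S})(\psi_{RA})\big) \le \sum_x p_X(x)\, D\big((\mathrm{id}_R \otimes \mathcal{Q}_x)(\psi_{RA}), (\mathrm{id}_R \otimes \mathcal{S}_x)(\psi_{RA})\big) \le \sum_x p_X(x)\, D(\mathcal{Q}_x, \mathcal{S}_x) . $$
Consequently, maximizing over ψ_RA, we have the joint convexity property
$$ D\Big(\sum_x p_X(x)\,\mathcal{Q}_x , \sum_x p_X(x)\,\mathcal{S}_x\Big) \le \sum_x p_X(x)\, D(\mathcal{Q}_x, \mathcal{S}_x) . \tag{122} $$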
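As a small numerical illustration of the measures in this section, the sketch below evaluates the trace distance, the fidelity, and the sine distance for a pair of qubit states and checks the DPI under a single-qubit amplitude-damping channel. The squared-fidelity convention F = ∥√ρ√σ∥_1^2, the helper names, and the damping value 0.3 are assumptions made for this example only.

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(rho)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_distance(rho, sigma):
    """D_Tr(rho, sigma) = (1/2) * ||rho - sigma||_1 for Hermitian inputs."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def fidelity(rho, sigma):
    """Assumed convention: F = ||sqrt(rho) sqrt(sigma)||_1^2."""
    s = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(np.sum(s) ** 2)

def sine_distance(rho, sigma):
    """Sine distance sqrt(1 - F), clipped against round-off."""
    return float(np.sqrt(max(0.0, 1.0 - fidelity(rho, sigma))))

def amplitude_damping(rho, theta):
    """Single-qubit amplitude-damping channel with damping probability theta."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)
    return K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T

# Two pure qubit states: |0><0| and |+><+|.
rho = np.array([[1, 0], [0, 0]], dtype=complex)
sigma = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)

d_before = trace_distance(rho, sigma)
s_before = sine_distance(rho, sigma)
d_after = trace_distance(amplitude_damping(rho, 0.3), amplitude_damping(sigma, 0.3))
s_after = sine_distance(amplitude_damping(rho, 0.3), amplitude_damping(sigma, 0.3))
```

Both distances can only shrink (or stay equal) after the channel is applied, which is the DPI for this particular pair of states and this particular channel.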

A.2 Unitary t-designs
We call a function P : U(d) → C acting on any unitary U in U(d) polynomial of degree t if its dependence on the 2d^2 real entries of U is a polynomial of degree at most t in each of its entries. Given a finite set of unitaries {U(x)}_{x∈X} in U(d), we say that they form a unitary t-design [54,117] if the uniform Haar average over U(d) of any polynomial P of degree t is computed using the uniform average over the finite set {U(x)}_{x∈X} only, as follows
$$ \int_{U(d)} dU\, P(U) = \frac{1}{|\mathcal{X}|} \sum_{x\in\mathcal{X}} P(U(x)) . $$
It has been shown that for unitary 1- and 2-designs, the above averaging condition can be rewritten in a different form. We say that {U(x)}_{x∈X} forms a unitary 1-design in U(d) if
$$ \frac{1}{|\mathcal{X}|} \sum_{x\in\mathcal{X}} U(x)\,\rho\, U(x)^\dagger = \int_{U(d)} dU\, U \rho\, U^\dagger = \pi \tag{124} $$
for all ρ ∈ D(H), where π = I/d is the maximally mixed state. An example of unitary 1-designs is given by the Pauli group. Further, we say {U(x)}_{x∈X} forms a unitary 2-design in U(d) if we have the following conditions for twirling of states or channels [54,118]
$$ \frac{1}{|\mathcal{X}|} \sum_{x\in\mathcal{X}} \big(U(x) \otimes U(x)\big)\,\rho\,\big(U(x) \otimes U(x)\big)^\dagger = \int_{U(d)} dU\, (U \otimes U)\,\rho\,(U \otimes U)^\dagger $$
for all ρ ∈ D(H ⊗ H), or equivalently
$$ \frac{1}{|\mathcal{X}|} \sum_{x\in\mathcal{X}} U(x)^\dagger\, \mathcal{Q}\big(U(x)\,\rho\, U(x)^\dagger\big)\, U(x) = \int_{U(d)} dU\, U^\dagger\, \mathcal{Q}\big(U \rho\, U^\dagger\big)\, U \tag{126} $$
for all ρ ∈ D(H) and quantum channels Q. An example of unitary 2-designs is given by the Clifford group [118,119].
Remark 5. Note that, if {U(x)}_{x∈X} is a unitary t-design, then it also holds that {U(x)}_{x∈X} is a unitary (t − 1)-design. For example, the Clifford group forms a unitary 3-design, and hence also a unitary 2-design.
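The statement that the Pauli group is a unitary 1-design can be checked directly for a single qubit: averaging a state over conjugation by {I, X, Y, Z} should return the maximally mixed state π = I/2. The following is a minimal sketch of that check; the random test state and the seed are arbitrary choices for the example.

```python
import numpy as np

# Single-qubit Pauli matrices (phases are irrelevant under conjugation).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, X, Y, Z]

def pauli_twirl_state(rho):
    """Uniform average of P rho P^dagger over the four Pauli matrices."""
    return sum(P @ rho @ P.conj().T for P in paulis) / len(paulis)

# A random full-rank density matrix used as a test input.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = A @ A.conj().T
rho /= np.trace(rho).real

twirled = pauli_twirl_state(rho)
is_maximally_mixed = np.allclose(twirled, I2 / 2)
```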

A.3 Channel Twirlings
Generally, channel twirlings can be defined with respect to both discrete and continuous sets of unitaries. In its simplest form, for a set of unitaries {U_A(x), V_B(x)}_{x∈X} and a probability distribution function p_X defined over a finite set X, the twirling of a quantum channel Q_{A→B} (which we denote by a tilde symbol, Q̃_{A→B}) is defined as
$$ \tilde{\mathcal{Q}}_{A\to B} := \sum_{x\in\mathcal{X}} p_X(x)\, \mathcal{V}_x^\dagger \circ \mathcal{Q}_{A\to B} \circ \mathcal{U}_x , \tag{127} $$
where we have used the notation $\mathcal{U}_x(\cdot) := U_A(x)(\cdot)U_A(x)^\dagger$ and $\mathcal{V}_x(\cdot) := V_B(x)(\cdot)V_B(x)^\dagger$ for the unitary channels, for all x ∈ X. Although most of the results presented in this article are valid for any finite set X, the case where it forms a group and {U_A(x), V_B(x)}_{x∈X} are two unitary representations of it is of great interest [114] (see Remark 7).
Twirlings with continuous sets of unitaries have also been studied extensively in the literature. If we have some probability distribution (measure) µ(U) over the set of d×d unitary matrices U(d), then the continuous twirling of the channel Q_A is defined to be
$$ \tilde{\mathcal{Q}}_A := \int_{U(d)} d\mu(U)\, \mathcal{U}^\dagger \circ \mathcal{Q}_A \circ \mathcal{U} . \tag{128} $$
Twirling of quantum channels plays an important role in QEC and fault-tolerant quantum computing [54,119-124]. Examples include: (1) similarities between QEC codes for channels and their twirled versions [121], (2) the simulability of twirled quantum channels on a quantum computer, due to the Gottesman-Knill theorem [125], (3) the fact that channels and their twirled versions share the same average and entanglement fidelities [47], (4) various twirlings (Pauli, Clifford, and uniform Haar) rendering channels depolarizing [47,54,119,126], and finally, (5) their close connection to unitary t-designs. Due to its importance, I recall some relevant properties of unitary t-designs in Appendix A.2 (also see [118,122] for a brief review).
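To make the discrete twirl concrete, the sketch below Pauli-twirls a single-qubit amplitude-damping channel at the level of Kraus operators and checks two of the facts listed above: the entanglement fidelity is unchanged, and the twirled channel has a diagonal Pauli transfer matrix (i.e. it is a Pauli channel). The choice U_A(x) = V_B(x) equal to the Pauli matrices with uniform weights, and the value θ = 0.2, are assumptions for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, X, Y, Z]

def amplitude_damping_kraus(theta):
    """Kraus operators of the amplitude-damping channel with damping probability theta."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)
    return [K0, K1]

def pauli_twirl_kraus(kraus):
    """Kraus operators of sum_x (1/4) P_x^dagger o Q o P_x; the weight 1/4 enters as sqrt(1/4)."""
    return [0.5 * P.conj().T @ K @ P for P in paulis for K in kraus]

def entanglement_fidelity(kraus, d=2):
    """F_e = (1/d^2) * sum_i |Tr K_i|^2 for a channel with Kraus operators K_i."""
    return float(sum(abs(np.trace(K)) ** 2 for K in kraus) / d**2)

def pauli_transfer_matrix(kraus):
    """R[i, j] = (1/2) Tr[P_i Q(P_j)]."""
    R = np.zeros((4, 4))
    for j, Pj in enumerate(paulis):
        out = sum(K @ Pj @ K.conj().T for K in kraus)
        for i, Pi in enumerate(paulis):
            R[i, j] = 0.5 * np.real(np.trace(Pi @ out))
    return R

theta = 0.2
kraus = amplitude_damping_kraus(theta)
twirled = pauli_twirl_kraus(kraus)

fe_original = entanglement_fidelity(kraus)
fe_twirled = entanglement_fidelity(twirled)
ptm_twirled = pauli_transfer_matrix(twirled)
```

The original channel is non-unital, so its Pauli transfer matrix has an off-diagonal entry; the twirl removes it while leaving F_e untouched.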

A.4 Lower-Bounding Generalized Distinguishability Measures Using Entanglement Fidelity
We start this section by showing a simple property that all generalized distinguishability measures satisfy with respect to channel twirling if the joint convexity property (or equivalently, the direct sum property) is further assumed.
Lemma 5. Assume that we are given two CPTP maps Q_{A→B} and S_{A→B}, a set of unitaries {U_A(x), V_B(x)}_{x∈X}, and a probability distribution function p_X defined over the finite set X. If the generalized distinguishability measure D satisfies the joint convexity property, then D(Q, S) is lower bounded by the generalized distinguishability measure between the corresponding twirled channels Q̃_{A→B} and S̃_{A→B} with respect to the given weighted set of unitaries above, as follows
$$ D(\mathcal{Q}, \mathcal{S}) \ge D(\tilde{\mathcal{Q}}, \tilde{\mathcal{S}}) , \tag{129} $$
where the lower bound is saturated iff the joint convexity of the generalized distinguishability measure between the two quantum channels is saturated with respect to the above set of weighted unitaries.
Proof. Consider the isometric invariance property of D(Q, S), namely for any unitary channel $\mathcal{U} := U(\cdot)U^\dagger$, where U ∈ U(d), we have
$$ D(\mathcal{U} \circ \mathcal{Q}, \mathcal{U} \circ \mathcal{S}) = D(\mathcal{Q}, \mathcal{S}) = D(\mathcal{Q} \circ \mathcal{U}, \mathcal{S} \circ \mathcal{U}) , $$
where the first equality follows from Eq. (110) and the second equality follows from the definition in Eq. (115). This implies that for all U_A ∈ U(d_A) and for all V_B ∈ U(d_B),
$$ D(\mathcal{V}^\dagger \circ \mathcal{Q} \circ \mathcal{U}, \mathcal{V}^\dagger \circ \mathcal{S} \circ \mathcal{U}) = D(\mathcal{Q}, \mathcal{S}) . $$
Consequently, by considering the generalized distinguishability measure D(Q̃, S̃) between the twirled channels, we arrive at
$$ D(\tilde{\mathcal{Q}}, \tilde{\mathcal{S}}) = D\Big(\sum_x p_X(x)\, \mathcal{V}_x^\dagger \circ \mathcal{Q} \circ \mathcal{U}_x , \sum_x p_X(x)\, \mathcal{V}_x^\dagger \circ \mathcal{S} \circ \mathcal{U}_x\Big) \le \sum_x p_X(x)\, D(\mathcal{V}_x^\dagger \circ \mathcal{Q} \circ \mathcal{U}_x , \mathcal{V}_x^\dagger \circ \mathcal{S} \circ \mathcal{U}_x) = D(\mathcal{Q}, \mathcal{S}) , $$
where the inequality follows from Eq. (122). ■
Remark 6. This lemma can be viewed as a special case of a more general result for quantum supermaps. To elaborate, we recall that a supermap (a linear map from one quantum channel to another) can always be expressed as pre- and post-processing maps concatenated with the input quantum channel, and assisted by a memory [127]. Then, Lemma 5 follows from applying the data-processing inequality for generalized distinguishability measures between two quantum channels [55] with respect to channel twirling, which is a valid quantum supermap.

Remark 7. In [114], the authors have shown that for any two covariant channels F_{A→B} and G_{A→B} with respect to {U_A(x), V_B(x)}_{x∈X} (namely that V_x ∘ F = F ∘ U_x for all x ∈ X, and similarly for G), the generalized distinguishability measure D(F, G) can be found by maximizing only over symmetric states ϕ_RA, defined as However, since the twirlings F ≡ Q̃ and G ≡ S̃ in Lemma 5 are trivially covariant with respect to the unitary representations {U_A(x), V_B(x)}_{x∈X} of the finite group X, this implies that the lower bound in Eq. (129) need only be computed for such symmetric states. Furthermore, if {U_A(x)}_{x∈X} is a unitary 1-design (i.e. it is an irreducible representation of the group X of degree d_A), then, using the property Eq. (124) of unitary 1-designs, the maximization is found by computing the generalized distinguishability measure exactly for the maximally entangled state
So far, we have shown that the generalized distinguishability measure between Q_{A→B} and S_{A→B} is lower bounded by the corresponding distinguishability measure for arbitrary discrete twirlings of these channels. We now show that a similar lower bound can be derived for the uniform Haar twirling. But first, we recall the following important result.
Lemma 6. ([47]) Given a CPTP map Q_{A→A} and for all ρ ∈ D(H_A), the continuous twirling where the depolarizing parameter p_Q is given by the average fidelity of Q, as follows
The proof of Eq. (139) is shown in [47,126] for some parameter value p_Q. Eq. (140) is a direct consequence of the fact that the uniform Haar twirled channel Q̃ = ∫_{U(d)} dU U† ∘ Q ∘ U has the same average fidelity as the original channel Q [47], along with the fact that the average fidelity of the depolarizing channel is given by where we have used the normalization ∫dψ = 1 and the notation p for the depolarizing parameter. Using the above Lemmas 5 and 6, we now establish a similar lower bound to that in Lemma 5 for the uniform Haar twirl.
Theorem 4.
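Assume that we are given two CPTP maps Q_{A→A} and S_{A→A}. If the generalized distinguishability measure D satisfies the joint convexity property, then D(Q, S) is lower bounded by some function l_D of the channel entanglement fidelities F_e(Q) and F_e(S), as follows
$$ D(\mathcal{Q}, \mathcal{S}) \ge l_D\big(F_e(\mathcal{Q}), F_e(\mathcal{S})\big) , $$
where the specific form of the function l_D depends on the choice of the generalized distinguishability measure and is determined by the uniform Haar twirls, as follows
$$ l_D\big(F_e(\mathcal{Q}), F_e(\mathcal{S})\big) := D(\tilde{\mathcal{Q}}, \tilde{\mathcal{S}}) , $$
where $\tilde{\mathcal{Q}} := \int_{U(d)} dU\, \mathcal{U}^\dagger \circ \mathcal{Q} \circ \mathcal{U}$, and similarly for S̃. The inequality is saturated if the joint convexity property is saturated for a set of unitary 2-designs and a uniform probability distribution over this set.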
Proof. This is a direct consequence of applying Lemma 5 to any unitary 2-design {U_A(x)}_{x∈X}, e.g. the unitary representation of the Clifford group (see Appendix A.2), along with a uniform distribution on X, and then using the property of unitary 2-designs in Eq. (126), which finally yields
$$ D(\mathcal{Q}, \mathcal{S}) \ge D(\tilde{\mathcal{Q}}, \tilde{\mathcal{S}}) , \tag{144} $$
where $\tilde{\mathcal{Q}} = \int_{U(d)} dU\, \mathcal{U}^\dagger \circ \mathcal{Q} \circ \mathcal{U}$, and similarly for S̃. The proof is completed by applying Lemma 6 and plugging the depolarizing channels into the lower bound in Eq. (144). ■
It directly follows from this proof that the image of the function l_D coincides with the image of the corresponding generalized distinguishability measure D.
Remark 8. The lower bound proof does not require faithfulness, symmetry, or the triangle inequality, which would also make D a generalized distance measure. However, the triangle inequality becomes necessary when deriving an upper bound for the generalized distinguishability measure for concatenated noisy channels (or gates), as it is relevant to fault-tolerant quantum computing (see Appendix C for more details).
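The explicit function l_D is not reproduced here. As a related sanity check in the same spirit, the sketch below verifies an elementary fidelity-based lower bound for the trace-distance case: the trace distance between the normalized Choi states of two channels, which itself lower bounds the maximization in Eq. (115), is never smaller than the gap |F_e(Q) − F_e(S)|, since F_e is the overlap of the Choi state with the maximally entangled state. The two amplitude-damping channels and their parameters are assumptions chosen for illustration.

```python
import numpy as np

def amplitude_damping_kraus(theta):
    """Kraus operators of the amplitude-damping channel with damping probability theta."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)
    return [K0, K1]

def choi_state(kraus, d=2):
    """Normalized Choi state (id (x) Q)(|Phi><Phi|), with |Phi> = sum_i |ii> / sqrt(d)."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    rho = np.zeros((d * d, d * d), dtype=complex)
    for K in kraus:
        v = np.kron(np.eye(d), K) @ phi
        rho += np.outer(v, v.conj())
    return rho

def entanglement_fidelity(kraus, d=2):
    """F_e = <Phi| (id (x) Q)(|Phi><Phi|) |Phi>."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return float(np.real(phi.conj() @ choi_state(kraus, d) @ phi))

def trace_distance(rho, sigma):
    """(1/2) * ||rho - sigma||_1 for Hermitian inputs."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

Q = amplitude_damping_kraus(0.1)
S = amplitude_damping_kraus(0.3)

fe_Q = entanglement_fidelity(Q)
fe_S = entanglement_fidelity(S)
fidelity_gap = abs(fe_Q - fe_S)
choi_distance = trace_distance(choi_state(Q), choi_state(S))
```

Comparing `choi_distance` with `fidelity_gap` shows how much room is left between this simple bound and the distance evaluated at the maximally entangled input.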

B Comment on The Chaining Property
In the quantum computing literature, one encounters the chaining property for distance measures [110], which is useful for computing upper bounds on error propagation in fault-tolerant quantum computing. This property is framed as follows: Assume we want to apply two maps Q and S in series; however, we only have access to their noisy versions, which we denote by Q′ and S′, respectively. If the generalized distance measure D also satisfies the DPI (i.e. D is also a distinguishability measure), then the chaining property reads for all ρ ∈ D(H). This is interpreted by saying that the error due to a consecutive application of two faulty channels is no larger than the sum of the errors of applying each of the faulty channels separately. The proof follows by first applying the triangle inequality to the left-hand side of the above inequality, followed by the data-processing inequality. Therefore, the desirable chaining property is derivative from other, more fundamental, properties of D.
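The two proof steps (triangle inequality, then DPI) can be traced numerically for the trace distance. In the sketch below, the "ideal" and "faulty" maps are all amplitude-damping channels with slightly different parameters; this choice, the input state, and the variable names are assumptions for illustration, and the final inequality is the state-level bound those two steps produce.

```python
import numpy as np

def amplitude_damping_kraus(theta):
    """Kraus operators of the amplitude-damping channel with damping probability theta."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)
    return [K0, K1]

def apply(kraus, rho):
    """Apply a channel given by Kraus operators to a density matrix."""
    return sum(K @ rho @ K.conj().T for K in kraus)

def trace_distance(rho, sigma):
    """(1/2) * ||rho - sigma||_1 for Hermitian inputs."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

# Ideal maps Q, S and their faulty versions Qp, Sp (hypothetical parameter values).
Q, Qp = amplitude_damping_kraus(0.10), amplitude_damping_kraus(0.12)
S, Sp = amplitude_damping_kraus(0.20), amplitude_damping_kraus(0.25)

rho = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # input state |+><+|

# Left-hand side: faulty sequence versus ideal sequence.
lhs = trace_distance(apply(Sp, apply(Qp, rho)), apply(S, apply(Q, rho)))

# Step 1 (triangle inequality): insert the intermediate state Sp(Q(rho)).
middle = trace_distance(apply(Sp, apply(Qp, rho)), apply(Sp, apply(Q, rho)))
err_S = trace_distance(apply(Sp, apply(Q, rho)), apply(S, apply(Q, rho)))

# Step 2 (DPI): removing the common channel Sp cannot decrease the distance.
err_Q = trace_distance(apply(Qp, rho), apply(Q, rho))
```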

C Upper-Bounding Generalized Distance Measures for State Recovery
Here we present upper bounds on generalized distance and distinguishability measures, showing how they get modified when limited knowledge about the noise parameter θ ∈ Θ is available, both for the single-cycle and multi-cycle cases. Similar to the chaining property, the derivation of upper bounds on generalized distance and distinguishability measures is important for the analysis of error propagation in noisy quantum processes.

C.1 Single-Cycle Case
Consider the distance measure D and assume that for all ρ ∈ D(C) ⊆ D(H), approximate recovery from the noise N_θ is possible in the presence of perfect information about θ, i.e. there exists R_θ such that Now consider the distance measure D(I θ θ (ρ), ρ), where θ̂ is the best unbiased estimate of θ ∈ Θ. Our goal is to bound this quantity from above by two terms: the first depends on how well we can bound the same distance measure when given perfect knowledge about θ (see Eq. (146)), and the second should measure our lack of knowledge of the noise parameter θ. This intuition is validated by applying the triangle inequality, as follows where the second inequality follows from the DPI of D, the third follows from the definition of the generalized distance for channels, and the fourth from the assumption of Eq. (146). It is worth noting that one can derive a similar upper bound using the recoveries, rather than the noisy channels. The advantage of this approach is that we do not need to assume that D satisfies the DPI, i.e. it suffices for D to be a distance measure. To see how, we simply apply the triangle inequality where we have used the definition of a distance measure between channels for the second inequality, as well as Eq. (146). We will shortly show that the DPI becomes necessary when considering the multi-cycle case.

C.2 Multi-Cycle Case
Let us now extend the upper bound previously derived in the single-cycle case to adaptive multi-cycle recovery. Using the shorthand notation where and applying the triangle inequality, we get The second term could be bounded from above by the individual errors {ϵ_θi}_{i=1}^n, using only the triangle inequality, as follows We assume that so that the n-th step approximate recovery with perfect knowledge of θ would be possible, in principle. This leads to = D(I θn θn Substituting this result back into Eq. (156), we get and repeating the above two steps yields The first term in Eq. (155) is a new error term due to the real-time (drift-adapting) nature of our setup. This term can be bounded from above using the chaining property and the DPI, as follows Repeating the above steps n − 1 times, we arrive at the upper bound (166) Combining Eqs. (161) and (166) with Eq. (155), we get which generalizes Eq. (150) for real-time approximate recovery. This result says that, if AQEC is possible in principle (see Eq. (157)) when perfect knowledge of θ is available, then AQEC is also possible when knowledge about θ is limited. As we have shown, this holds for both the single-cycle and multi-cycle regimes.
Alternatively, we can derive an upper bound that is a function of the recoveries, rather than the noisy channels. This is accomplished as follows where the second term is similarly bounded from above by Σ_{i=1}^n ϵ_θi, based only on the triangle inequality (see Eq. (161)). We now upper bound the first term in the above inequality as where we have used the triangle inequality for the first inequality and the DPI and the definition of generalized distance measure between channels for the second inequality. By repeating these two steps n − 1 times, we arrive at Consequently, Eq. (168) yields and {|µ⟩} is a basis set of the Hilbert space H_S.
Proof. We consider the dynamics of the combined memory-spectator (MS) system, and express it in terms of the Choi matrix of the mother channel Z M S θ , as follows [115] then the reduced dynamics of the spectator system yields Therefore, for the reduced dynamics to be independent of the noise parameter θ_α for some α = 1, ..., p and any joint input state ρ_MS, we must have for all θ ∈ Θ^p. We now derive a necessary and sufficient condition for this equality to hold, in terms of the Choi matrix Γ Z θ M S,M ′ S ′ of the mother channel. We start by recalling the formula for the Choi matrix of the composite channel in terms of the Choi matrices of the individual channels [115] Γ We now compute the basis-dependent matrix in the separable memory-spectator basis |i⟩_MS ≡ |i(a, µ)⟩_MS = |a⟩_M |µ⟩_S of the Hilbert space H_M ⊗ H_S, as follows where |Γ⟩_{S′′S′} = Σ_µ |µ⟩_{S′′} |µ⟩_{S′} is the maximally entangled state in the special spectator basis {|µ⟩}. In the same {|a⟩⊗|µ⟩} basis, the partial transpose yields where Γ T S ′′ S ′ is the Choi matrix of the partial transpose channel.
It has been shown in [johnston2011quantum] that the Choi matrix Γ Substituting back into Eq. (180) leads to Therefore, Eq. (177) holds if and only if The last equation in the proof can be equivalently written as (using the identity P sym S ′′ S ′ + (P sym S ′′ S ′ ) ⊥ = I S ′′ S ′ ) Remark 9. Note that the condition in Eq. (191) is weaker than for all θ ∈ Θ^p, which holds when the mother channel Z M S θ itself does not depend on the noise parameter θ_α, and hence trivially also the reduced channel

E Quantum Fisher Information Matrix
We review the relevant definitions and results regarding the quantum Fisher information matrix (QFIM) and the partial QFIM, following [63,65].

E.1 Useful Definitions
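For a family of parameterized quantum states {ρ_θ}_{θ∈Θ} with a p-dimensional parameter space Θ^p ⊆ R^p, we define the symmetric inner product between two linear operators A and B, with respect to the parameterized family of states, as
$$ \langle A, B \rangle_{\rho_\theta} := \tfrac{1}{2} \mathrm{Tr}\big[\rho_\theta \{A^\dagger, B\}\big] , \tag{196} $$
where {a, b} := ab + ba is the anti-commutator. The symmetric logarithmic derivative (SLD) is a Hermitian operator L_{θ;α} that is defined by the solution to the Lyapunov-type equation [128]
$$ \frac{\partial}{\partial \theta_\alpha} \rho_\theta =: \tfrac{1}{2} \{\rho_\theta, L_{\theta;\alpha}\} . \tag{197} $$
The SLD QFIM corresponding to the parameterized family of states is defined as the p × p matrix
$$ [I_\theta]_{\alpha\beta} := \langle L_{\theta;\alpha}, L_{\theta;\beta} \rangle_{\rho_\theta} = \tfrac{1}{2} \mathrm{Tr}\big[\rho_\theta \{L_{\theta;\alpha}, L_{\theta;\beta}\}\big] . \tag{198} $$
Next, assume that a quantum measurement of an observable X is performed on ρ_θ, described by a POVM Π ≡ {Π_x}_{x∈X}. This yields the statistics p_X(x|θ) = Tr[ρ_θ Π_x] for the measurement outcomes x ∈ X. We define an estimate θ̂ : X → Θ^p as a mapping from the set of measurement outcomes to the parameter space. We say that the pair (Π, θ̂) is an estimator, and call it unbiased if
$$ E_\theta[\hat\theta(X)] := \sum_{x\in\mathcal{X}} p_X(x|\theta)\, \hat\theta(x) = \theta \quad \text{for all } \theta \in \Theta^p . $$
In general, such an estimator does not exist for all θ ∈ Θ^p. Instead, it is customary to use a weaker condition on our estimator, namely that it is locally unbiased. This is defined as follows: at a fixed θ, we require that the following two conditions are satisfied
$$ E_\theta[\hat\theta_\alpha(X)] = \theta_\alpha \quad \text{and} \quad \frac{\partial}{\partial \theta_\beta} E_\theta[\hat\theta_\alpha(X)] = \delta_{\alpha\beta} . \tag{201} $$
Finally, we define the mean-square error (MSE) matrix corresponding to an estimator (Π, θ̂) as follows
$$ \mathrm{Var}_\theta[\Pi, \hat\theta] := E_\theta\big[(\hat\theta(X) - \theta)(\hat\theta(X) - \theta)^T\big] , \tag{202} $$
with the (α, β) entry of this matrix given by
$$ \big[\mathrm{Var}_\theta[\Pi, \hat\theta]\big]_{\alpha\beta} = \sum_{x\in\mathcal{X}} p_X(x|\theta)\, \big(\hat\theta_\alpha(x) - \theta_\alpha\big)\big(\hat\theta_\beta(x) - \theta_\beta\big) . \tag{203} $$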
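For a single parameter, the Lyapunov-type equation for the SLD can be solved in the eigenbasis of ρ_θ, where it reads L_ij = 2(∂ρ)_ij / (λ_i + λ_j). The sketch below does this numerically and evaluates the SLD quantum Fisher information Tr[ρ_θ L^2]. The example family (a spectator qubit prepared in |1⟩ and subject to amplitude damping with probability θ, giving ρ_θ = diag(θ, 1 − θ)) is an assumption for illustration and is not taken from the main text.

```python
import numpy as np

def sld(rho, drho, tol=1e-12):
    """Solve drho = (rho L + L rho)/2 for L in the eigenbasis of rho.

    Matrix elements between eigenvectors whose eigenvalues sum to (numerically)
    zero are set to zero, which is a common convention for rank-deficient states.
    """
    vals, vecs = np.linalg.eigh(rho)
    d_eig = vecs.conj().T @ drho @ vecs
    denom = vals[:, None] + vals[None, :]
    L_eig = np.zeros_like(d_eig)
    mask = denom > tol
    L_eig[mask] = 2.0 * d_eig[mask] / denom[mask]
    return vecs @ L_eig @ vecs.conj().T

def qfi(rho, drho):
    """Single-parameter SLD quantum Fisher information Tr[rho L^2]."""
    L = sld(rho, drho)
    return float(np.real(np.trace(rho @ L @ L)))

# Hypothetical spectator family: rho_theta = diag(theta, 1 - theta).
theta = 0.2
rho = np.diag([theta, 1 - theta]).astype(complex)
drho = np.diag([1.0, -1.0]).astype(complex)  # derivative of rho_theta w.r.t. theta

L = sld(rho, drho)
qfi_value = qfi(rho, drho)
lyapunov_residual = np.linalg.norm(0.5 * (rho @ L + L @ rho) - drho)
```

For this diagonal family the result reduces to the classical Fisher information of a Bernoulli variable, 1/(θ(1 − θ)).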

E.2 Saturation of QCRB
The quantum Cramér-Rao bound (QCRB) provides a lower bound to the variance matrix defined in Eq. (202) using the QFIM in Eq. (198) [63,129]
$$ \mathrm{Var}_\theta[\Pi, \hat\theta] \ge I_\theta^{-1} . $$
The QCRB holds for any locally unbiased estimator, and is a direct consequence of applying the Cauchy-Schwarz inequality for the inner product defined in Eq. (196).
Here, we are interested in the saturation condition for this inequality. A necessary and sufficient condition for the saturation of the multi-parameter QCRB is given by (see e.g. [65,130]) (205) To design the optimal measurements for the saturation of the multi-parameter QCRB, we proceed as follows: (i) find SLDs {L_{θ;α}} that mutually commute, (ii) using the matrices I_θ and {L_{θ;α}}, construct the (commuting) linear combinations L̃_{θ;α} := for all α = 1, ..., p, and (iii) write the spectral decomposition of the mutually commuting operators for α = 1, ..., p, where {P_i} are the projectors onto the simultaneous eigenspaces of {L̃_{θ;α}} (or equivalently of {L_{θ;α}}). Then, the QCRB is saturated if we pick the locally unbiased estimator (Π, θ̂) to be [63] Π_x ≡ P_{i=x}, (208) θ̂_α(x) ≡ θ_α + l_{αx}. (209) In the case of a single-parameter family, this yields the locally unbiased estimator where p(x|θ) = Tr[ρ_θ Π_x], which explicitly depends on θ. Although the optimal measurements described above saturate the QCRB, they require prior knowledge of the noise parameters, which defeats the point of implementing spectator systems. In the single-parameter case, this can be remedied.
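In the single-parameter case, the saturation statement can be checked numerically: a projective measurement in the eigenbasis of the SLD (evaluated at the true parameter value) has a classical Fisher information equal to the QFI. The sketch below does this for a hypothetical spectator qubit prepared in |+⟩ and subject to amplitude damping with probability θ, and compares against a fixed computational-basis measurement; the state family and the value θ = 0.3 are assumptions for illustration.

```python
import numpy as np

def sld(rho, drho, tol=1e-12):
    """Solve drho = (rho L + L rho)/2 for L in the eigenbasis of rho."""
    vals, vecs = np.linalg.eigh(rho)
    d_eig = vecs.conj().T @ drho @ vecs
    denom = vals[:, None] + vals[None, :]
    L_eig = np.zeros_like(d_eig)
    mask = denom > tol
    L_eig[mask] = 2.0 * d_eig[mask] / denom[mask]
    return vecs @ L_eig @ vecs.conj().T

def state_and_derivative(theta):
    """|+><+| after amplitude damping with probability theta, and its theta-derivative."""
    c = np.sqrt(1 - theta)
    rho = 0.5 * np.array([[1 + theta, c], [c, 1 - theta]], dtype=complex)
    drho = np.array([[0.5, -0.25 / c], [-0.25 / c, -0.5]], dtype=complex)
    return rho, drho

def classical_fisher(basis, rho, drho):
    """Fisher information of a projective measurement onto the columns of `basis`."""
    probs = np.real(np.einsum('ix,ij,jx->x', basis.conj(), rho, basis))
    dprobs = np.real(np.einsum('ix,ij,jx->x', basis.conj(), drho, basis))
    return float(np.sum(dprobs**2 / probs))

theta = 0.3
rho, drho = state_and_derivative(theta)
L = sld(rho, drho)
qfi_value = float(np.real(np.trace(rho @ L @ L)))

_, sld_basis = np.linalg.eigh(L)   # optimal basis; it depends on the true theta
fisher_sld = classical_fisher(sld_basis, rho, drho)
fisher_z = classical_fisher(np.eye(2, dtype=complex), rho, drho)
```

The gap between `fisher_z` and `qfi_value` is the price of using a θ-independent measurement that is not adapted to this family, which is the issue the next subsection addresses.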

E.2.1 Parameter-Independent Estimator
Nagaoka has shown that, in the single-parameter case, a θ-independent locally unbiased estimator exists that saturates the QCRB [130]. This is possible only for an exponential family {ρ_θ}_{θ∈Θ} of parameterized states [131]: θ. As such, the SLD of this parametric family is given by L_θ = ψ(θ)(O − θ), which guarantees that the Cauchy-Schwarz inequality for the two vectors L_θ and O − θ is saturated, and hence the saturation of the QCRB [129,130]. The optimal measurement POVM (as described above) is given by the (parameter-independent) eigenvectors of O. Therefore, we see that achieving an exponential family of states, as defined in Eq. (211), for the output states ψ → M_θ(ψ) of the spectator system is generally helpful for our application. Finally, note that for non-full-rank parameterized density matrices, the optimal measurements described above are not unique.

E.2.2 Maximum Likelihood Estimator
The maximum likelihood estimator θ̂_MLE corresponding to the choice of POVM Π ≡ {Π_x}_{x∈X} is defined as
$$ \hat\theta_{\mathrm{MLE}}(x) := \arg\max_{\theta \in \Theta} p(x|\theta) , $$
where p(x|θ) = Tr[ρ_θ Π_x] for all x ∈ X. Although the above definition seems natural, the MLE is known to be a biased estimator for a general parametric family {ρ_θ}_{θ∈Θ}. However, the MLE becomes unbiased, and further, saturates the classical CRB in the asymptotic limit [132]. We recall that a necessary condition for the saturation of the QCRB is that the classical and quantum Fisher informations must coincide [65]. Hence, the MLE is also relevant for the asymptotic saturation of the QCRB. In the context of our article, the asymptotic limit necessarily implies that the spatial dependence of the noise parameter cannot be neglected, as we are performing quantum parameter estimation on a large number of spectator qubits that must be spatially distributed within the quantum memory device. Therefore, to retain the spatial homogeneity assumption of the noise parameters used in the main text, we refrain from considering the asymptotic saturation of the QCRB. Hence, the MLE choice is inappropriate within the context of our manuscript, as it is a biased estimator in the non-asymptotic regime.
where the upper block diagonal matrix I θ I,I is p_I × p_I and the lower block diagonal matrix I θ N,N is of size p_N × p_N. We also write the inverse of the SLD QFIM in a similar block form which is a modification of the standard QCRB when nuisance parameters are present.

F Choi Matrix Methods for Entanglement Fidelity of Composite Parameterized Channels
In what follows, we present a useful lemma for the entanglement fidelity of composite parameterized channels and then dedicate the rest of this appendix to demonstrating its wide range of applicability in the context of the main text.

Lemma 8. The entanglement fidelity of the compos-
where T indicates matrix transposition.
Proof. By definition, the entanglement fidelity of the composite channel can be written in terms of its Choi matrix, as follows where H_A and H_A′ are isomorphic Hilbert spaces. One can easily verify that the Choi matrix of the composite channel can be written in terms of the Choi matrices of the individual channels, as follows [115] Γ where T_B is the partial transpose defined with respect to the same basis as the maximally entangled state |Γ⟩. By substituting this form into the entanglement fidelity formula, we arrive at Next, we make standard simplifications for any Λ_AB: Tr This yields for Λ_AB ≡ T_B Γ N AB the following Substituting back into the entanglement fidelity formula completes the proof. ■
Therefore, according to this lemma, the entanglement fidelity of the memory dynamics is generally written as follows for any recovery map In particular, this implies that we can search for a recovery map R_ϕ that is parameterized by some number of parameters ϕ and maximize over them (for a fixed θ) to arrive at an optimal choice ϕ(θ), see e.g. [90] in terms of the natural representation of quantum channels (which is related to, but not the same as, the Choi representation adopted in this article).
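The idea of evaluating the entanglement fidelity of a composite channel directly from the Choi matrices of its factors can be illustrated numerically. The sketch below uses one index convention (unnormalized Choi matrix Γ = Σ_ij |i⟩⟨j| ⊗ N(|i⟩⟨j|), input system first) and contracts Γ^N with Γ^R; this convention may differ from the one used in the lemma by a transposition or a reordering of subsystems. The result is compared against the Kraus-operator formula F_e = (1/d^2) Σ_ab |Tr(R_a N_b)|^2. The "recovery" map here is only a stand-in CPTP map chosen for the example, not an actual recovery for the [4,1] code.

```python
import numpy as np

def amplitude_damping_kraus(theta):
    """Kraus operators of the amplitude-damping channel with damping probability theta."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)
    return [K0, K1]

def choi(kraus, d_in):
    """Unnormalized Choi matrix sum_ij |i><j| (x) N(|i><j|), input system first."""
    d_out = kraus[0].shape[0]
    gamma = np.eye(d_in).reshape(d_in * d_in)  # unnormalized sum_i |i>|i>
    G = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for K in kraus:
        v = np.kron(np.eye(d_in), K) @ gamma
        G += np.outer(v, v.conj())
    return G

def fe_from_choi(choi_N, choi_R, d_a, d_b):
    """F_e(R o N) = (1/d_a^2) sum_{ijkl} Gamma^N[(i,k),(j,l)] * Gamma^R[(k,i),(l,j)]."""
    GN = choi_N.reshape(d_a, d_b, d_a, d_b)
    GR = choi_R.reshape(d_b, d_a, d_b, d_a)
    return float(np.real(np.einsum('ikjl,kilj->', GN, GR)) / d_a**2)

def fe_from_kraus(kraus_N, kraus_R, d_a):
    """F_e(R o N) = (1/d_a^2) sum_{a,b} |Tr(R_a N_b)|^2."""
    return float(sum(abs(np.trace(R @ N)) ** 2 for R in kraus_R for N in kraus_N) / d_a**2)

noise = amplitude_damping_kraus(0.2)      # N_theta with theta = 0.2
stand_in = amplitude_damping_kraus(0.1)   # placeholder for a recovery map

fe_choi = fe_from_choi(choi(noise, 2), choi(stand_in, 2), 2, 2)
fe_kraus = fe_from_kraus(noise, stand_in, 2)
```

Because the contraction is linear in the Choi matrix of the second map, the same function can serve as the objective when optimizing over a parameterized family of recovery maps for a fixed noise channel.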

F.1 Zeroth Derivative: Hölder Type Upper Bounds
A useful upper bound on the entanglement fidelity in Eq. (40) of the quantum memory dynamics can be given in terms of the Choi state of the recovery map, by applying the Hölder inequality [115], as follows

F.2 First Derivative: Bound on Robustness of a Recovery Map
Consider, for a given recovery map R, the partial derivatives of the entanglement fidelity in Eq. (234) with respect to the components of the noise parameter vector θ. Using the inner product definition in Eq. (196) and the definition of the SLD operator for the Choi state of N_θ, as given in Eq. (197), we have for the partial derivatives Using the fact that the robustness of R (which is the derivative of the entanglement fidelity with respect to the parameters pertaining to N_θ) is written in terms of the inner product in Eq. (196), we can use the Cauchy-Schwarz inequality to arrive at an upper bound, as follows which yields the upper bound

1/α , where α ∈ [1, ∞), are often used to bound trace quantities, via the well-known Hölder inequality [115] ∥XZ∥ where 1/α + 1/β = 1 (a pair (α, β) satisfying this equality is called a Hölder pair). It is easy to show that the following statement also applies [115] | Tr For α ∈ [0, 1), the Schatten norm ∥ • ∥_α is no longer a norm (e.g., it does not satisfy the triangle inequality). However, if Z > 0 (along with 0 ≤ α < 1), then the above inequality is reversed [133] (the Hölder dual β becomes negative) We can use this inequality to find a lower bound on the difference between the entanglement fidelities of any two recovery maps R B→A , RB→A for the parameterized noise channel N^{A→B}_θ, as follows Next, we use a theorem relating the Choi matrix of a quantum channel Q_A to its adjoint [134] where T is the partial transpose channel, and its Choi matrix yields a SWAP unitary [134]. Even though ∥ • ∥_α is not a norm for α ∈ [0, 1), its definition is still invariant with respect to a unitary transformation and transposition [115]. This yields Substituting back into Eq. (252) yields the inequality in Theorem 1.
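As a quick sanity check of the standard (forward) Hölder inequality in its trace form, |Tr(XZ)| ≤ ∥X∥_α ∥Z∥_β for a Hölder pair with α > 1, the following sketch evaluates Schatten norms from singular values. It does not cover the reversed inequality for α ∈ [0, 1); the function names are illustrative assumptions.

```python
import numpy as np


def schatten_norm(x, alpha):
    """Schatten alpha-norm: the l_alpha norm of the singular values of x."""
    s = np.linalg.svd(x, compute_uv=False)
    return float((s ** alpha).sum() ** (1.0 / alpha))


def holder_gap(x, z, alpha):
    """Return ||x||_alpha * ||z||_beta - |Tr(x z)| for the Hölder pair
    (alpha, beta) with 1/alpha + 1/beta = 1; requires alpha > 1."""
    beta = alpha / (alpha - 1.0)
    return schatten_norm(x, alpha) * schatten_norm(z, beta) - abs(np.trace(x @ z))


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
for alpha in (1.5, 2.0, 3.0):
    print(alpha, holder_gap(x, z, alpha))  # non-negative up to rounding
```

For α = β = 2 and Z = X†, the gap vanishes, which recovers the saturation condition of the Cauchy-Schwarz inequality for the Hilbert-Schmidt inner product.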

G Sufficient Condition For a Negligible Remainder Term in Theorem 2
To quantify the "smallness" of $\hat{\theta} - \theta$, for the remainder term in Eq. (56) of Theorem 2 to be negligible (given the noise channel N_θ), we first use Eq. (62) to arrive at a useful bound, as follows: where we have used the Hölder inequality [115], the invariance of the trace norm ∥•∥_1 under transposition, and the unit trace of the Choi state Φ^{N_θ}_{AB}, in the third, fourth, and fifth lines, respectively. Then, the above bound implies that the remainder term in Eq. (56) of Theorem 2 is negligible if the following sufficient condition holds.
(260) We can rewrite this condition as where the matrix elements θ,σ , describe the effective dynamics of the Bloch coefficients for the encoded single qubit [80]. In [79], the authors derived an analytical formula for the entanglement fidelity F_e(R_θ ∘ N_θ) as where τ = 1 − θ and α, β (with |α|² + |β|² = 1) are the complex parameters that the recovery channel depends on. The optimum recovery channel R_θ(α, β) = R(α(θ), β(θ)) in [12] is the one that maximizes the entanglement fidelity with respect to α, β for the given value of the noise parameter θ.
(265) By simple substitution, we can check that (ψ, ϕ) = (0, 0) is the pair that maximizes the entanglement fidelity function. Now let us find an analytical formula for the incomplete knowledge scenario $F_e(\mathcal{R}_{\hat{\theta}} \circ \mathcal{N}_\theta)$. Note that in Eq. (263), the dependence of the recovery $\mathcal{R}_{\hat{\theta}}(\alpha, \beta) = \mathcal{R}(\alpha(\hat{\theta}), \beta(\hat{\theta}))$ on the estimated noise parameter $\hat{\theta}$ enters only through α and β [12]. Therefore, we can use the optimum values $|\alpha_{\rm opt}(\hat{\theta})| = 1/\sqrt{1 + \tau(\hat{\theta})}$ and ψ_opt = ϕ_opt = 0 and plug them back into Eq. (263), which yields (266) This yields for arbitrary finite differences $\hat{\theta} - \theta$ and any estimate $\hat{\theta}$ the following exact formula When adaptation is implemented, the following derivative is relevant (268) On the other hand, if no adaptation is implemented, the relevant derivative becomes where q(x, y) ≡ which satisfies q(x, 1) = 1 in the adaptive regime η = θ. Therefore, further algebraic simplifications yield is the Lagrange form of the Taylor series expansion remainder of F_e(|α_opt(θ + ν)|; θ) with respect to ν, where $\nu_0 \in [0, \hat{\theta} - \theta]$ is a constant. Finally, taking the expectation of both sides with respect to the spectator system's probability distribution function p_X(x|θ) (where x is the measurement outcome of the spectator observable $X = \sum_{x\in\mathcal{X}} x\,\Pi_x$) yields where and we have used the fact that $\hat{\theta}$ is an unbiased estimate of θ.

I χ-Matrix Representation of Quantum Channels
Besides the well-known Kraus and Stinespring representations of a CPTP map, a lesser-known representation, called the χ-matrix representation [135], is also useful in practice. This is most commonly used in quantum state tomography [37] and is extended to quantum process tomography [136], where state tomography of the Choi state of a quantum channel is conducted. This is to be contrasted with other approaches to measuring noise, such as randomized benchmarking [137] and QEC itself [138]. Interestingly, the χ-matrix representation can be well motivated in the context of QEC by noting that we can rewrite the "error operators" {Q_i} of any noisy map , where d ≡ dim H. It is particularly useful to pick one of the basis elements, e.g. B_0, as the "desirable" error (such as being proportional to the unit matrix). Consequently, the coefficient associated with this error component indicates how likely it is that the given Kraus operators of the noisy map will change the state of our quantum system in a "desirable way". An additional benefit of the χ-matrix representation is that the effects of channel twirling are especially clear [122], as "diagonalization" with respect to the generalized Pauli group. Therefore, the rest of the appendix is devoted to recalling the χ-matrix representation in a self-contained way.
Recall that every CP map Q^{A→B} admits a Kraus decomposition $\mathcal{Q}(\rho) = \sum_{i=1}^{K} Q_i \rho Q_i^\dagger$ in terms of Kraus operators $\{Q_i\}_{i=1}^{K}$ satisfying $\sum_i Q_i^\dagger Q_i \le I$, where the equality holds for TP maps. Let us consider a CP map Q^{A→A} ≡ Q, where by denoting d ≡ dim(H_A), we can decompose each of the Kraus operators $\{Q_i\}_{i=1}^{K}$ as a linear combination of some orthonormal operator basis $\{B_k\}_{k=0}^{d^2-1}$ in L(H_A), as follows

$$Q_i = \sum_{k=0}^{d^2-1} \langle B_k, Q_i\rangle\, B_k , \qquad (275)$$

where ⟨B_k, B_l⟩ = δ_kl, with ⟨·,·⟩ being the Hilbert-Schmidt inner product in L(H_A). One could take B_k ≡ B_(m,n) = |m⟩⟨n| for m, n ∈ {1, ..., d}, which is known as the standard basis in L(H_A). Substituting Eq. (275) in the Kraus representation of Q, we get

$$\mathcal{Q}(\rho) = \sum_{k,l=0}^{d^2-1} \chi^{Q}_{kl}\, B_k \rho B_l^\dagger ,$$

where

$$\chi^{Q}_{kl} \equiv \sum_{i=1}^{K} \langle B_k, Q_i\rangle \langle B_l, Q_i\rangle^{*}$$

is called the χ matrix of the CP map Q. It is easy to see that the χ matrix is a positive semi-definite matrix. This matrix has d⁴ complex entries, corresponding to the matrix entries of the superoperator Q in the Liouville representation (see, e.g., [43,139]), namely where |B_k⟩⟩ is the d² × 1 vector corresponding to the d × d matrix B_k. The number of independent entries of the χ matrix is reduced from d⁴ to d⁴ − d² complex numbers if the CP map Q is also TP, since for each of the d² standard basis elements |n⟩⟨m| in L(H_A), the map Q must also preserve the trace, which leads to d² constraints.
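The construction of the χ matrix from a set of Kraus operators can be sketched numerically as follows. The example assumes the convention χ_kl = Σ_i ⟨B_k, Q_i⟩⟨B_l, Q_i⟩* with ⟨A, B⟩ = Tr(A†B), takes the normalized Pauli matrices as the orthonormal basis (so that B_0 = I/√d), and uses the single-qubit amplitude-damping channel as a test case; the function names are illustrative and not from the article.

```python
import numpy as np


def hs_inner(a, b):
    """Hilbert-Schmidt inner product <A, B> = Tr(A^dagger B)."""
    return np.trace(a.conj().T @ b)


def pauli_basis():
    """Orthonormal operator basis {I, X, Y, Z} / sqrt(2) for one qubit."""
    i2 = np.eye(2, dtype=complex)
    x = np.array([[0, 1], [1, 0]], dtype=complex)
    y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    z = np.array([[1, 0], [0, -1]], dtype=complex)
    return [m / np.sqrt(2) for m in (i2, x, y, z)]


def chi_matrix(kraus, basis):
    """chi[k, l] = sum_i <B_k, Q_i> * conj(<B_l, Q_i>)."""
    coeffs = np.array([[hs_inner(b, q) for b in basis] for q in kraus])
    return coeffs.T @ coeffs.conj()


def apply_chi(chi, basis, rho):
    """Channel action Q(rho) = sum_kl chi[k, l] B_k rho B_l^dagger."""
    n = len(basis)
    return sum(chi[k, l] * basis[k] @ rho @ basis[l].conj().T
               for k in range(n) for l in range(n))


theta = 0.2  # damping probability (illustrative value)
kraus = [np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex),
         np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)]
basis = pauli_basis()
chi = chi_matrix(kraus, basis)

print(np.real(np.trace(chi)))   # equals d = 2 for a TP map
print(np.real(chi[0, 0]) / 2)   # chi_00 / d, the entanglement fidelity
```

For this channel, χ_00/d evaluates to (1 + √(1 − θ))²/4, which is the known entanglement fidelity of amplitude damping on a maximally entangled input.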
In the context of QEC, it is convenient to choose our operator basis in L(H_A) such that B_0 indicates a "desired effect" on a quantum state. Here B_0 ≡ I/√d is desirable for QEC, but for other applications, B_0 could be chosen differently. Next, we write Eq. (275) for a fixed i = 1, ..., K, as

$$Q_i = \langle B_0, Q_i\rangle B_0 + \sum_{k=1}^{d^2-1} \langle B_k, Q_i\rangle B_k .$$

Then, by multiplying both sides on the left by $Q_i^\dagger$ and taking the trace, we arrive at

$$\langle Q_i, Q_i\rangle = |\langle B_0, Q_i\rangle|^2 + \sum_{k=1}^{d^2-1} |\langle B_k, Q_i\rangle|^2 ,$$

or equivalently,

$$\frac{|\langle B_0, Q_i\rangle|^2}{\langle Q_i, Q_i\rangle} + \sum_{k=1}^{d^2-1} \frac{|\langle B_k, Q_i\rangle|^2}{\langle Q_i, Q_i\rangle} = 1 .$$

By denoting $q_i^2 := \langle Q_i, Q_i\rangle$, $|\cos(\phi_i)| := |\langle B_0, Q_i\rangle|/q_i$, and $v_{i,k}|\sin(\phi_i)| := |\langle B_k, Q_i\rangle|/q_i$ with some real weights $\{v_{i,k}\}_{k=1}^{d^2-1}$ satisfying $\sum_{k=1}^{d^2-1} v_{i,k}^2 = 1$, we can rewrite the previous equation in a simple form

$$\cos^2(\phi_i) + \sin^2(\phi_i) = 1 \quad \text{for all } i = 1, \cdots, K , \qquad (282)$$

where the angle ϕ_i indicates how "close" the error Q_i is to the "desirable error" B_0. Note that, if Q is also TP, then $\sum_i q_i^2 = d$. Due to the unitary freedom of choosing the Kraus operators of any fixed quantum channel from its Stinespring dilation [37], the χ matrix is not uniquely determined. Therefore, using the phase freedom $Q_i \to Q_i e^{i\omega_i}$ (which is a special case of the unitary freedom mentioned above), we can always choose the phases ω_i for i = 1, ..., K such that the inner products ⟨B_0, Q_i⟩ with the basis element B_0 are all nonnegative. This means that we can pick ϕ_i ∈ [0, π/2]. Finally, the additional phase in ⟨B_k, Q_i⟩ can always be placed in the coefficients $v_{i,k}$, which leads to the final decomposition

$$Q_i = q_i \Big[ \cos(\phi_i)\, B_0 + \sin(\phi_i) \sum_{k=1}^{d^2-1} v_{i,k}\, B_k \Big] , \qquad (283)$$

where $\{v_{i,k}\}_{k=1}^{d^2-1}$ are now complex numbers with $\sum_{k=1}^{d^2-1} |v_{i,k}|^2 = 1$. Consequently,

$$\chi^{Q}_{00} = \sum_{i=1}^{K} q_i^2 \cos^2(\phi_i) , \qquad (284)$$

which is consistent with $\chi^{Q}_{00}/d = F_e(Q, I/d)$ [7,8]. Using Eq. (283) to compute $Q_i^\dagger Q_i$, taking the trace, and using the orthonormality of the operator basis $\{B_k\}_{k=0}^{d^2-1}$, we arrive at $\sum_i q_i^2 \le d$, where the inequality follows from $\sum_i Q_i^\dagger Q_i \le I$. Combined with Eq. (284), this implies that $0 \le \chi^{Q}_{00} \le d$, or equivalently $0 \le F_e(Q) \le 1$.

J Proof of Lemma 3
Here we derive an upper bound on the matrix element $\chi^{S\circ Q}_{00}$ of the composite channel S ∘ Q, given the corresponding χ matrix elements $\chi^{Q}_{00}$ and $\chi^{S}_{00}$ of the individual channels Q and S, respectively. The technique used for the following derivation is based on [43]. Given the Kraus operators $\{Q_i\}_{i=1}^{K(Q)}$ and $\{S_j\}_{j=1}^{K(S)}$ of the individual channels Q and S, the Kraus operators of the composite channel S ∘ Q are given by $\{S_j Q_i\}_{i,j}$ for i = 1, ..., K(Q) and j = 1, ..., K(S). Therefore, by using Eq. (283) for the individual Kraus operators and using the notation $\langle Q_i, Q_i\rangle = q_i^2$ and $\langle S_j, S_j\rangle = s_j^2$, we find for the Kraus operators of the composite channel $S_j Q_i = s_j q_i \cos(\phi^S_j)\cos(\phi^Q_i) B_0^2 + s_j q_i \sin(\phi^S_j)\cos(\phi^Q_i)$ By substituting into Eq. (284) and choosing the operator basis elements to be Hermitian (hence , we arrive at where we have denoted by By recalling that $c_{ij}, s_{ij} \ge 0$, since $\phi^Q_i, \phi^S_j \in [0, \pi/2]$, the second inequality yields By squaring both sides, we get where this inequality is saturated iff $v_{ij} = 1$ for all i = 1, ..., K(Q) and j = 1, ..., K(S). Substituting back into Eq. (287) and using the definitions of $\chi^{Q}_{00}$ and $\chi^{S}_{00}$ from Eq. (284), as well as the fact that $\sum_i q_i^2 = \sum_j s_j^2 = d$ for CPTP maps Q and S, we arrive at Next, we use the Cauchy-Schwarz inequality where the Cauchy-Schwarz inequality is saturated when the vectors are linearly dependent, therefore ] for all i = 1, ..., K(Q) and the function tan(x) is one-to-one in that region, it follows that the above inequality is saturated iff $\phi^Q_1 = \cdots = \phi^Q_{K(Q)}$. The exact same argument for the channel S yields

$$\sum_j s_j^2 \cos(\phi^S_j)\sin(\phi^S_j) \le \sqrt{\chi^{S}_{00}\,(d - \chi^{S}_{00})} , \qquad (298)$$

where the inequality is saturated iff $\phi^S_1 = \cdots = \phi^S_{K(S)}$. Substituting Eqs. (297) and (298) into Eq. (293), we arrive at or equivalently, Dividing both sides by d and redefining $\chi^{Q}_{00}/d \equiv \cos^2(\delta_Q)$ for $\delta_Q \in [0, \pi/2]$, as suggested in Eq. (14), we arrive at

$$\cos(\delta_{S\circ Q}) \le \cos(\delta_S)\cos(\delta_Q) + \sin(\delta_S)\sin(\delta_Q) .$$
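The final inequality, cos(δ_{S∘Q}) ≤ cos(δ_S − δ_Q), can be spot-checked numerically with the error angle defined through cos²(δ_Q) = F_e(Q) = Σ_i |Tr Q_i|²/d². The sketch below is a numerical illustration only, not a proof; the test channels (amplitude damping and random unitaries) and the function names are illustrative choices.

```python
import numpy as np


def error_angle(kraus):
    """Error angle delta in [0, pi/2] with cos^2(delta) = F_e = chi_00 / d."""
    d = kraus[0].shape[0]
    fe = sum(abs(np.trace(k)) ** 2 for k in kraus) / d**2
    return float(np.arccos(np.sqrt(min(fe, 1.0))))


def compose(s_kraus, q_kraus):
    """Kraus operators {S_j Q_i} of the composite channel S o Q."""
    return [s @ q for s in s_kraus for q in q_kraus]


def bound_holds(s_kraus, q_kraus, tol=1e-12):
    """Check cos(delta_{S o Q}) <= cos(delta_S - delta_Q)."""
    lhs = np.cos(error_angle(compose(s_kraus, q_kraus)))
    rhs = np.cos(error_angle(s_kraus) - error_angle(q_kraus))
    return bool(lhs <= rhs + tol)


def amplitude_damping(theta):
    return [np.array([[1, 0], [0, np.sqrt(1 - theta)]], dtype=complex),
            np.array([[0, np.sqrt(theta)], [0, 0]], dtype=complex)]


def random_unitary(d, rng):
    """A unitary from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return np.linalg.qr(a)[0]


rng = np.random.default_rng(1)
u = random_unitary(2, rng)
print(bound_holds(amplitude_damping(0.3), amplitude_damping(0.1)))
print(bound_holds([u.conj().T], [u]))  # saturated: S undoes Q exactly
```

The second call is an instance where the bound is tight: for S = Q† with Q unitary, both error angles coincide and the composite channel is the identity, so errors from the two steps cancel coherently.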
In other words, given $\chi^{Q}_{00}$ and $\chi^{S}_{00}$, the composite channel χ-matrix element $\chi^{S\circ Q}$ where is the Lagrange form of the remainder term, and x_0 ∈ [0, x] is a constant. Substituting for x using Eq. (308) yields is the accumulative remainder term. Next, we rewrite Theorem 4 as a recurrence inequality, where the contribution of the spectator system in each time step is clearly separated.
where we have used the Taylor expansions of the sine and cosine functions to the first order in b, and denoted their remainders in the Lagrange form by R_3(x) and R_4(x), respectively. Consequently, we have for Theorem 4 the separation where the first term is the interference between the error angles of the previous n − 1 cycles and that of the n-th cycle, with perfect knowledge at the n-th step. On the other hand, the second term shows the contribution of the lack of knowledge to the recurrence inequality at the n-th time step, for a fixed error angle δ_{1→(n−1)}, as (324) Note that the sign of this contribution depends on the difference between the error angles of the previous n − 1 cycles and the n-th cycle. Taking the expectation with respect to the probability distribution p_X(x_n|θ_n) and using Theorem 2, we arrive at the following:

Figure 1: Recovery with perfect knowledge (time flows from left to right). The quantum memory is prepared in the quantum state ρ. The recovery channel R_θ is implemented using perfect knowledge of the noise parameter θ ∈ Θ of the environment noise N_θ.

2. A general formulation of the spectator-based re-

(a) State Preparation: A desired state of the quantum memory M, and some metrologically useful state of the spectator system S, are prepared.
(b) Free Evolution: Due to the interaction with their joint environment, the states of the memory and the spectator evolve. The evolution of M is parameterized by θ, and the evolution of S by some function f(θ) of θ.
(c) Quantum Parameter Estimation: The spectator system is used as a real-time quantum sensor (probe) to find the best estimate $\hat{\theta}$ of the noise parameter θ.
(d) Post-Processing: The best estimate $\hat{\theta}$ is used to obtain a "best-guess" recovery map $\mathcal{R}_{\hat{\theta}}$, which is optimal given the incomplete knowledge of the true value of the noise parameter θ. This map is generally different from the truly optimal recovery map $\mathcal{R}_{\theta}$, corresponding to the parameterized dynamics of the quantum memory.
(e) Best-Guess Recovery: The "best-guess" recovery map $\mathcal{R}_{\hat{\theta}}$ is applied to the quantum memory, to recover (perfectly or approximately) the quantum information encoded in its initial state.
(f) Spectator System Recycling: The final step is to recycle the state of the spectator system to prepare it for the next recovery cycle.

Figure 3: Cartoon description of spectator-based recovery protocols (temporal order corresponds to the alphabetical order of the subfigures). The letters "M" and "S" stand for quantum memory and spectator system, respectively. The black dots represent the environment spins that contribute to the noise. This protocol combines two important disciplines of quantum information theory: quantum parameter estimation and recovery of quantum information.

Figure 4: The spectator (S) and memory (M ) systems are subject to a single environment, characterized by the parametric family of quantum channels {Z θ } θ∈Θ .The mother channel Z M S θ generates the two local channels N M S→M θ M

Figure 6: Spectator-based [4, 1] code of the amplitude-damping channel (time flows from left to right). A single logical qubit (second register) is encoded into four physical qubits using the encoding channel $\mathcal{E} : H_2 \to H_2^{\otimes 4}$ in Eqs. (71) and (72), where H_2 denotes the two-dimensional Hilbert space of a single-qubit system. The spectator system (first register) performs an unbiased estimate $\hat{\theta}(x)$ of the noise parameter θ using the POVM {Π_x}. The estimated value $\hat{\theta}$ is fed into the recovery operation described in detail in [12] that is adapted for the amplitude-damping channel.

Figure 8: Time flows from left to right. (a) Multi-cycle recovery protocol with perfect knowledge. (b) Multi-cycle recovery protocol with incomplete knowledge. The input state of the quantum memory in both subfigures is given by ρ, whereas the input state of the spectator system in subfigure (b) is ψ. In the latter case, the state of the spectator is recycled back to ψ after every recovery cycle, via a discarding and preparation channel. The final output states I

Lemma 4. Given the entanglement fidelities $F^{i}_{e}$ of the single-cycle recovery protocols at each time step $\Delta t_R = t_{i+1} - t_i$, the n-cycle entanglement fidelity $F^{1\to n}_{e}$ of the multi-cycle recovery protocol is bounded from above by the (n − 1)-cycle entanglement fidelity F

Figure 9: Subplots show the dependence of the accumulated n-cycle entanglement fidelity F 1→n e on the value of the noise parameter θn at the n-th cycle for the [4,1] code of the amplitude-damping channel.The spectator system is taken to have the simplest characteristic parameters (γ = 1, m = 1).The colored regions indicate allowed values for the entanglement fidelity.The blue color refers to the case of perfect knowledge of θn and the orange color to the lack of that knowledge.From top to bottom, the value of the accumulated (n − 1)-cycle entanglement fidelity F 1→(n−1) e is picked to be (a) 0.99, (b) 0.97, and (c) 0.95, respectively.

Figure 10: Comparison of the maximum performance of spectator-based recovery (subject to varying noise parameter θ) with various recovery optimization approaches to the [4,1] code of the amplitude-damping channel (with fixed noise parameter θ). We consider the "worst case" spectator parameters (γ = 1, m = 1). Shown are the performances of the well-known approximate QEC code in Leung et al. [13], its channel-adapted version by Fletcher et al. [12], its SDP-optimized version by Fletcher et al. [14], its stabilizer-based version [12,81], and the incomplete knowledge extension of the channel-adapted QEC in [12]. Here, the difference between the "channel-adapted" and "incomplete knowledge" entanglement fidelities showcases the fundamental metrological cost of operating a real-time quantum memory. All other (γ ≥ 1, m ≥ 1) spectator-based recoveries lie above the "incomplete knowledge" graph.
over the uniform Haar measure on the set of d × d unitary matrices U(d) is given by the depolarizing channel