Resource Preservability

Resource theory is a general, model-independent approach aiming to understand the qualitative notion of resource quantitatively. In a given resource theory, free operations are physical processes that do not create the resource and are considered zero-cost. This brings the following natural question: For a given free operation, what is its ability to preserve a resource? We axiomatically formulate this ability as the resource preservability, which is constructed as a channel resource theory induced by a state resource theory. We provide two general classes of resource preservability monotones: One is based on state resource monotones, and another is based on channel distance measures. Specifically, the latter gives the robustness monotone, which has been recently found to have an operational interpretation. As examples, we show that athermality preservability of a Gibbs-preserving channel can be related to the smallest bath size needed to thermalize all its outputs, and it also bounds the capacity of a classical communication scenario under certain thermodynamic constraints. We further apply our theory to the study of entanglement preserving local thermalization (EPLT) and provide a new family of EPLT which admits arbitrarily small nonzero entanglement preservability and free entanglement preservation at the same time. Our results give the first systematic and general formulation of the resource preservation character of free operations.

One important ingredient in a resource theory is the set of allowed physical processes that do not create the resource, which are called free operations. An ultimate goal of a resource theory is to identify the conditions under which one quantity can be transformed into another via free operations. A proper answer can tell us how resourceful the output quantities can be after free operations, giving useful information for both theoretical and practical purposes. This is conceptually related to a channel's ability to preserve a resource, a phenomenon that lacks a quantitative understanding. To be precise, we say a channel preserves a resource if it does not completely destroy the resource for every input. In other words, it can partially degrade the resource, but there must be certain output states that are resourceful (see Fig. 1). This motivates us to ask the following question: Given a free operation, how can we quantify its ability to preserve the given resource?
In other words, we are asking for a quantitative study of the qualitative behavior (i.e. the ability to preserve the resource) of free operations, which can be interpreted as a resource theory inherited from the given resource theory. With a rigorous answer, one will be able to identify the efficiency of the given free operation to protect the resource, which will clarify the fundamental structure of free operations in a general resource theory. This question is also motivated by other purposes: For example, a suitable measure of the ability of a given dynamics to preserve entanglement can provide new insights to the study of the interplay between entanglement and thermalization [63]. Also, some previous results have addressed similar issues for entanglement [64], while a general treatment for free operations with arbitrary state resources is still unknown.
In this work, we axiomatically formulate the ability of free operations to preserve a resource of quantum states. This ability, termed resource preservability, is formulated as a channel resource theory induced by the given state resource theory. We provide general assumptions of the formulation, discussing the corresponding free operation, and introducing axioms on the resource preservability monotones.
Two classes of resource preservability monotones are provided. One is induced by the resource monotones of the given state resource theory; the intuition is that it quantifies the resource maintained during the process. The other is based on the channel distance from the set of free operations that destroy the resource. Moreover, the class based on channel distance induces a robustness-like monotone, whose operational interpretation as the erasure cost of resource preservability follows from Ref. [54].
As an example, we consider the resource theory of athermality and show that a resource preservability monotone of a given Gibbs-preserving channel is related to the smallest bath size needed to thermalize all its output states. We further show that the robustness-like monotone serves as an upper bound on the classical capacity of a classical communication scenario subject to certain thermodynamic constraints. These results connect thermodynamics and classical communication to our current study. As another application, we apply our theory to the study of entanglement preserving local thermalizations (EPLTs) [63], which are local operations plus shared randomness channels that locally thermalize subsystems for arbitrary inputs while keeping the global entanglement for certain inputs. We show that EPLTs can simultaneously admit arbitrarily small entanglement preservability at finite temperatures and preservation of free entanglement [65]. This reveals that EPLT is a concept compatible with an arbitrarily small ability to preserve entanglement, and that such an EPLT can still preserve distillable entanglement.
This work is structured as follows. We start with the basic notions of a general state resource theory and the general setup of resource preservability in Sec. 2. After the formal setup, we formulate free super-channels in Sec. 3, and in Sec. 4 we axiomatically introduce resource preservability monotones. In Sec. 5, we consider examples with the resource theory of athermality and apply the theory of resource preservability to EPLT. Finally, we conclude in Sec. 6.

Setup and Assumptions
A resource theory of quantum states, or simply a state resource theory, can be understood as a combination of the following three ingredients: The resource itself (denoted by R), states without the resource (the free states; denote the set of all free states by F R ), and channels that can be applied freely and cannot create the resource (the free operations; denote the set of all free operations by O R ). Hence, a state resource theory can be written as the triplet (R, F R , O R ). A channel resource theory can be defined in a similar way with a state resource theory by replacing states by channels, and the corresponding free operations (O R ) will be super-channels [66,67].
In this work, the only class of channel resource theories will be the one of resource preservability induced by different state resource theories. Hence, for convenience, from now on R-theory means the resource theory of the given resource R of quantum states. The corresponding channel resource theory of resource preservability (abbreviated as R-preservability) will be called an R-preservability theory.
To formulate R-preservability as a channel resource theory inherited from a given R-theory, the first step is to identify the free channels. To this end, we consider free operations of the given R-theory that cannot preserve the resource for any input. Channels of this kind will be called resource-annihilating channels (abbreviated as R-annihilating channels), a name inspired by that of the entanglement-annihilating channel [64]. This set gives the free channels of the R-preservability theory. In view of this notion, every element in O R \ O N This property forbids any possibility to activate the R-preservability. This is, however, not true due to the existence of activation properties of certain resources [68][69][70][71][72]. More precisely, in Appendix A we show that in some R-theories, one can construct a free operation T ∈ O N R such that T^{⊗k} ∉ O N R for some integer k > 0. This means that if we want to formulate R-preservability theory in a general way applicable to different R-theories, we need to respect certain properties such as the activation of the R-preservability. To impose basic assumptions on R-theory, we first need the following concept:
Definition 1. (Absolutely Free State) A free state η is said to be an absolutely free state for the given R-theory if We denote the set of all absolutely free states by F R .
In other words, absolutely free states are those without hidden resource [71,72]. For example, in the R-theory of entanglement, all the separable states are absolutely free states. However, as we have mentioned, there also exist R-theories with states that are not absolutely free: This can be seen by the superactivation of nonlocality [68] and steering [69,70]. We remark that F R is closed under tensor product; that is, η 1 With the above notion, we consider R-theories with the following properties in this work: (R2) Identity and partial trace are free operations.
(R4) Free operations are closed under tensor products, convex sums, and compositions:
Let us briefly comment on the above properties. We assume property (R1) because we aim to study R-preservability, which is a comparison of resourceless states and resourceful states. Also, we expect that genuinely resourceless states exist and that convex sums of resourceless states are not resourceful, which are common features shared by many R-theories. Property (R2) is assumed because in an R-theory, the identity map and the partial trace can never increase the amount of resource and will usually fulfill the other conditions of a free operation: Conceptually, it means "doing nothing" and "ignoring part of the system" are both free and costless. Property (R3) makes sure the resource content will not increase after an extension with an absolutely free state η. Property (R4) is imposed because, for two channels that cannot create the resource, we expect that their simultaneous application (tensor product), classical mixture (convex sum), and sequential application (composition) still cannot create the resource. As expected, it is a common property possessed by many choices of free operations in R-theories such as the ones of entanglement [1], nonlocality [17,49], and athermality [23,26] (note that there do exist examples that do not satisfy this property²). This also implies that in this work the set O N R is always convex. Before the formulation of R-preservability, it is important to introduce the following analogous concept of absolutely free states for channels.
We denote the set of all such channels by O N R .
2 To see a counterexample, consider the R-theory of nonlocality with the nonlocality non-generating channels as free operations. Suppose $\rho_0$ is a local state such that $\rho_0^{\otimes 2}$ is nonlocal [68]. Then the state preparation channel $\Phi_{\rho_0}: (\cdot) \mapsto \rho_0$ is a nonlocality non-generating channel, while $\Phi_{\rho_0} \otimes \Phi_{\rho_0}$ will always have a nonlocal output, thereby being able to generate nonlocality. This again shows that the activation property may lead to unexpected results.
Accepted in Quantum 2020-03-18, click title to verify
This definition means the R-preservability of absolutely R-annihilating channels cannot be activated. As an example of an absolutely R-annihilating channel, consider again the R-theory of entanglement. Then every local operation and classical communication (LOCC) channel that is entanglement-annihilating [64] and entanglement-breaking [73] is an absolutely R-annihilating channel (see Appendix B for the detailed explanation).
We also remark the following facts for a given R-theory: According to the first line in Eq. (4), a sequential application of free operations cannot preserve any resource (even with the assistance of ancillary R-annihilating channels) if one has already added one absolutely R-annihilating channel in the sequence. Also, since absolutely R-annihilating channels forbid activation, the second line in Eq. (4) means simultaneous applications of two such channels still do not allow activation.
Before introducing the main results, we specify notations. In this work we suppress the dependence of the notations O N R and O R on the system size. To emphasize the contrast between the main systems and ancillary systems, we use subscripts S, S′ for the main systems and A, B for the ancillary systems. When only a bipartition needs to be addressed, we use the common notations A, B for subsystems. The meaning of subscripts will be clear from the context.

Free Operation of Resource Preservability
To specify the free operation of R-preservability, we first need to know how to map a channel in O R into another channel in O R . The general structure of such a mapping (which maps channels to channels) is shown to take the following form [66]: $\mathcal{E} \mapsto \mathcal{M} \circ (\mathcal{E} \otimes \mathcal{I}_A) \circ \mathcal{N}$ (5), where A stands for the ancillary system, and $\mathcal{M}, \mathcal{N}$ are some quantum channels. Such mappings are called super-channels [36,54,66,67]. One potential way to introduce free operations of R-preservability, or simply free super-channels, is to consider all super-channels that will not increase R-preservability. This gives the largest possible set of free super-channels, but it may not always have an intuitive and clear physical interpretation (see Ref. [59] for an exception). Also, whether all such mappings can always map elements of O R into O R is still unclear³. Hence, in this work we prefer a different approach: We impose conditions on Eq. (5) and focus on free super-channels with clear physical meanings.
To this end, we interpret Eq. (5) as a three-step process consisting of a pre-processing (N ), an ancillary process (I A ), and a post-processing (M). The first condition to be imposed is that free super-channels should be realized freely in the given R-theory, since we expect them to be implementable without the assistance of the resource R. This suggests that all steps in Eq. (5) should be free operations of the given R-theory; that is, N , M ∈ O R . The second condition to be imposed is that free super-channels cannot create R-preservability. However, since identity map has the best R-preservability, this may fail if one uses identity map for the ancillary process in Eq. (5). This suggests that the ancillary system should perform certain processes to ensure it is impossible to create R-preservability. Concerning the existence of activation properties discussed in Appendix A, we ask the ancillary system to perform only absolutely R-annihilating channels. The above discussions motivate us to consider the following notion as the free operation of an R-preservability theory in this work:

Definition 3. (Free Super-Channel of R-Preservability)
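In this work, the free operation of R-preservability, or say the free super-channel F : O R → O R , is of the form $F(\mathcal{E}_S) := \Lambda_+ \circ (\mathcal{E}_S \otimes \Lambda_A) \circ \Lambda_-$ (6), where $\Lambda_+, \Lambda_-$ ∈ O R are free operations of the R-theory and $\Lambda_A$ ∈ O N R is an absolutely R-annihilating channel.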
For the generality of the R-preservability theory, we allow different input/output dimensions of the free super-channels 4 , which means the R-preservability of the given channel on the main system S may be assisted by channels acting on ancillary systems, while the ancillary channels need to obey the rules: They cannot provide additional R-preservability, and they cannot be assisted by the given state resource R.
Note that if one simply assumes Λ + , Λ − to possess zero R-preservability, then the output will only be R-annihilating channels. Hence, we allow Λ + , Λ − to be arbitrary free operations. Also, we This ensures that Eq. (6) is a suitable free operation even with the activation property of R-preservability (Appendix A).
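The three-step structure described above (a pre-processing, the given channel acting in parallel with an ancillary channel, and a post-processing) can be sketched in a purely classical toy setting, where channels are column-stochastic matrices. This is only an illustration under stated assumptions: the helper names `thermal_distribution`, `matmul`, `kron`, and `apply_super_channel`, the two-level energies, and the choice of a full thermalization as the ancillary channel are hypothetical and not taken from this work, and classical stochastic maps are merely a stand-in for general quantum channels.

```python
import math

def thermal_distribution(energies, beta):
    """Gibbs weights exp(-beta*E_i)/Z for a diagonal Hamiltonian (units with k_B = 1)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def matmul(a, b):
    """Product of two matrices stored as lists of rows (composition a after b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product: the parallel application of two stochastic maps."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def apply_super_channel(post, channel, ancilla_channel, pre):
    """Assumed three-step form: post o (channel (x) ancilla_channel) o pre."""
    return matmul(post, matmul(kron(channel, ancilla_channel), pre))

# Illustrative two-level system and ancilla with the same thermal distribution.
gamma = thermal_distribution([0.0, 1.0], beta=1.0)
d = len(gamma)

# Ancillary channel: full thermalization, every input is mapped to gamma.
thermalize = [[gamma[i] for _ in range(d)] for i in range(d)]

# Given channel on S: a partial thermalization with mixing weight p.
p = 0.3
channel = [[(1 - p) * (1.0 if i == j else 0.0) + p * gamma[i]
            for j in range(d)] for i in range(d)]

# Pre-processing S -> SA: append an ancilla prepared in gamma (row index s*d + a).
pre = [[(gamma[a] if s == s_in else 0.0) for s_in in range(d)]
       for s in range(d) for a in range(d)]

# Post-processing SA -> S: discard the ancilla (classical partial trace).
post = [[(1.0 if s_out == s else 0.0) for s in range(d) for a in range(d)]
        for s_out in range(d)]

result = apply_super_channel(post, channel, thermalize, pre)
```

In this toy example the pre- and post-processing both map thermal distributions to thermal distributions, and the ancilla only ever carries the thermal distribution, so the output `result` coincides with the input `channel` up to floating-point error: the ancillary branch contributes nothing to the channel on the main system.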

Resource Preservability Monotone
An important feature of a resource theory is that it provides a way to quantify the resource [36]. Let Q be the set of all states or all channels. Then a resource monotone of the given resource R is a function Q R : Q → [0, ∞] satisfying properties (M1) and (M2): It is called convex if it also satisfies property (M3), and it is called faithful if it also satisfies property (M4). To avoid the trivial case, we always assume Q R (q) > 0 for some q in this work. With the above notions, we are now in a position to introduce the R-preservability monotones.

Definition 4. (Resource Preservability Monotone)
In an R-preservability theory, an R-preservability monotone P R is a channel resource monotone satisfying the following additional property: and the equality holds if E ∈ O N R .
This additional property illustrates the basic expectation of a good quantifier of R-preservability: R-preservability will not decrease under tensor product, and it will not increase under tensor product with absolutely R-annihilating channels. Note again that we do not impose the property because of the existence of the activation property discussed in Appendix A. It is still possible for an R-preservability monotone to satisfy this property, which simply means that the monotone cannot witness activated R-preservability.
We introduce two classes of R-preservability monotones, whose underlying intuitions are stated as follows:
• Interpret R-preservability as the ability to maintain resource during the operation.
• Interpret R-preservability as the channel distance from the set of R-annihilating channels.
While they originate from different concepts, in the following sections we will show that both of them admit R-preservability monotones.

Resource Preservability Monotone: The Maintained Resource
For a given resourceful state ρ and a given state resource monotone Q R , an intuitive way to quantify the ability of a free operation E S to preserve the resource R of ρ is to compare the difference between Q R (ρ) . This proposes the following general candidate induced by Q R : (we use subscript to denote the corresponding subsystems) where f is a finite-valued strictly increasing function with f (0) = 0, g is a non-decreasing function satisfying g −1 ({0}) ⊆ {0} [this means the only x that may achieve g(x) = 0 is x = 0]. Here we use the following abbreviation: where the maximization is taken over all possible finite dimensional ancillary systems A, all absolutely R-annihilating channels Λ A ∈ O N R on the ancillary system A, and all states ρ SA on the composite system SA. In the maximization we allow the ancillary system to have zero dimension, corresponding to the original system S. We stress that the maximization in Eq. (8) is restricted to ρ SA achieving non-zero Q R values. This makes sure the value is always finite.
The idea behind Eq. (8) is to consider a general ratio between the input and the output of the given free operation. By considering particular combinations of f and g, we have the following candidates: The first one can be interpreted as the optimal maintained resource during the process E, and the second one can be understood as the optimal remaining amount of resource at the end of the process E. Note that we do not use the identity map I A for the ancillary systems in the above definition. This is because the identity channel is the most resourceful channel, and extending the ancillary system with it may create "artificial R-preservability". For example, if one uses the identity for the ancillary systems in the R-theory of entanglement, then one will have non-zero R-preservability for entanglement-annihilating channels that are not entanglement-breaking [64]. Merely using R-annihilating channels O N R for the extension is still not enough due to the existence of the activation property (Appendix A). This explains the need to introduce absolutely R-annihilating channels.
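The "remaining amount of resource at the end of the process" can be made concrete in a simplified classical setting. The sketch below is an assumption-laden illustration rather than the quantity defined in this work: it takes athermality as the resource, uses the max-relative entropy to the thermal distribution as the state monotone, restricts to diagonal (classical) states, and ignores ancillary systems entirely. The function names and the two-level energies are hypothetical. Because the classical max-relative entropy to a fixed distribution is quasi-convex in its first argument, the maximum over input distributions is attained at a deterministic input, so it suffices to scan the columns of the stochastic matrix.

```python
import math

def thermal_distribution(energies, beta):
    """Gibbs weights exp(-beta*E_i)/Z for a diagonal Hamiltonian (units with k_B = 1)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def d_max_classical(p, q):
    """log2 of the largest ratio p_i/q_i, i.e. D_max for commuting (diagonal) states."""
    ratio = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # support of p is not contained in support of q
            ratio = max(ratio, pi / qi)
    return math.log2(ratio)

def remaining_athermality(channel, gamma):
    """Largest D_max(output || gamma) over deterministic inputs (columns of the matrix)."""
    best = 0.0
    for j in range(len(channel[0])):
        column = [row[j] for row in channel]
        best = max(best, d_max_classical(column, gamma))
    return best

def partial_thermalization(gamma, p):
    """Column-stochastic matrix (1-p)*identity + p*(full thermalization to gamma)."""
    d = len(gamma)
    return [[(1 - p) * (1.0 if i == j else 0.0) + p * gamma[i]
             for j in range(d)] for i in range(d)]

gamma = thermal_distribution([0.0, 1.0], beta=1.0)
value_identity = remaining_athermality(partial_thermalization(gamma, 0.0), gamma)
value_partial = remaining_athermality(partial_thermalization(gamma, 0.5), gamma)
value_thermal = remaining_athermality(partial_thermalization(gamma, 1.0), gamma)
```

Under these assumptions the full thermalization leaves no athermality, the identity leaves the most, and the partial thermalization sits strictly in between, which matches the intuition that stronger thermalization corresponds to weaker preservability.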
We now present the first main result, whose proof is given in Appendix C. Recall that R-theory represents a state resource theory with resource R.
Theorem 1. Given an R-theory and a state resource monotone Q R . Then P (f,g)
As a remark, the assumption F R ≠ ∅ is only used in the proof of Eq. (7), and this assumption can be dropped when g is a positive constant. We state this special case in Corollary C.1. Also, it will be an interesting topic for future research to study specific operational interpretations of different combinations of f, g with different R-theories.

Resource Preservability Monotone: The Channel Distance
One intuitive way to quantify a resource is to consider the distance from the set of objects without the resource. Here we interpret R-preservability in a similar way. To this end, we consider a general distance measure on states defined as a function $D: S \times S \to [0, \infty]$ satisfying $D(\rho, \sigma) \ge 0$, with equality if and only if $\rho = \sigma$ (S is the set of quantum states). Now, we introduce the following candidates induced by D to quantify R-preservability: Again, we use the abbreviation introduced in Eq. (9), and $\sup_{A;\rho_{SA}}$ means the maximization taken over all the ancillary systems A and the states $\rho_{SA}$ on SA. Note that unlike the previous section, since now we only compare the distance between two channels, using the identity to extend the system is allowed, and this is the reason why we list two candidates here. Before introducing the main result, we say a set A is closed under the distance measure D if for every sequence $\{\Lambda_k\}_{k=1}^{\infty} \subseteq A$ satisfying $\lim_{k\to\infty} \sup_{\rho} D[\mathcal{E}(\rho), \Lambda_k(\rho)] = 0$, we have $\mathcal{E} \in A$. We now provide the following result, whose proof is given in Appendix D.

Theorem 2.
Given an R-theory and a distance measure D satisfying the property Note that Eq. (13) is a relaxed version of the data-processing inequality. As a remark, Eq. (13) and condition (R4) imply the ordering $P_D \le \bar{P}_D$.

Resource Preservability Monotone: The Robustness
We will provide a detailed example in this section to illustrate Theorem 2. In short, with a specific distance measure, a robustness-like monotone can be obtained. To start with, consider the max-relative entropy defined by [74]: $D_{\max}(\rho \| \sigma) := \log_2 \min\{\lambda \ge 0 \mid \rho \le \lambda\sigma\}$, where the minimization is taken over all non-negative real numbers λ, and in this work we always consider the logarithm to the base 2. $D_{\max}$ fulfills [74] (1) $D_{\max}(\rho \| \sigma) \ge 0$, with equality if and only if $\rho = \sigma$, and (2) (data-processing inequality) $D_{\max}[\mathcal{E}(\rho) \| \mathcal{E}(\sigma)] \le D_{\max}(\rho \| \sigma)$ for all channels $\mathcal{E}$ and states $\rho, \sigma$. Hence, it satisfies Eq. (13). Theorem 2 means $P_{D_{\max}}$ and $\bar{P}_{D_{\max}}$ are both R-preservability monotones, and they are faithful if O N R is closed under $D_{\max}$.
It turns out that this fact implies a direct robustness form and the corresponding operational interpretation based on Ref. [54]. To see this, define the R-preservability log-robustness according to Ref. [54]: where the optimization is taken over all channels C. This quantity depicts how robust the R-preservability of E is when it is interrupted by another channel. From Ref. [54] we learn that $P_{D_{\max}} = L_R$. This means $\bar{P}_{D_{\max}}$ may have the same operational interpretation as $L_R$. To formally illustrate this, we now translate Definition 9 in Ref. [54] into the following version for R-preservability (in what follows, the diamond norm is defined by $\|\mathcal{E}\|_\diamond := \sup_{A;\rho_{SA}} \|(\mathcal{E} \otimes \mathcal{I}_A)(\rho_{SA})\|_1$, where the maximization is taken over all ancillary systems A and states $\rho_{SA}$ on the system SA, and $\|\rho\|_1 := \mathrm{tr}|\rho| = \mathrm{tr}\sqrt{\rho^\dagger \rho}$ is the trace norm):
The ϵ-destruction cost for R-preservability is defined by $C_R(\mathcal{E}_S) := \log_2 \min k$, where the minimization is taken over all ϵ-destruction processes of R-preservability for $\mathcal{E}_S$.
Definition 5 is slightly different from Definition 9 in Ref. [54]. In Ref. [54], U i and V i are asked to be free channels, which would correspond to O N R in our current study. Since this would always lead to zero R-preservability for the output channel, we relax this condition in this work. Also, we require the ancillary channel Λ S to be an absolutely R-annihilating channel.
To state the result, we also consider the smooth versions of $P_{D_{\max}}$ and $\bar{P}_{D_{\max}}$ [54]: Now we state the following result when the given R-theory admits no activation of R-preservability.
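For intuition about the max-relative entropy used in this section, the following Python sketch evaluates it in the special case where both states are diagonal in a common basis, so that the operator inequality reduces to an entrywise comparison of probability vectors. This restriction to commuting states is an assumption made for illustration, and the function names and the example stochastic matrix are hypothetical. The last lines check the data-processing inequality on one example.

```python
import math

def d_max_classical(p, q):
    """D_max(p || q) = log2 min{lam : p_i <= lam * q_i for all i}, for diagonal states."""
    ratio = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # no finite lam works if p has weight outside supp(q)
            ratio = max(ratio, pi / qi)
    return math.log2(ratio)

def apply_stochastic(matrix, p):
    """Apply a column-stochastic matrix (a classical channel) to a probability vector."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in matrix]

# Example: a deterministic state against the uniform distribution.
p = [1.0, 0.0]
q = [0.5, 0.5]
before = d_max_classical(p, q)

# A noisy classical channel; each column sums to one.
noisy = [[0.9, 0.2],
         [0.1, 0.8]]
after = d_max_classical(apply_stochastic(noisy, p), apply_stochastic(noisy, q))
```

Here `before` equals one bit, since the smallest λ with p ≤ λq is 2, and `after` is no larger than `before`, as the data-processing inequality requires.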
We note that although we write it as a theorem, conceptually this result is a corollary of Theorem 10 in Ref. [54]. We give the proof in Appendix E for the completeness of this work.

Theorem 3.
Given an R-theory satisfying the following three conditions: Then for a given E ∈ O R and for any 0 < η ≤ ϵ < 1, we have
Theorem 3 provides a clear operational meaning of $\bar{P}_{D_{\max}}(\mathcal{E})$: It shows how robust the R-preservability of the given free operation E is when it is randomized over reversible free unitary operations together with an ancillary absolutely R-annihilating channel. This can also be interpreted as the erasure cost of R-preservability. Note that we assume no activation property of R-preservability. When the given R-preservability can be activated, the lower bounds in Theorem 3 can still be proved, while it is so far unclear whether the upper bound can also be obtained.

Applications To Thermodynamics
After introducing the general framework, one may ask for specific examples to illustrate R-preservability. In particular, a natural question is whether there is any application. These issues will be addressed in the following two sections with a focus on thermodynamics. We remark that detailed studies of coherence preservability have been reported recently in Ref. [75].

Thermodynamic Implications of Athermality Preservability
We will give two examples by considering the R-theory of athermality with Gibbs-preserving maps as the free operations. It turns out that the R-preservability monotones of athermality (or simply athermality preservability) given by Eqs. (11) and (12) with the max-relative entropy $D_{\max}$ can be directly related to two recently reported results: For a given Gibbs-preserving channel $\mathcal{N}$, $P_{D_{\max}}(\mathcal{N})$ [Eq. (11)] is operationally related to the smallest bath size needed to thermalize all outputs of $\mathcal{N}$ [76], and $\bar{P}_{D_{\max}}(\mathcal{N})$ [Eq. (12)] is an upper bound on the single-shot classical capacity of $\mathcal{N}$ in a classical communication scenario subject to thermodynamic constraints [59]. These illustrate how R-preservability can be related to existing results and provide new physical messages.
To start with, we define the R-theory of athermality. The term "athermality" means the status that a system is out of thermal equilibrium. With a fixed system size, the only state without this resource is the unique thermal state. Formally, consider a given system S with dimension d.
Suppose the system Hamiltonian is $H_S$ and a temperature T is also given. Then the corresponding thermal state reads $\gamma = e^{-\beta H_S} / \mathrm{tr}(e^{-\beta H_S})$, where $\beta = \frac{1}{k_B T}$ is the inverse temperature and $k_B$ is the Boltzmann constant. In this R-theory, the set of free states is $\{\gamma^{\otimes k} \mid k \in \mathbb{N}\}$, where we only consider dimensions of the form $d^k$ with some positive integer k. The free operations will be the Gibbs-preserving maps, which are channels $\mathcal{E}$ satisfying $\mathcal{E}(\gamma^{\otimes k}) = \gamma^{\otimes l}$ for some $k, l \in \mathbb{N}$ (note that k, l are uniquely determined by the input/output dimensions). Intuitively, these channels are those which cannot drive thermal equilibrium states out of equilibrium, thereby being unable to create athermality. Equipped with Gibbs-preserving maps, the corresponding R-theory will satisfy properties (R1), (R2), (R3), and (R4).
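The thermal state and the Gibbs-preserving condition can be checked numerically in the simplest case of a Hamiltonian that is diagonal in the working basis and a channel that is a classical stochastic map. The sketch below makes these simplifying assumptions explicit; the function names, the two-level spectrum, and the choice of units with k_B = 1 are hypothetical and chosen only for illustration.

```python
import math

def thermal_distribution(energies, beta):
    """Diagonal of gamma = exp(-beta*H)/tr(exp(-beta*H)) for a diagonal Hamiltonian H."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function
    return [w / z for w in weights]

def apply_stochastic(matrix, p):
    """Apply a column-stochastic matrix (a classical channel) to a probability vector."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in matrix]

def is_gibbs_preserving(matrix, gamma, tol=1e-12):
    """Check that the thermal distribution is a fixed point of the stochastic map."""
    out = apply_stochastic(matrix, gamma)
    return all(abs(a - b) <= tol for a, b in zip(out, gamma))

# Two-level system with energies 0 and 1 at inverse temperature beta = 1.
gamma = thermal_distribution([0.0, 1.0], beta=1.0)

# Full thermalization: every input is replaced by gamma.
full_thermalization = [[gamma[0], gamma[0]],
                       [gamma[1], gamma[1]]]

# Swapping the two levels moves population toward the excited level.
level_swap = [[0.0, 1.0],
              [1.0, 0.0]]

thermalization_is_free = is_gibbs_preserving(full_thermalization, gamma)
swap_is_free = is_gibbs_preserving(level_swap, gamma)
```

In this toy example the full thermalization passes the check, while the level swap does not, because it maps the thermal distribution to one with inverted populations.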

Athermality Preservability And Bath Size
As the first example, we will demonstrate that athermality preservability of a Gibbs-preserving channel can be naturally linked to the bath size needed to thermalize a system. To this end, we use the framework and a thermalization model introduced by Ref. [76]. We will briefly explain the ingredients relevant to this work, and we refer the readers to Ref. [76] for further details.
We begin by specifying the system and the bath in the thermalization scenario. Consider a system S with Hilbert space $\mathcal{H}_S$ and a bath B with Hilbert space $\mathcal{H}_S^{\otimes(n-1)}$ ($n \in \mathbb{N}$). The bath is assumed to possess the temperature T and the Hamiltonian where $H_S$ is the Hamiltonian of the given system S. Let γ be the thermal state associated with T and $H_S$. Then we assume the bath is initially in the state $\gamma^{\otimes(n-1)}$. The central question is to study how large the bath needs to be in order to successfully thermalize the system S in a given state $\rho_S$.
To this end, a global channel Here the aim is to study thermalization of a fixed and given input state in the sense that the channel will globally thermalize the system SB. To model the system-bath interaction for thermalization, we consider the following master equation introduced by Ref. [76]: where ρ SB (t) is the state on the global system SB at time t, U SB models an elastic collision between certain subsystems in SB. We refer the reader to Ref. [76] for the detailed framework.
Now, let C n be the set of all channels acting as SB → SB that can be generated by the model Eq. (22) with a bath of size n−1 and a realization time t. Then consider the following quantity [76]: This quantity can be interpreted as the smallest bath size needed to ϵ-thermalize the given state ρ S with the thermalization model Eq. (22). It turns out that this concept can be generalized to channels. Define which is the maximization over all the smallest bath sizes among all outputs of N (note that here we only consider a channel N with the output space S, which is the main system admitting the given thermal state γ). Then this can be interpreted as the smallest bath size needed to ϵ-thermalize all outputs of N under the given thermalization model. One can therefore understand this quantity as the minimal bath size associated with the channel N . Now we are in a position to provide the following bounds, whose proof is given in Appendix F. This can be regarded as a generalization of the main results of Ref. [76] to channels:
Theorem 4. Given a Gibbs-preserving map N and 0 ≤ ϵ < 1, we have Moreover, if we further assume that γ is full-rank, N is coherence-annihilating, and the system Hamiltonian H S satisfies the energy subspace condition (Definition F.1), then we have where p min (γ) is the smallest eigenvalue of γ.
We remark that by saying "coherence-annihilating" we mean the channel can only output states diagonal in the given energy eigenbasis (i.e. no coherence can survive). This requirement plus the energy subspace condition (Definition F.1) are necessary for the proof of the lower bound derived in Ref. [76] (specifically, it is required by Lemma 17 in the Appendix C of Ref. [76]). An open question in this research line is whether one can derive a similar lower bound without these constraints.
As expected, for a Gibbs-preserving channel N , the quantity B (N ) can be understood as a measure of the robustness of the channel N against thermalization. From the upper bound, we learn that the weaker the channel's ability to preserve athermality is, the smaller a heat bath needs to be to thermalize every output of N . Theorem 4 builds a link between the ability to preserve athermality and the resource needed to thermalize all the outputs of a given channel, and it also gives athermality preservability a different thermodynamic interpretation.

Athermality Preservability And Classical Communication
It turns out that the robustness-like monotone $\bar{P}_{D_{\max}}$ can be related to a classical communication scenario subject to certain thermodynamic constraints, which is the second example in this section. To start with, consider the communication scenario in which we want to send classical information (in terms of classical bits, which can be written as an orthonormal basis $\{|m\rangle\}_{m=0}^{M-1}$) via a channel $\mathcal{N}$. Again, we always assume the channel $\mathcal{N}$ has the output space S, which is associated with a thermal state γ and hence an R-theory of athermality. Here are two constraints for this communication setup:
• The whole process (including encoding and decoding) cannot create athermality.
In other words, this means the corresponding physical system used to implement the channel and transmit the classical information can only get closer and closer to thermal equilibrium.
• The dynamics of N has a time scale much longer than the thermalization time scale in the environment.
Theoretically, this motivates us to approximate the dynamics of the surrounding system A by the full thermalization $\Phi_{\gamma_A}: (\cdot) \mapsto \gamma_A$, where $\gamma_A$ is the thermal state of the environment (that is, $\gamma_A = \gamma^{\otimes k}$ for a positive integer k). Together with the non-signalling constraints discussed in Ref. [59] (whose framework is briefly introduced in Appendix G), we have the following theoretical model for this communication scenario: where $\mathcal{E}_e, \mathcal{E}_d, \mathcal{N}$ are all Gibbs-preserving maps ($\mathcal{E}_e$ and $\mathcal{E}_d$ can be interpreted as the encoding and decoding maps, respectively), and in the ancillary system A there is a full thermalization channel $\Phi_{\gamma_A}$. We call this a thermalized classical communication scenario, which is identical to the communication scenario given in Ref. [59] (see also Appendix G) subject to the thermodynamic constraints given above. The central goal is to understand how much classical information can be sent within a given error. Formally, the classical information is indicated by an orthonormal basis $\{|m\rangle\}_{m=0}^{M-1}$, and we are interested in how many basis elements can be recovered at the end of the whole process. To this end, we use the averaged error $\varepsilon(\mathcal{N}, \mathcal{E}_e, \mathcal{E}_d, \gamma_A)$ of a given combination $(\mathcal{N}, \mathcal{E}_e, \mathcal{E}_d, \gamma_A)$ to evaluate the faithfulness of the output: We can now define the following single-shot classical capacity with an error 0 < ϵ < 1: where the maximization is taken over all the possible Gibbs-preserving channels $\mathcal{E}_e, \mathcal{E}_d$ and ancillary systems A. This quantity tells us the optimal performance of the channel $\mathcal{N}$ in a thermalized classical communication scenario: It is the highest amount of classical information that can be transmitted within the given error ϵ. As a remark, in a realistic situation the size of the environment cannot be arbitrarily large. This means that in a practical setup one cannot optimize over all the possible ancillary systems, and the above quantity is in general an upper bound on the realistic capacity.
It turns out that this quantity is upper bounded by the athermality preservability monotone $\bar{P}_{D_{\max}}$. Hence, when the given channel $\mathcal{N}$ has a weak ability to maintain athermality, it cannot perform well in a thermalized classical communication scenario either. Formally, we have the following result, which is conceptually a corollary of Theorem 3 in Ref. [59]. The proof can be found in Appendix H.
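The role of the averaged error can be illustrated with a stripped-down classical version of the scenario above. The sketch below is not the model of this work: it drops the ancillary thermalization and the non-signalling structure, treats the encoder, channel, and decoder as column-stochastic matrices, and takes the averaged error to be one minus the average probability that message m is decoded as m. All names, the two-level energies, and the identity encoder and decoder are hypothetical choices for illustration.

```python
import math

def thermal_distribution(energies, beta):
    """Gibbs weights exp(-beta*E_i)/Z for a diagonal Hamiltonian (units with k_B = 1)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def matmul(a, b):
    """Product of two matrices stored as lists of rows (composition a after b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def partial_thermalization(gamma, p):
    """Column-stochastic matrix (1-p)*identity + p*(full thermalization to gamma)."""
    d = len(gamma)
    return [[(1 - p) * (1.0 if i == j else 0.0) + p * gamma[i]
             for j in range(d)] for i in range(d)]

def averaged_error(decoder, channel, encoder):
    """1 - (1/M) * sum_m Prob(decoded message = m | sent message = m)."""
    total = matmul(decoder, matmul(channel, encoder))
    m = len(total)
    return 1.0 - sum(total[i][i] for i in range(m)) / m

gamma = thermal_distribution([0.0, 1.0], beta=1.0)
identity = [[1.0, 0.0], [0.0, 1.0]]  # trivial encoder and decoder for M = 2 messages

error_noiseless = averaged_error(identity, partial_thermalization(gamma, 0.0), identity)
error_partial = averaged_error(identity, partial_thermalization(gamma, 0.4), identity)
error_thermal = averaged_error(identity, partial_thermalization(gamma, 1.0), identity)
```

With these choices the averaged error grows with the thermalization strength p and reaches one half for the full thermalization, the value obtained by guessing a two-valued message at random. This is a toy counterpart of the statement that a channel with strongly thermalized outputs cannot transmit much classical information.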

Theorem 5. For a Gibbs-preserving map
Theorem 5 gives an alternative operational interpretation of Eq. (12) in the case of athermality: It is an upper bound on the optimal performance in a thermalized classical communication scenario. Consistent with intuition, this result implies that if the given Gibbs-preserving channel has highly thermalized output states, then it can hardly keep the encoded classical messages through a scenario that cannot drive the system away from thermal equilibrium.

Application to Entanglement Preserving Local Thermalization
As another application, we apply the theory of R-preservability to the study of entanglement preserving local thermalization (EPLT) [63], which is a topic aiming to understand the interplay between globally distributed quantum correlation and locally performed thermalizations. The central question of EPLT is: Can entanglement survive subsystem thermalizations? To formulate the question, suppose an unknown input state is distributed to two local agents A and B. We assume that the agents can neither use quantum resources (e.g. sharing a maximally entangled state), nor can they communicate with each other. Both of them possess an individual local heat bath (and hence a given local thermal state γ X ; X = A, B), and we allow classical correlation between the local heat baths. When both local agents let their local systems interact with the local heat baths and thermalize, the question is whether we can have global entanglement after thermalizations are achieved locally, at least for certain input states.
The above question can be formulated information-theoretically as follows. Formally, a local operations plus shared randomness (LOSR; see Appendix A.1 for the definition) channel $\mathcal{E}$ is called a local thermalization to a pair of single-party thermal states $(\gamma_A, \gamma_B)$ if $\mathrm{tr}_B[\mathcal{E}(\rho_{AB})] = \gamma_A$ and $\mathrm{tr}_A[\mathcal{E}(\rho_{AB})] = \gamma_B$ for all input states $\rho_{AB}$. In other words, it is a full thermalization channel to $\gamma_X$ in the local system $X$ [which is different from the state-dependent definition given by Eq. (21)]. An EPLT is defined to be a local thermalization that can preserve entanglement for certain inputs; that is, it is a local thermalization with non-zero entanglement preservability. A physical message from the existence of EPLT is that when local agents couple to a global heat bath that only admits classical correlations, it is still possible for global entanglement to survive after subsystem thermalizations: Classical correlations in the bath are enough to protect entanglement from being destroyed by locally performed thermalizations.
Recently, the existence of EPLT has been proved for all nonzero local temperatures and finite-energy local Hamiltonians in a bipartite setup with the help of shared randomness$^{5}$ [63]. This means there are no temperature or energy thresholds for the existence of EPLT; that is, EPLT has no "thermodynamic threshold" to exist. A natural follow-up question is whether there is any "correlation threshold"; in other words, is it true that EPLT can exist only when its ability to preserve entanglement is strong enough? With the formulation of R-preservability in hand, we can now answer this question. As proved in Appendix I, the result we found suggests that EPLT is a phenomenon generic for different values of the entanglement preservability [note that the R-theory of entanglement with LOSR channels as free operations will satisfy properties (R1), (R2), (R3), and (R4)].

Theorem 6.
For every full-rank γ A , γ B and every δ > 0, there exists an entanglement preserving local thermalization E to (γ A , γ B ) such that P
Hence, the existence of EPLT is generic both in thermodynamic measures (temperature and energy) and in the correlation measure (entanglement preservability). Moreover, this physical message can actually be generalized to the preservability of free entanglement [65]. Before stating the main result, we specify terminologies. In what follows, the normalized temperature of the given local system X is defined by τ FE is the set of all LOSR channels that cannot preserve free entanglement [65]. Using a new family of EPLT constructed in Appendix J [Eq. (105)], we prove the following result in Appendix K (d is the common local dimension of both subsystems): where p min is the smallest eigenvalue among γ A and γ B .
For every δ > 0, there exists a finite value τ δ > 0 such that for every pair (γ A , γ B ) with min X τ X > τ δ , there exists an entanglement preserving local thermalization E − to (γ A , γ B ) such that P That is, E − can preserve free entanglement.
Hence, for arbitrarily small entanglement preservability, there always exists a finite temperature EPLT that can also preserve free entanglement. In other words, while they preserve arbitrarily little entanglement, many copies of some output can be distilled back to a maximally entangled state by LOCC channels. Hence, the conclusion that EPLT exists without a correlation threshold is the same even when we use the preservability of free entanglement as the measure.
We also remark that sincē Eq. (33) automatically implies a lower bound of the entanglement preservability. For high normalized temperatures, we have p min → 1/d and the bound in Eq. (33) becomes arbitrarily close to and T is an EPLT at infinite normalized temperature [63].

Conclusions
In a given resource theory of quantum states, we quantify the ability of free operations to preserve the resource. To this end, we formulate this ability, termed resource preservability, as a channel resource induced by the given state resource. Two classes of resource preservability monotones are proved: One is induced by state resource quantifiers, and another is based on channel distance measures. The latter also induces a robustness-like measure with operational interpretation as the erasure cost of resource preservability [54].
To illustrate the connection between resource preservability and other research directions, we consider the resource theory of athermality. We provide physical interpretations of two athermality preservability monotones induced by max-relative entropy. One has a thermodynamic interpretation directly related to the bath size needed for thermalization: The ability of a Gibbs-preserving channel to preserve athermality is physically connected to the minimal bath size needed to thermalize all its outputs. Another monotone is shown to bound the capacity of a classical communication scenario under certain thermodynamic constraints. By adding thermodynamic conditions to a general classical communication setup, the ability of the given Gibbs-preserving channel to preserve athermality tells us the highest possible amount of transmissible classical messages.
As another application, we study the entanglement preservability of entanglement preserving local thermalizations (EPLTs) [63], which form a family of local operations plus shared randomness channels that locally behave as thermalizations for arbitrary inputs, while globally having the ability to preserve certain amounts of entanglement. In this work, we show that EPLT can exist with arbitrarily small entanglement preservability for every positive temperature. Hence, EPLT's existence is independent of both temperature constraints and the ability to preserve entanglement. We further provide a new family of EPLTs that has the ability to preserve free entanglement, even though its entanglement preservability can be arbitrarily small at finite temperatures. This suggests the existence of EPLT is generic in various values of free entanglement preservability.
Several open questions remain. From the operational perspective, it will be interesting to know whether there is any operational interpretation of R-preservability monotones induced by state resource monotones introduced in Sec. 4.1. Also, the robustness-like measure introduced in Sec. 4.3 is shown to have an operational interpretation [54] when the given R-preservability theory has no activation property, while it is unknown whether this operational interpretation can still hold when the given R-preservability allows activation. Regarding the structure of channel resource theory, it is so far unknown how to characterize the largest set of free super-channels of R-preservability (since it may not always be the set of all super-channels that cannot generate R-preservability as discussed in footnote 3). Finally, it is also an open question whether one can drop the temperature dependency of entanglement preserving local thermalizations in Theorem 7. We hope this work can initiate the interest in the study of resource preservation properties in different state resource theories.
Note added. Recently, we became aware of the related work Ref. [75], which considers the preservation of coherence as a channel resource.

A Remark on the Activation Property of Resource Preservability
In this section, we provide an example of the activation property of R-preservability. Consider the R-theory of nonlocality [4,5] (and we write R = NL) on a bipartite system $SS'$ with equal finite local dimension $D$, and local operations plus shared randomness (LOSR) channels as the free operations [17,49] (in Appendix A.1 we briefly explain the reason). First, we recall a phenomenon called superactivation, which is proved for nonlocality [68] and generalized to quantum steering [69,70] (and we also mention other activation properties of nonlocality in Refs. [71,72]). Formally, a local state $\rho$ (with local dimension $D = d$) is said to admit superactivation of nonlocality if there exists a finite $k \in \mathbb{N}$ such that $\rho^{\otimes k}$ is nonlocal (in the bipartition $SS'$ and local dimension $D = d^k$). We refer the readers to Appendix A.1 for the definition of local/nonlocal states. In $SS'$ with $D = d$, it is shown that a state can demonstrate superactivation of nonlocality if its fully entangled fraction (FEF) is higher than $\frac{1}{d}$ [77], where for the given bipartite system the FEF is defined by [11,78]: $\mathcal{F}_{\max}(\rho) := \max_{|\Phi_d\rangle} \langle\Phi_d|\rho|\Phi_d\rangle$. The maximization is taken over all maximally entangled states $|\Phi_d\rangle$ on the given bipartite system $SS'$. FEF is well-known for its capacity to characterize various quantum properties [1,5,11,69,70,77-81].
To construct the example, we make use of the $(U \otimes U^*)$-twirling operation on $SS'$ defined by [82,83] $T(\cdot) := \int_{U(d)} (U \otimes U^*)(\cdot)(U \otimes U^*)^\dagger \, dU$, where the integration is taken over the group of $d \times d$ unitary operators $U(d)$ with the Haar measure $dU$. The twirling operation $T$ is by definition an LOSR channel, thereby being a free operation. It has the property to preserve entanglement: Also, the output of $T$ will always be an isotropic state [82]: $\rho_{\rm iso}(p) := p\,|\Psi_d^+\rangle\langle\Psi_d^+| + (1-p)\frac{I}{d^2}$, where $|\Psi_d^+\rangle := \frac{1}{\sqrt{d}}\sum_{n=0}^{d-1} |n\rangle \otimes |n\rangle$ is a maximally entangled state, and $p \in \left[-\frac{1}{d^2-1}, 1\right]$ due to the positivity of quantum states. Now we consider the following channel: and we choose $p$ such that the output state cannot have FEF larger than the threshold for nonlocality of isotropic states [5], while it can still have FEF larger than $\frac{1}{d}$ for certain entangled inputs. More precisely, we choose [5,84] which will guarantee the above claim. Being an LOSR channel, this means $T \in O^N_{\rm NL}$. Also, when the input state is $|\Psi_d^+\rangle$, $T(|\Psi_d^+\rangle\langle\Psi_d^+|)$ will be an entangled isotropic state, thereby having FEF $> \frac{1}{d}$ and hence admitting superactivation of nonlocality. Hence, when one considers $T^{\otimes k}$ with a large enough $k$, it is possible to output nonlocal states (on the given bipartition $SS'$ with local dimension $D = d^k$), which means $T^{\otimes k} \notin O^N_{\rm NL}$. This illustrates the existence of the superactivation property of nonlocality preservability, which also teaches us that for a general formulation, the assumption
As a remark, we note that there do exist examples without the activation property. For instance, if we use Gibbs-preserving maps as the free operations in the R-theory of athermality, then the only R-annihilating channel is the state preparation channel of the given thermal state [Eq. (20)]. Because product local thermalization cannot preserve any correlation [63], we learn that it is impossible to activate resource preservability in this case.
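As a rough numerical companion to the twirling construction, the following sketch evaluates the $(U \otimes U^*)$-twirl through its standard closed form rather than through the Haar integral: the output is the isotropic state that has the same overlap with $|\Psi_d^+\rangle$ as the input. The function names (`max_entangled`, `twirl_closed_form`) and the parametrization $p\,|\Psi_d^+\rangle\langle\Psi_d^+| + (1-p) I/d^2$ are choices made for this illustration, not notation taken from the paper.

```python
import numpy as np

def max_entangled(d):
    """Return |Psi_d^+> = (1/sqrt(d)) * sum_n |n>|n> as a vector of length d*d."""
    psi = np.zeros(d * d, dtype=complex)
    for n in range(d):
        psi[n * d + n] = 1.0 / np.sqrt(d)
    return psi

def twirl_closed_form(rho, d):
    """Closed form of the (U x U*)-twirl of a two-qudit state rho.

    The twirl keeps the overlap F = <Psi|rho|Psi> and erases everything else,
    so the output is p*|Psi><Psi| + (1-p)*I/d^2 with p = (F*d^2 - 1)/(d^2 - 1).
    Returns the output state together with p.
    """
    psi = max_entangled(d)
    proj = np.outer(psi, psi.conj())
    overlap = float(np.real(psi.conj() @ rho @ psi))
    p = (overlap * d**2 - 1.0) / (d**2 - 1.0)
    return p * proj + (1.0 - p) * np.eye(d * d) / d**2, p

def overlap_with_psi(state, d):
    """Overlap <Psi_d^+|state|Psi_d^+>; exceeding 1/d signals an entangled isotropic state."""
    psi = max_entangled(d)
    return float(np.real(psi.conj() @ state @ psi))
```

For instance, twirling the two-qubit product state $|00\rangle\langle 00|$ gives $p = 1/3$ and overlap $1/2 = 1/d$, which sits exactly on the boundary and therefore does not exceed the $1/d$ threshold discussed above.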

A.1 Local Operations Plus Shared Randomness Channels
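In this section, we briefly explain why LOSR channels can be free operations of nonlocality. It suffices to consider a bipartite system $AB$. Formally, an LOSR channel is defined to take the following form: $\mathcal{E} = \int (\mathcal{E}^A_\lambda \otimes \mathcal{E}^B_\lambda)\, p_\lambda\, d\lambda$, where the integration is taken over the variable $\lambda$, $p_\lambda$ is a probability distribution, and $\mathcal{E}^A_\lambda$, $\mathcal{E}^B_\lambda$ are local channels. In what follows we will write $\{E_{a|x}\}$ as a set of local positive operator-valued measures (POVMs) [13]; that is, for each input value $x$, the $E_{a|x}$'s form a POVM: $E_{a|x} \ge 0 \;\forall a$ and $\sum_a E_{a|x} = I_A \;\forall x$. We use the notation $\{E_{b|y}\}$ for the POVMs in the subsystem $B$.
With the above setting, a quantum state $\rho_{AB}$ is said to be local if for every local sets of POVMs $\{E_{a|x}\}$, $\{E_{b|y}\}$ one can write [4,5] $\mathrm{tr}\left[(E_{a|x} \otimes E_{b|y})\rho_{AB}\right] = \int_{\lambda \in \Lambda_{\rm LHV}} P(a|x,\lambda) P(b|y,\lambda)\, p_\lambda\, d\lambda$ (43) for some variable $\lambda$ in a set $\Lambda_{\rm LHV}$ and some probability distributions $P(a|x,\lambda)$, $P(b|y,\lambda)$, $p_\lambda$. In other words, a state is local if all the possible combinations of local POVMs cannot distinguish it from a local hidden-variable model, as depicted by $\Lambda_{\rm LHV}$. Any state that is not local is said to be nonlocal. Now we explain that LOSR channels map local states to local states. To see this, we note that for a given LOSR channel $\mathcal{E}$, we have $\mathrm{tr}\left[(E_{a|x} \otimes E_{b|y})\mathcal{E}(\rho_{AB})\right] = \int \mathrm{tr}\left\{\left[\mathcal{E}^{A,\dagger}_\lambda(E_{a|x}) \otimes \mathcal{E}^{B,\dagger}_\lambda(E_{b|y})\right]\rho_{AB}\right\} p_\lambda\, d\lambda$, where for $X = A, B$, the $\mathcal{E}^{X,\dagger}_\lambda$'s are completely-positive unital maps since the $\mathcal{E}^X_\lambda$'s are completely-positive trace-preserving maps. This means $\mathcal{E}^{A,\dagger}_\lambda(E_{a|x})$ and $\mathcal{E}^{B,\dagger}_\lambda(E_{b|y})$ again form local sets of POVMs. Since $\rho_{AB}$ is local, the quantity $\mathrm{tr}\left\{\left[\mathcal{E}^{A,\dagger}_\lambda(E_{a|x}) \otimes \mathcal{E}^{B,\dagger}_\lambda(E_{b|y})\right]\rho_{AB}\right\}$ must take the form of Eq. (43). This shows that LOSR channels map local states to local states, and hence form a suitable candidate of free operations for nonlocality.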
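The key step in the argument above, that the adjoint of a local channel sends a POVM to another POVM, can be checked numerically. The sketch below is only an illustration under assumptions of our own choosing: the local channel is a qubit amplitude-damping channel given by Kraus operators, and all function names are hypothetical.

```python
import numpy as np

def amplitude_damping_kraus(g):
    """Kraus operators of a qubit amplitude-damping channel with damping rate g in [0, 1]."""
    k0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - g)]], dtype=complex)
    k1 = np.array([[0.0, np.sqrt(g)], [0.0, 0.0]], dtype=complex)
    return [k0, k1]

def apply_channel(kraus, rho):
    """Schroedinger picture: rho -> sum_k K rho K^dagger (completely positive, trace preserving)."""
    return sum(k @ rho @ k.conj().T for k in kraus)

def adjoint_map(kraus, effect):
    """Heisenberg picture: E -> sum_k K^dagger E K (completely positive, unital)."""
    return sum(k.conj().T @ effect @ k for k in kraus)

def is_povm(effects, tol=1e-10):
    """Check that every effect is positive semidefinite and that the effects sum to the identity."""
    total = sum(effects)
    positive = all(
        np.linalg.eigvalsh((e + e.conj().T) / 2).min() > -tol for e in effects
    )
    return positive and np.allclose(total, np.eye(total.shape[0]), atol=tol)
```

Pulling a POVM back through `adjoint_map` and testing it with `is_povm` mirrors the reasoning in the text: measurement statistics on the channel output equal the statistics of the pulled-back POVM on the input, so a local hidden-variable description of the input carries over to the output.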

B Example of Absolutely Resource Annihilating Channels
Using the R-theory of entanglement, we will show that every channel that is entanglement-annihilating and entanglement-breaking will be an absolutely R-annihilating channel (and we also say it is absolutely entanglement-annihilating).

Fact B.1. If a bipartite channel E is entanglement-annihilating and entanglement-breaking, then it is absolutely entanglement-annihilating.
Proof. We rewrite this channel as $\mathcal{E}_{A_1B_1}$, where the subscript means that it acts on the bipartite system $A_1B_1$. Then it suffices to show that there will be no entanglement in the $AB$ bipartition after the product channel $\mathcal{E}_{A_1B_1} \otimes \Lambda_{A_2B_2}$, where $\Lambda_{A_2B_2}$ is an entanglement-annihilating channel on the bipartite system $A_2B_2$ and it annihilates entanglement in the $AB$ bipartition.
Because $\mathcal{E}_{A_1B_1}$ is entanglement-breaking, $\mathcal{E}_{A_1B_1} \otimes \mathcal{I}_{A_2B_2}$ is entanglement-annihilating in the $12$ bipartition. In other words, there will be no entanglement in the $12$ bipartition after $\mathcal{E}_{A_1B_1} \otimes \Lambda_{A_2B_2}$. Hence, the only remaining possibility is that entanglement is preserved within the bipartite systems $A_1B_1$ and $A_2B_2$. But since both $\mathcal{E}_{A_1B_1}$ and $\Lambda_{A_2B_2}$ are entanglement-annihilating in the $AB$ bipartition, we conclude that no entanglement exists in the bipartite systems $A_1B_1$ and $A_2B_2$. This shows that the output states cannot be entangled in the $AB$ bipartition.

C Proof of Theorem 1
Proof. To show property (M1), note that for a given Λ = 0 for all Λ A ∈ O N R and for all ρ SA . Hence, property (M1) is proved.
To show property (M2), we recall from Definition 3 that for a given free super-channel F E acting on free operations E ∈ O R , there exist an ancillary system B, two free operations Λ + , Λ − ∈ O R , and an absolutely R-annihilating channel Λ In what follows, because the input/output dimensions of Λ − do not need to be the same, we write S as the input space and SB as the output space of Λ − ; namely, we have Λ − : S → SB. Then we have [note that the maximization is taken over ρ SA satisfying Q R (ρ SA ) > 0 according to our definition; see explanations below Eq. (9)] The second line is because Q R is non-increasing under free operation (Λ + ⊗ I A ), which is due to the properties (R2), (R4), (M2), and the fact that f is strictly increasing. The same reasons imply the third line (while with some subtleties explained below). The fourth line is because maximizing over all states of the form (Λ − ⊗ I A )(ρ SA ) is sub-optimal than the range of all states on the system SBA. The fifth line is because Λ B ⊗ Λ A gives a range that is sub-optimal than all the possible Λ A when one maximizes over all the ancillary systems A [recall from Eq. (4) that the set of absolutely R-annihilating channels for an R-theory satisfying properties (R1), (R2), (R3), and (R4) is closed under tensor product].
Here we note that the ranges of optimization in the second line and the third line are different. In the second line, the optimization is taken over ρ S A with Q R (ρ S A ) > 0, which implies two different cases. The first case is when the optimization over this range is zero [sup A (...) = 0 in the second line]. Then in this case the desired inequality holds. This means we can assume the second case without loss of generality; that is, we can assume the optimization in the second line over Q R (ρ S A ) > 0 gives nonzero value. Hence, the range for the second line can be rewritten as ρ S A with Q R (ρ S A ) > 0 and Q R [(Λ − ⊗ I A )(ρ S A )] > 0, since the latter inequality is necessary for a nonzero numerator (note that actually the latter inequality implies the former one, while we still write them both explicitly for understanding). Then one can proceed to the third line with this condition. This proves property (M2).
To prove the property given by Eq. (7), we first note the following: (the maximization is again taken over states with non-zero Q R values) Note that sup A in the first line is maximizing over the system SS A. The second line is because fixing an absolutely free state η S ∈ F R [here we use the assumption F R ≠ ∅ in property (R1)] makes the maximization sub-optimal compared with the original one, and we note that in this line sup A is maximizing over SA [with Q R (ρ SA ) > 0]. The third line is because f is strictly increasing and Q R is a resource monotone [property (R2)]. The fourth line is because g is non-decreasing and Q R is a resource monotone [property (R3)]. This proves the inequality in Eq. (7) for general E S and E S′ . In the case that E S = Λ S ∈ O N R , we have where the second line is because the range Λ S ⊗ Λ A with the fixed Λ S is sub-optimal compared with all the possible absolutely R-annihilating channels Λ A when one maximizes over all the ancillary systems A [recall again from Eq. (4) that the set of absolutely R-annihilating channels will be closed under tensor product in the current case]. This shows the equality in Eq. (7).
Finally, when f ∘ Q R is convex, P (f,g) Q R is by definition convex. This proves property (M3). To address property (M4), for a given E S ∈ O R we note that P (f,g) R , and all ancillary systems A. By considering the ancillary system as the trivial one (i.e. with zero dimension), we have Q R [E S (ρ S )] = 0 for all ρ S . When Q R is faithful, this means E S (ρ S ) ∈ F R for all ρ S , thereby implying E S ∈ O N R . This shows property (M4) and also completes the whole proof.
We remark that the assumption F R ≠ ∅ is only used in the proof of Eq. (46). In other words, this assumption can be dropped if g maps every input to a positive constant. Writing g c (·) = c, this gives the following corollary for an R-theory satisfying the rest of properties (R1), (R2), (R3), and (R4):
Corollary C.1. Consider an R-theory and a state resource monotone Q R . Let f be a finite-valued strictly increasing function with f (0) = 0 and let c > 0 be a positive constant. Then P

D Proof of Theorem 2
Proof. Property (M1) holds automatically according to the definition. To prove property (M2), for a given free super-channel R , the direct computation shows (we again adopt the notation Λ − : S → SB) The second line is because which is true because of the assumptions that we made for R-theories in this work] forms a sub-optimal range compared with Λ S ∈ O N R . The third line is because of the properties (R2) and (R4), plus the fact that D satisfies Eq. (13). The fourth line is because (Λ − ⊗ I A )(ρ S A ) forms a sub-optimal range for the maximization sup A . The fifth line is because Λ S ⊗ Λ B ∈ O N R (this is true due to the definition of the absolutely R-annihilating channels) with the fixed map Λ B ∈ O N R and the variable Λ S forms a sub-optimal range for the minimization inf Λ SB ∈O N R . The sixth line is because Λ B ⊗ Λ A forms a sub-optimal range for the maximization sup A [recall Eq. (4)]. This proves property (M2).
To prove Eq. (7), we first compute the following In the second line we pick a fixed absolutely free state η S ∈ F R , which is possible due to the property (R1). Then the second line follows from the fact that ρ SA ⊗ η S forms a sub-optimal range for the maximization sup A . The third line is because of Eq. (13). The fourth line is because the mapping tr S {Λ SS [(·) ⊗ η S ]} will be an R-annihilating channel [properties (R2), (R3), and (R4)]. This consequently implies a sub-optimal range for the minimization compared with inf Λ S ∈O N R . Then the inequality in Eq. (7) is proved.
To show the equality, we compute the following for a given Λ S ∈ O N R : The second line is because Λ S ⊗ Λ S with the fixed Λ S forms a sub-optimal range for the minimization compared with inf Λ SS ∈O N R . The third line is because Λ S ⊗ Λ A forms a sub-optimal range for the maximization of sup A [Eq. (4)]. This proves the equality and Eq. (7).
Finally, suppose for a given channel E we have P D (E) = 0. By definition this implies since we are allowed to consider the zero-dimensional ancillary system. Hence, there exists a sequence This proves property (M4), and the proof for P D is completed. The case for P̄ D is almost the same: One simply needs to replace Λ A and sup A by I A and sup A;ρ SA , respectively. Also we remark that the proof of property (M2) for P̄ D is a direct application of Theorem 1 in Ref. [55]. This also means P̄ D can be a monotone if we consider the largest set of possible free super-channels, whenever this set is well-defined (see Ref. [? ] for the explanation).

E Proof of Theorem 3
To sketch the proof, we note that Theorem 10 in Ref. [54] is true even without Assumption 3 of that paper, which is crucial for R-preservability theories since the identity channel can never be a free channel. Using all the listed assumptions in Theorem 3, one can prove the upper bound by the same strategy as in Ref. [54]. Also, the small difference between Definition 5 in this work and Definition 9 in Ref. [54] will not change the proof of the lower bound.
For the completeness of this work, we still state the detailed proof in this section. Before the proof, we recall the Generalized Convex-Split Lemma for completely-positive maps [54]:
Lemma E.1. [54] Let $\alpha, \beta$ be completely-positive maps with $\|\alpha\|_\diamond = \|\beta\|_\diamond = 1$. Suppose there exists a completely-positive map $\alpha'$ with $\|\alpha'\|_\diamond \le 1$ and $p \in (0, 1]$ such that $\beta = p\alpha + (1-p)\alpha'$. Then the validity of the inequality $\log_2 n \ge \log_2 \frac{1}{p} + 2\log_2 \frac{1}{\delta}$ implies the following estimate
Before the main proof, we note the following two facts. The first one has $\bar{P}_{D_{\max}} = L_R$ [Eq. (15)] as a direct consequence.

Fact E.2. Given two channels E and Λ. Then we have
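Proof. Define the set $L_{\mathbb{A}} := \{\lambda \mid 0 \le [(\lambda\Lambda - \mathcal{E}) \otimes \mathcal{I}_A](\rho_{SA})\}$, where $\mathbb{A}$ denotes a particular combination of an ancillary system $A$ and a state $\rho_{SA}$ on the system $SA$. Then the left-hand side can be written as $\sup_{\mathbb{A}} \inf\{\lambda \mid \lambda \in L_{\mathbb{A}}\}$, and the right-hand side can be written as $\inf\{\lambda \mid \lambda \in \bigcap_{\mathbb{A}} L_{\mathbb{A}}\}$.
With the above notations, the inequality "≤" follows from the fact that $\bigcap_{\mathbb{A}} L_{\mathbb{A}} \subseteq L_{\mathbb{A}}$ for all $\mathbb{A}$. On the other hand, consider a given $k \in \mathbb{N}$. Then there exists an In other words, this means Since this is true for all $k \in \mathbb{N}$, the result follows.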
The second fact is a property similar to Eq. (7) for the smooth version ofP D defined similarly to Eq. (18).

Fact E.3.
Given an R-theory and a distance measure D satisfying Eq. (13). Then for every δ ≥ 0 and channels E S ,
Proof. First, we have the following definition similar to Eq. (18): A direct computation shows The second line is because fixing an absolutely free state η S [which is possible due to property (R1)] forms a sub-optimal range for the maximization. The third line is a consequence of the data-processing inequality under partial trace [namely, Eq. (13) Again, the second line is due to the sub-optimal range for the maximization when we fix an absolutely free state η S . Also, the third line follows from the data-processing inequality (or equivalently, the contractivity) of trace norm under quantum channels.
The second line follows from Eq. (58), which implies that all C SS satisfying 1 2 C SS − E S ⊗ E S ≤ δ form a subset of the set of all C SS satisfying 1 2 C S − E S ≤ δ. Since all possible C S form a subset of all possible channels E satisfying 1 2 E − E S ≤ δ, we have the third line. The proof is completed.

Now we start the proof of Theorem 3:
Proof. We follow the same strategy in the proof of Theorem 10 in Ref. [54]. We will show the upper bound at first.
Proof of the upper bound.-At the very beginning, consider an arbitrarily given positive integer $l \in \mathbb{N}$. By definition, there exists a channel $\mathcal{E}_l$ such that $\|\mathcal{E}_l - \mathcal{E}\|_\diamond \le 2(\epsilon - \eta)$ and Also, because we have where in the third line we use Fact E.2 and the fourth line is due to the fact that $\lambda < 1$ is forbidden in the minimization range (otherwise there exist quantum states $\sigma$ and $\sigma'$ such that $\lambda\sigma - \sigma' \ge 0$ for some $\lambda < 1$, which is impossible since this implies $0 \le \mathrm{tr}(\lambda\sigma - \sigma') = \lambda - 1 < 0$). Let $U_i$ be the pair-wise permutation unitary channel between the first and the $i$th subsystems (that is, the swap unitary between the two subsystems). Then we consider the destruction process with $\Lambda_l$$^{6}$, which gives the following: From Eq. (61) we note that when $\log_2 n > \bar{P}_{D_{\max}}(\mathcal{E}_l) + \frac{1}{l} + 2\log_2 \frac{1}{2\eta}$ [which automatically implies $\bar{P}_{D_{\max}}(\mathcal{E}_l) < \infty$], there always exists a $p_l \in (0, 1)$ such that
• $\log_2 n > \log_2 \frac{1}{p_l} + 2\log_2 \frac{1}{2\eta}$.
• $\Lambda_l - p_l \mathcal{E}_l$ is completely-positive.
6 Note that in general the two pair-wise permutation channels are different because they may act on different spaces. Here we simply use the same notation to stress the fact that both of them are permutation unitary channels between the first and the $i$th subsystems. More precisely, if $\Lambda_l, \mathcal{E}_l : S \to S'$, then we have $S^{\otimes n} \to S^{\otimes n}$ for the pre-processing permutations and $S'^{\otimes n} \to S'^{\otimes n}$ for the post-processing permutations.
By defining $\alpha_l := \frac{1}{1-p_l}(\Lambda_l - p_l \mathcal{E}_l)$, one can see that $\alpha_l$ is completely-positive and trace-preserving [since both $\Lambda_l$ and $\mathcal{E}_l$ are trace-preserving and we have $\Lambda_l = p_l \mathcal{E}_l + (1 - p_l)\alpha_l$; note that $p_l < 1$]. This means $\alpha_l$ is also a quantum channel (i.e. a completely-positive trace-preserving map), thereby having $\|\alpha_l\|_\diamond = 1$. Then Lemma E.1 (with $\alpha = \mathcal{E}_l$ and $\beta = \Lambda_l$) implies that when $\log_2 n > \bar{P}_{D_{\max}}(\mathcal{E}_l) + \frac{1}{l} + 2\log_2 \frac{1}{2\eta}$ holds, then we have in other words, Eq. (62) forms an $\eta$-destruction process for $\mathcal{E}_l$. (Note that $\Lambda_l^{\otimes n} \in O^N_R$ since we assume no activation property.) This also implies the existence of an $\epsilon$-destruction process for $\mathcal{E}$ since where we use the relation $\|\mathcal{E} - \mathcal{E}_l\|_\diamond \le 2(\epsilon - \eta)$, the data-processing inequality, and the triangle inequality.
Finally, let $n$ be the smallest positive integer satisfying $\log_2 n > \bar{P}_{D_{\max}}(\mathcal{E}_l) + \frac{1}{l} + 2\log_2 \frac{1}{2\eta}$. Since $C_R(\mathcal{E}) := \min \log_2 n$ and the minimization is taken over all $\epsilon$-destruction processes, we conclude the following and the proof of the upper bound is completed since the above estimate works for all positive integers $l$.
Proof of the lower bound.-The proof is completely the same as the proof of the lower bound of Theorem 10 in Ref. [54], and we briefly sketch it. Consider a given $\mathcal{E}_S \in O_R$. Then for a given $\epsilon$-destruction process of R-preservability consisting of $\Lambda_S \in O^N_R$ and where $\Lambda_{SS'} \in O^N_R$ and $N_i$ : This $\epsilon$-destruction process of R-preservability can also be interpreted as an $\epsilon$-destruction process defined in Ref. [54] by identifying $O_R$ as free channels in their framework (that is, in the proof of the lower bound of Theorem 10 in Ref. [54] and consider the channel resource theory with free channels as $O_R$). The same proof applies until we reach the following inequality, which is the last inequality at the bottom of page 15 in Ref. [54] (the assumptions made in Theorem 3 plus the properties (R1), (R2), (R3), (R4) ensure the applicability of the proof of Theorem 10 in Ref. [54] when we identify $O_R$ as free channels in their setting): where the $M_i$'s are completely-positive maps satisfying $\sum_{i=1}^{K} p_i M_i = \Lambda_{SS'}$, which means $p_i M_i \le \Lambda_{SS'}$ for all $i$ (by writing $\mathcal{E} \le \mathcal{E}'$ for two channels $\mathcal{E}$ and $\mathcal{E}'$ we mean $\mathcal{E}' - \mathcal{E}$ is completely-positive). Hence, we have Note that the left-hand side is a channel. Because for any R-theory considered in this work the set $O^N_R$ is by definition convex, we have 1 Using Fact E.3, we conclude that $\bar{P}^{\sqrt{\epsilon(2-\epsilon)}}_{D_{\max}}(\mathcal{E}_S) \le \log_2 K$ for all possible $K$. This completes the proof.
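The counting condition of the Generalized Convex-Split Lemma used in the upper bound, $\log_2 n \ge \log_2 \frac{1}{p} + 2\log_2 \frac{1}{\delta}$, is equivalent to $n \ge 1/(p\,\delta^2)$. The short sketch below, with a hypothetical helper name `min_copies`, computes the smallest integer $n$ meeting this condition for given $p$ and $\delta$; it only illustrates the arithmetic and is not part of the proof.

```python
import math

def min_copies(p, delta):
    """Smallest integer n with log2(n) >= log2(1/p) + 2*log2(1/delta).

    The condition is equivalent to n * p * delta**2 >= 1.
    Assumes 0 < p <= 1 and 0 < delta <= 1.
    """
    if not (0.0 < p <= 1.0 and 0.0 < delta <= 1.0):
        raise ValueError("p and delta must lie in (0, 1]")
    weight = p * delta**2
    n = max(1, math.ceil(1.0 / weight))
    # Guard against floating-point rounding in either direction.
    while n * weight < 1.0:
        n += 1
    while n > 1 and (n - 1) * weight >= 1.0:
        n -= 1
    return n
```

For example, $p = 1/2$ and $\delta = 1/2$ give $n = 8$, so $\log_2 n = 3$ bits suffice in that case; a smaller $p$ (a weaker decomposition of the annihilating channel) or a smaller $\delta$ raises the required number of copies.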

F Proof of Theorem 4
Before the proof, we need to recap certain key ingredients in Ref. [76]. The first one is a central assumption called energy subspace condition (we use the notation m = (m 1 , m 2 , ..., m d ) to denote a vector in N d ):

Definition F.1. (Energy Subspace Condition) [76] A given Hamiltonian H with energy levels
Roughly speaking, Definition F.1 means energy levels cannot be integer multiples of each other. This condition also forbids the possibility of degeneracy (otherwise one can simply switch the coefficients of a vector m in a subspace with degeneracy to construct a counterexample).
Before mentioning the main results in Ref. [76], we define the smooth max-relative entropy as [also recall Eq. (14)] Note that there is a difference of a factor of $\frac{1}{2}$ compared with Eq. (11) in Ref. [76]. Then we have [76] ($\gamma$ is the thermal state associated with the given bath temperature $T$ and system Hamiltonian $H_S$):
Theorem F.2. [76] For a given state $\rho_S$, we have Moreover, if the system Hamiltonian $H_S$ satisfies the energy subspace condition and $\rho_S$ is diagonal in the energy eigenbasis of $H_S$, then we also have
Now the idea is to use the above theorem to prove Theorem 4. But before the proof, we still need to establish the following lemma regarding the continuity of the max-relative entropy (in a finite-dimensional case, we say a quantum state is full-rank if it has only positive eigenvalues; in other words, its support coincides with the whole Hilbert space):
Lemma F.3. Given three states $\rho$, $\rho'$, $\sigma$, where $\sigma$ is full-rank. Then we have where $p_{\min}(\sigma)$ is the smallest eigenvalue of $\sigma$.
As a remark, we note that the above lemma implies the Lipschitz continuity of the function $2^{D_{\max}(\cdot\|\sigma)}$ when $\sigma$ is full-rank. With Theorem F.2 and Lemma F.3 in hand, we now start the proof of Theorem 4.
Proof. When we consider the R-theory of athermality equipped with Gibbs-preserving maps, the only absolutely R-annihilating channel (with output space $A$ and output dimension $d^k$) is the full thermalization channel $\Phi_{\gamma_A} : (\cdot) \mapsto \gamma_A$, where $\gamma_A = \gamma^{\otimes k}$ and $\gamma$ is the given thermal state in the system $S$. Then direct computation shows [recall Eqs. (9) and (11) for notations] where the last equality follows from the fact that for an operator $A$ and a positive operator $E$ we have $A \ge 0$ if and only if $A \otimes E \ge 0$, which implies the relation $D_{\max}(\rho \otimes \eta\|\sigma \otimes \eta) = D_{\max}(\rho\|\sigma)$ for all quantum states $\rho, \sigma, \eta$. Together with Theorem F.2 and the quantity defined in Eq. (24), we conclude that and the upper bound is proved. To see the lower bound, first we note that being coherence-annihilating for the given Gibbs-preserving channel $\mathcal{N}$ means $\mathcal{N}(\rho)$ is diagonal in the energy eigenbasis for all inputs $\rho$. Applying Theorem F.2 and Lemma F.3, one can conclude that and the proof is completed.
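The tensor-invariance relation $D_{\max}(\rho \otimes \eta\|\sigma \otimes \eta) = D_{\max}(\rho\|\sigma)$ used in the proof can be checked numerically. The sketch below assumes a full-rank second argument and uses the standard expression $D_{\max}(\rho\|\sigma) = \log_2 \lambda_{\max}(\sigma^{-1/2}\rho\,\sigma^{-1/2})$; the function name `d_max` and the example states are illustrative choices, not taken from the paper.

```python
import numpy as np

def d_max(rho, sigma):
    """Max-relative entropy D_max(rho||sigma) in bits, assuming sigma is full-rank.

    Uses D_max = log2 of the largest eigenvalue of sigma^{-1/2} rho sigma^{-1/2},
    i.e. the smallest lambda (on a log scale) such that rho <= lambda * sigma.
    """
    evals, evecs = np.linalg.eigh(sigma)
    if evals.min() <= 0:
        raise ValueError("sigma must be full-rank")
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.conj().T
    m = inv_sqrt @ rho @ inv_sqrt
    m = (m + m.conj().T) / 2  # symmetrize against rounding errors
    return float(np.log2(np.linalg.eigvalsh(m).max()))

# Example: a qubit state against a maximally mixed reference, with and without
# an extra full-rank state eta attached to both arguments.
rho = np.diag([0.9, 0.1])
sigma = np.diag([0.5, 0.5])
eta = np.diag([0.3, 0.7])
single = d_max(rho, sigma)
joint = d_max(np.kron(rho, eta), np.kron(sigma, eta))
```

Here `single` and `joint` agree up to numerical precision, which is the property that lets the extra thermal copies be dropped in the computation above.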

G Non-Signalling Assisted Classical Communication Scenario
Recently, the application of a channel resource theory to the classical communication scenario has been addressed [59]. The central question is how much classical information (in terms of classical bits) can be transmitted noiselessly (or up to a certain error) via a channel of the following form: where $\mathcal{N}$ is the noisy channel that sends the information, and $\mathcal{E}_e$, $\mathcal{E}_d$ are the encoding and decoding channels, respectively. Note that we have $\mathcal{E}_e : C \to SA$ and $\mathcal{E}_d : SA \to C$, where $C$ is the space for the classical bits, which are represented by the orthonormal basis $\{|m\rangle\}_{m=0}^{M-1}$. As explained in Ref. [59], this structure gives the non-signalling assisted classical communication scenario: The transmission of the classical information is assisted by any possible non-signalling structure, which has the general form given by Eq. (83). In this case, both $\mathcal{E}_e$ and $\mathcal{E}_d$ can be arbitrary channels.
To quantify how much classical information is transmitted successfully within a given error $\epsilon \in (0, 1)$, we consider the following averaged error associated with a given combination of encoding channel $\mathcal{E}_e$, decoding channel $\mathcal{E}_d$, and transmitting channel $\mathcal{N}$ [59]: Then one can define the corresponding single-shot classical capacity with error $\epsilon$ as [59]: Note that in the non-signalling assisted scenario, the above optimization is taken over all possible channels $\mathcal{E}_e$, $\mathcal{E}_d$. More precisely, the encoding map $\mathcal{E}_e$ can be understood as an effectively "classical-quantum" channel because we only consider inputs of the form $|m\rangle\langle m|$. Similarly, the decoding map $\mathcal{E}_d$ can be interpreted as an effectively "quantum-classical" channel. The above classical capacity gives the optimal number of classical bits that can be transmitted and recovered within the given error when the only constraint is the non-signalling condition. From here one can observe that the setup in Sec. 5.1.2 is equivalent to the above setup plus the two imposed thermodynamic constraints.
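To make the notion of averaged error concrete, the following toy sketch treats a purely classical instance: the encoder, the channel, and the decoder are row-stochastic matrices, and the averaged error is taken to be one minus the average probability of decoding the message that was sent, for uniformly distributed messages. This form of the averaged error is an assumption modelled on the standard definition, and the sketch ignores the ancilla, the non-signalling assistance, and the thermodynamic constraints; all names and numbers are illustrative.

```python
def compose(first, second):
    """Row-stochastic matrix of applying `first` and then `second`."""
    inner = len(second)
    return [
        [sum(first[i][k] * second[k][j] for k in range(inner))
         for j in range(len(second[0]))]
        for i in range(len(first))
    ]

def average_error(encoder, channel, decoder):
    """One minus the average probability that message m is decoded as m."""
    total = compose(compose(encoder, channel), decoder)
    m_count = len(total)
    success = sum(total[m][m] for m in range(m_count)) / m_count
    return 1.0 - success

# Hypothetical example: M = 2 messages sent through a binary symmetric
# channel with flip probability 0.1, with trivial encoding and decoding.
identity = [[1.0, 0.0], [0.0, 1.0]]
bsc = [[0.9, 0.1], [0.1, 0.9]]
err = average_error(identity, bsc, identity)
print(err)
```

In this toy case one bit can be sent with averaged error equal to the flip probability, so a single-shot capacity defined with an error tolerance at least that large would be at least one bit.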

H Proof of Theorem 5
The strategy is to follow the spirit of the proof of Theorem 3 in Ref. [59]. Before the main proof, we need to establish two facts. First, recall from Eq. (28) the following expression for a given combination $(\mathcal{N}, \mathcal{E}_e, \mathcal{E}_d, \gamma_A)$: Here we note that $\gamma_A = \gamma^{\otimes k}$ for a positive integer $k$, where $\gamma$ is the thermal state associated with the given R-theory of athermality in the main system $S$. Again, $\Phi_{\gamma_A} : (\cdot) \mapsto \gamma_A$ is the full thermalization (or, equivalently, the state preparation channel) with the target thermal state $\gamma_A$. We remark that throughout this section we assume the channels $\mathcal{N}$, $\mathcal{N}'$ have the same output space $S$. Now, we note the following result, which is similar to Lemma 4 in Ref. [59]: without loss of generality. For every positive integer $k \in \mathbb{N}$, let $\mathcal{E}^{(k)}$ be the Gibbs-preserving map satisfying where , † is a completely positive unital map. Following the proof of Lemma 4 in Ref. [59], we note the following estimate: Then we conclude that where the last inequality follows from the data-processing inequality of the trace norm (or, equivalently, the contractivity under quantum channels). Since this argument works for all positive integers $k$, the desired upper bound is proved.
We still need to show another fact, which is similar to Theorem 5 in Ref. [41]: Proof. Following the proof of Theorem 5 in Ref. [41], we first recall from Eq. (15) that with an R-theory of athermality we have (note that the output space of the channel $\mathcal{N}$ is $S$, and $\gamma$ is the corresponding thermal state) where the maximization is taken over all channels $\mathcal{C}$. Then for every $k \in \mathbb{N}$, there exists a channel $\mathcal{C}_k$ and a value $q_k$ such that Then we have (note that $\|\mathcal{N}\| = 1$) which is true for all $k \in \mathbb{N}$. This gives the desired upper bound.
Now we are ready to prove Theorem 5.
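Proof. For channels $\mathcal{N}$, $\mathcal{N}'$ satisfying $\|\mathcal{N} - \mathcal{N}'\|_\diamond \le 2\delta$ with $0 < \delta < 1$, we have the estimate $\|\mathcal{N} \otimes \Phi_{\gamma_A} - \mathcal{N}' \otimes \Phi_{\gamma_A}\|_\diamond \le 2\delta$ due to the data-processing inequality of the trace norm. Suppose the combination $(\mathcal{N}, \mathcal{E}_e, \mathcal{E}_d, \gamma_A)$ satisfies $\varepsilon(\mathcal{N}, \mathcal{E}_e, \mathcal{E}_d, \gamma_A) \le \epsilon$ for a given $0 < \epsilon < 1$. Together with Facts H.1 and H.2, we have This implies $\log_2 M \le \overline{P}_{D_{\max}}(\mathcal{N}) + \log_2 \frac{1}{1 - \epsilon - \delta}$. Since the argument works for all possible $M$ and $\mathcal{N}$, this gives the desired bound.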
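The final inequality of the proof is easy to evaluate once the preservability value is known. The following minimal sketch simply computes the right-hand side $\overline{P}_{D_{\max}}(\mathcal{N}) + \log_2 \frac{1}{1-\epsilon-\delta}$; the function name and the requirement $\epsilon + \delta < 1$ (needed for the logarithm to be finite) are assumptions of this sketch, and the input value of the monotone is a hypothetical number rather than a computed one.

```python
import math

def log2_message_bound(p_dmax, eps, delta):
    """Upper bound on log2(M): p_dmax + log2(1 / (1 - eps - delta)).

    p_dmax is the (given) value of the D_max-based preservability monotone,
    eps is the tolerated averaged error, delta is the smoothing parameter.
    """
    if not (0 < eps < 1 and 0 < delta < 1):
        raise ValueError("eps and delta must lie strictly between 0 and 1")
    if eps + delta >= 1:
        raise ValueError("the bound is only finite when eps + delta < 1")
    return p_dmax + math.log2(1.0 / (1.0 - eps - delta))

# Hypothetical numbers: a monotone value of 2 bits, eps = delta = 0.25.
bound = log2_message_bound(2.0, 0.25, 0.25)
print(bound)  # 3.0
```

The correction term grows without bound as $\epsilon + \delta$ approaches 1, so the estimate is informative only for small error and smoothing parameters.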

I Proof of Theorem 6
In this appendix, we will show a proposition which has Theorem 6 as a direct corollary. Before the main proof, we first prove the following fact for the R-theory of entanglement (and we write R = E).
and there exists an input state $\rho_0$ such that $\Lambda(\rho_0)$ is entangled. In particular, this means $\|\Lambda_k(\rho_0) - \Lambda(\rho_0)\|_1 \to 0$ when $k \to \infty$; in other words, we can use the sequence $\{\Lambda_k(\rho_0)\}_{k=1}^{\infty}$, consisting of only separable states, to approach $\Lambda(\rho_0)$ in the trace norm $\|\cdot\|_1$. Because the set of separable states is closed in $\|\cdot\|_1$, we conclude that $\Lambda(\rho_0)$ is separable, which is a contradiction. Hence, $O^N_E$ is closed in $\|\cdot\|_\diamond$, and the proof is completed.

Now we state the following result:

Proposition I.2. For a given pair of thermal states $(\gamma_A, \gamma_B)$, if there exists an entanglement preserving local thermalization to $(\gamma_A, \gamma_B)$, then for every $\delta > 0$ there exists another entanglement preserving local thermalization $\mathcal{E}$ to $(\gamma_A, \gamma_B)$ such that

Proof. Let $\mathcal{L}_0$ be an EPLT to $(\gamma_A, \gamma_B)$, and again let $\Phi_\rho : (\cdot) \mapsto \rho$ be the constant map with the output state $\rho$. Then consider the following convex mixture where $p \in [0, 1]$. This map is by definition a local thermalization. Then one can see that $\mathcal{L}(p)$ is continuous in $p$ with respect to the diamond norm because of For a given $\delta > 0$, by choosing the corresponding $\mathcal{L}(p)$ will be an EPLT to $(\gamma_A, \gamma_B)$ due to the fact that this channel will not be in $O^N_E$, and satisfies the desired property

In Ref. [63] it was shown that bipartite EPLTs exist for every positive local temperature and finite-energy local Hamiltonian (i.e., for every pair of full-rank local thermal states). Hence, we directly conclude that:

Corollary I.3. For every full-rank $\gamma_A$, $\gamma_B$ and every $\delta > 0$, there exists an entanglement preserving local thermalization $\mathcal{E}$ to $(\gamma_A, \gamma_B)$ such that $\overline{P}_{\|\cdot\|_1}(\mathcal{E}) < \delta$.

J Alternative Entanglement Preserving Local Thermalization
In this section, we provide a new family of EPLTs, which can be further proved to admit arbitrarily small entanglement preservability and preservation of free entanglement simultaneously at finite temperatures (Theorem 7).
We construct this new family of EPLTs in the bipartite system $AB$ with equal finite local dimension $d$. Given a positive value $\delta^X_i \in [0, 1]$ with integer $i \in [0, d-2]$, we define the following map on the local system $X$: where we introduced the notation $|\bar{n}\rangle := |d-1-n\rangle$ and $\bar{E}^X_n := E^X_{d-1-n}$, and the local Hamiltonians are given by H . Now we define the following family of channels (dependent on $\delta^X_i$) acting on a local system: In Appendix J.1 we prove that $\mathcal{E}^X$ induces a local thermalization for an appropriate choice of $\delta^X_i$. More precisely, with the $(U \otimes U^*)$-twirling operation $\mathcal{T}$ defined in Eq. (37) we have: We remark that the proof of the above lemma is constructive, hence $\mathcal{E}^X$ is explicitly known [Eq. (132)]. For a given pair of single-party thermal states $(\gamma_A, \gamma_B)$, we then consider the following map: where $\epsilon \in [0, 1]$ is a probability parameter whose value will be determined later. By Lemma J.1, we let $(\mathcal{E}^A \otimes \mathcal{E}^B) \circ \mathcal{T}$ locally thermalize the system $X$ to the following state for $X = A, B$ [63]: One can then use exactly the same proof as that of Theorem 2 in Ref. [63] to show that where $p_{\min}$ is the smallest eigenvalue among those of $\gamma_A$ and $\gamma_B$. Finally, a direct computation of the fully entangled fraction defined in Eq. (36) shows Since $F_{\max}(\rho) > \frac{1}{d}$ implies that $\rho$ is entangled [1], we conclude: is an EPLT when $p_{\min} > \frac{1}{d^2}$. This shows that Eq. (105) admits EPLTs when we select the highest $\epsilon$ value. It turns out that Eq. (105) can achieve EPLTs even with an arbitrarily small $\epsilon$ value. We will use this property to prove the main result in Theorem 7.
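The entanglement criterion $F_{\max}(\rho) > \frac{1}{d}$ used above can be illustrated on isotropic states. The sketch below assumes the common parametrization $\rho_{\rm iso}(p) = p\,|\Psi^+_d\rangle\langle\Psi^+_d| + (1-p)\,\frac{I}{d^2}$ with $p \in [0, 1]$, which may differ from the convention of Eq. (39). Under this assumption the overlap with $|\Psi^+_d\rangle$ is $p + \frac{1-p}{d^2}$, and since this overlap lower-bounds the fully entangled fraction, exceeding $\frac{1}{d}$ certifies entanglement. Function names are illustrative.

```python
from fractions import Fraction

def isotropic_overlap(p, d):
    """Overlap <Psi+|rho_iso(p)|Psi+> for rho_iso(p) = p |Psi+><Psi+| + (1 - p) I / d^2."""
    p = Fraction(p)
    return p + (1 - p) / d**2

def certified_entangled(p, d):
    """True if the overlap with |Psi+> exceeds 1/d, which certifies entanglement."""
    return isotropic_overlap(p, d) > Fraction(1, d)

# For two qubits (d = 2) the criterion is met exactly when p > 1/3.
print(certified_entangled(Fraction(1, 3), 2))  # False: overlap equals 1/2
print(certified_entangled(Fraction(2, 5), 2))  # True
```

Solving $p + \frac{1-p}{d^2} > \frac{1}{d}$ for $p$ gives $p > \frac{1}{d+1}$ in this parametrization, which is why exact rational arithmetic is used to test the boundary case.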

J.1 Proof of Lemma J.1
Because we will apply mathematical induction several times in the proof, it is convenient for us to adopt the following inverse energy representation. Let $\{|n\rangle\}_{n=0}^{d-1}$ be the energy basis for the given local system Hamiltonian, and assume that the corresponding energies $E_n$ satisfy $0 \le E_0 \le E_1 \le \dots \le E_{d-1}$. Define $|\bar{n}\rangle := |d-1-n\rangle$ and $\bar{E}_n := E_{d-1-n}$, which means that the ground state is now $|\overline{d-1}\rangle$, and the corresponding energy is $\bar{E}_{d-1}$. In particular, we have the hierarchy In what follows, we also adopt the notations ∆ , which are regarded as vectors in $[0, 1]^{d-1}$ and $[0, 1]^{2(d-1)}$, respectively. In this line, we further define E ∆ AB In this appendix we use $AB$ to emphasize the bipartition, and we always consider equal finite local dimension $d$; that is, the global system can be written as $\mathbb{C}^d \otimes \mathbb{C}^d$. Now we prove the following result, which has Lemma J.1 as a direct corollary: A, B). Then there exists a vector $\Delta^{AB}_{d-2}$ whose components are given by where $\Gamma^X_{n-1} := 1 + \sum_{i=0}^{n-1} \prod_{j=i}^{n-1} \delta^X_j$ if $n > 0$ and $\Gamma^X_{-1} := 1$, such that for all $\rho$ we have As a remark, we note that $\Delta^{AB}_{d-2}$ is uniquely determined by $\eta_X$ due to Eq. (109).
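The inverse energy representation and the quantity $\Gamma^X_{n}$ introduced in this subsection can be sketched numerically. The code below relabels a list of energies according to $\bar{E}_n := E_{d-1-n}$ and evaluates $\Gamma_i$ under the assumption that it is a sum of products, $\Gamma_i = 1 + \sum_{n=0}^{i} \prod_{j=n}^{i} \delta_j$ with $\Gamma_{-1} = 1$. Under that reading it obeys the one-step recursion $\Gamma_i = 1 + \delta_i \Gamma_{i-1}$, which the sketch checks. The function names and the sample values are hypothetical.

```python
from math import prod

def inverse_energy_labels(energies):
    """Return the barred energies: entry n is E_{d-1-n}."""
    return list(reversed(energies))

def gamma_direct(deltas, i):
    """Gamma_i = 1 + sum_{n=0}^{i} prod_{j=n}^{i} delta_j, with Gamma_{-1} = 1."""
    if i < 0:
        return 1.0
    return 1.0 + sum(prod(deltas[n:i + 1]) for n in range(i + 1))

def gamma_recursive(deltas, i):
    """Same quantity computed through Gamma_j = 1 + delta_j * Gamma_{j-1}."""
    value = 1.0  # Gamma_{-1}
    for j in range(i + 1):
        value = 1.0 + deltas[j] * value
    return value

energies = [0.0, 1.0, 2.5]           # increasing: E_0 <= E_1 <= E_2
barred = inverse_energy_labels(energies)
print(barred)                        # [2.5, 1.0, 0.0]: ground state now has index d - 1

deltas = [0.5, 0.25, 0.1]            # illustrative values in [0, 1]
print(gamma_direct(deltas, 1), gamma_recursive(deltas, 1))  # both 1.375
```

The recursive form needs only one multiplication per level, which is convenient when the components $\delta_j$ are determined one after another.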

Proof. (Proof of Lemma J.3.)
Recall that $\mathcal{T}(\rho) = \rho_{\rm iso}(p)$ for some value of $p$ [Eq. (39)]. We first prove the case when $p = 0$. By using the properties of isotropic states, we can then prove the result for an arbitrary value of $p$. Let us start with the following fact:

Fact J.4. For the local system $X$, we have where $\Gamma^X_i := 1 + \sum_{n=0}^{i} \prod_{j=n}^{i} \delta^X_j$ and we define $\Gamma^X_{-1} := 1$.

Proof. Let us use mathematical induction to prove the following formula for all $n \in \mathbb{Z}_{d-2}$: First, direct computation proves the cases $n = 0, 1$. Now, let us assume that the above formula is correct for $n$ in $\mathbb{Z}_{d-3}$ and compute the result for $n + 1$: The result follows by observing the following recursion relation:

J.2 Remarks
Here we make some remarks. First, note that Fact J.5 applies to an arbitrary single-party thermal state. As another remark, we note that for a given $\eta_X = \sum_{n=0}^{d-1} Q^X_n |n\rangle\langle n|$, there is a uniquely determined vector $\Delta^X_{d-2}$ which can realize it. To find this vector $\Delta^X_{d-2}$, one can start from $\delta^X_0$, which is given by After determining $\delta^X_0$, one can determine $\delta^X_1$, which is given by In general, one can determine $\delta^X_n$ by the following formula: this is because after knowing $\delta^X_i$ for $0 \le i \le n-1$, one can directly compute $\Gamma^X_{n-1}$.

K Proof of Theorem 7
To prove Theorem 7, first we prove Eq. (33) in Appendix K.1. As the next step in Appendix K.2 we prove a lemma, which is a preliminary result for the proof of Eq. (34) given in Appendix K.3.
In the proof of Eq. (33), we will use the EPLT candidate constructed in Ref. [63], which is given by: where the $(U \otimes U^*)$-twirling $\mathcal{T}$ is defined in Eq. (37), $\Phi_\rho(\cdot) = \rho$ is the constant map, and $\eta_X$ is defined in Eq. (106). $\mathcal{E}_{(\gamma_A, \gamma_B)}$ is proved to be an EPLT to $(\gamma_A, \gamma_B)$ for all full-rank $\gamma_A$ and $\gamma_B$ [63]. As a remark, we note that Eq. (133) can be interpreted as the twirling operation $\mathcal{T}$ followed by a partial thermalization channel $(\cdot) \mapsto (1 - \epsilon)\,\eta_A \otimes \eta_B + \epsilon\,(\cdot)$, which can be thought of as a generalized depolarizing channel with finite local temperatures (captured by the thermal states $\eta_A$ and $\eta_B$).
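The role of the auxiliary states $\eta_X$ in the partial thermalization channel can be illustrated at the level of local marginals. A twirled (isotropic) state has maximally mixed marginals $\frac{I}{d}$, so the local output of the channel is $(1-\epsilon)\,\eta_X + \epsilon\,\frac{I}{d}$. Requiring this to equal $\gamma_X$ gives $\eta_X = \frac{\gamma_X - \epsilon I/d}{1-\epsilon}$, which is a valid state only if $\epsilon \le d\,p_{\min}$. The sketch below works with diagonal states represented as probability lists; that this expression coincides with Eq. (106) is an assumption, and the function names, the temperature parameter `kT`, and the numbers are illustrative.

```python
import math

def gibbs(energies, kT):
    """Populations of the thermal state for the given energies and temperature kT."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def eta_from_gamma(gamma, eps):
    """Solve (1 - eps) * eta + eps * I/d = gamma for the diagonal state eta."""
    d = len(gamma)
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    eta = [(g - eps / d) / (1 - eps) for g in gamma]
    if min(eta) < -1e-12:
        raise ValueError("eps exceeds d * p_min, so eta is not a valid state")
    return eta

def local_output(eta, eps):
    """Local marginal after partial thermalization of a state with marginal I/d."""
    d = len(eta)
    return [(1 - eps) * e + eps / d for e in eta]

gamma_X = gibbs([0.0, 1.0], kT=1.0)    # hypothetical qubit thermal state
eta_X = eta_from_gamma(gamma_X, 0.3)   # 0.3 is below d * p_min (about 0.54) here
print(local_output(eta_X, 0.3), gamma_X)  # the two lists agree up to rounding
```

Note that $\eta_X - \frac{I}{d} = \frac{\gamma_X - I/d}{1-\epsilon}$ in this sketch, so for $\epsilon \le \frac{1}{2}$ the distance of $\eta_X$ from the maximally mixed state is at most twice that of $\gamma_X$.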

K.1 Proof of Eq. (33)
Proof. We compute the lower bound for the map $\mathcal{E}_{(\gamma_A, \gamma_B)}$ defined in Eq. (133) (for a channel $\mathcal{E}$ we write $\|\mathcal{E}\|_\infty := \sup_\rho \|\mathcal{E}(\rho)\|_\infty$): where the fourth line follows from the reverse triangle inequality of the trace norm. Using the Now we bound $\inf_{\Lambda \in O^N_{FE}} \|\mathcal{T} - \Lambda\|_1$. Denoting by $\rho_b$ an arbitrary state which is not free entangled and by $\rho_{\rm iso}$ an arbitrary isotropic state [defined in Eq. (39)], we have where we note that $\|\cdot\|_1 := \mathrm{tr}|\cdot| \ge \|\cdot\|_\infty := \sup_{|\psi\rangle} |\langle\psi|(\cdot)|\psi\rangle|$ and the last equality is due to the sufficient condition $\langle\Psi^+_d|\rho|\Psi^+_d\rangle > \frac{1}{d}$ of distillability [65] for a quantum state $\rho$. Hence, The strongest bound is achieved by taking the largest $\epsilon$ allowed by Eq. (107), giving The proof is completed.

K.2 Preliminary for the Proof of Eq. (34)
As the first step, we show the following lemma: is a separable state. Write $\max_{X=A,B} \|\gamma_X - \frac{I}{d}\|_\infty < \delta_0$ for a given small positive value $\delta_0$, which implies $\max_{X=A,B} \|\eta_X - \frac{I}{d}\|_\infty < 2\delta_0$ for $\epsilon \in [0, \frac{1}{2}]$. By Lemma J.1 and Appendix J.1, since the vector $\Delta^{AB}_{d-2}$ is uniquely determined by $(\eta_A, \eta_B)$, the bound $\max_{X=A,B} \|\eta_X - \frac{I}{d}\|_\infty < 2\delta_0$ implies the existence of a small value $\delta_1$ such that $\|\Delta^{AB}_{d-2}\| < \delta_1$ (one can see this from the structure of $\mathcal{E}_{\Delta^X_{d-2}}$ given in Appendix J.1). Hence, the continuity implies the existence of a small positive value $\delta = \delta(\delta_0)$ such that Ψ and $\delta$ can be made as small as we want by choosing a proper $\delta_0$. Now we note the following property of the normalized temperature: $\gamma_X \to \frac{I}{d}$ if $\tau_X \to \infty$; in other words, for a given value $\Delta$, there exists a normalized temperature threshold $\tau_\Delta$ such that $\max_{X=A,B} \|\gamma_X - \frac{I}{d}\|_\infty < \Delta$ if $\min_{X=A,B} \tau_X > \tau_\Delta$. Together with this property of the normalized temperature, for a given $k \in \mathbb{N}$, there exists a $\Delta_k$ such that Ψ Finally, for a given $\epsilon \in [0, d\,p_{\min}]$ we have the following estimate if $\min_{X=A,B} \tau_X > \Delta_k$:

K.3 Proof of Eq. (34)
Proof. We show that $\mathcal{E}_{(\gamma_A, \gamma_B)}$ given in Eq. (105) can be arbitrarily close to the set $O^N_E$ (here $E$ denotes entanglement) while preserving free entanglement for certain entangled input states. For any given $\delta > 0$, there exists an $\epsilon \in (0, 1]$ small enough such that $\epsilon \times \|\mathcal{T} - (\mathcal{E}^A \otimes \mathcal{E}^B) \circ \mathcal{T}\| < \delta$. Lemma K.1 implies that there exists $\tau \in (0, +\infty)$ such that for every pair $(\gamma_A, \gamma_B)$ with $\min_{X=A,B} \tau_X > \tau$, $\mathcal{E}_{(\gamma_A, \gamma_B)}$ is an EPLT to $(\gamma_A, \gamma_B)$ that can preserve free entanglement and achieves where $\sup_{A;\rho_{SA}}$ optimizes over all ancillary systems $A$ and all states $\rho_{SA}$ on the system $SA$. By redefining $\tau$ to be the $\tau_\delta$ given in the statement of the theorem, the proof is completed.