Beyond the thermodynamic limit: finite-size corrections to state interconversion rates

Thermodynamics is traditionally constrained to the study of macroscopic systems whose energy fluctuations are negligible compared to their average energy. Here, we push beyond this thermodynamic limit by developing a mathematical framework to rigorously address the problem of thermodynamic transformations of finite-size systems. More formally, we analyse state interconversion under thermal operations and between arbitrary energy-incoherent states. We find precise relations between the optimal rate at which interconversion can take place and the desired infidelity of the final state when the system size is sufficiently large. These so-called second-order asymptotics provide a bridge between the extreme cases of single-shot thermodynamics and the asymptotic limit of infinitely large systems. We illustrate the utility of our results with several examples. We first show how thermodynamic cycles are affected by irreversibility due to finite-size effects. We then provide a precise expression for the gap between the distillable work and work of formation that opens away from the thermodynamic limit. Finally, we explain how the performance of a heat engine gets affected when one of the heat baths it operates between is finite. We find that while perfect work cannot generally be extracted at Carnot efficiency, there are conditions under which these finite-size effects vanish. In deriving our results we also clarify relations between different notions of approximate majorisation.


I. INTRODUCTION
Background. Thermodynamics forms an integral part of contemporary physics, providing us with invaluable rules that govern which transformations between macroscopic states are possible and which are not [1]. In modern parlance thermodynamics is an example of a resource theory [2]. These provide us with a general framework to study the question of state interconversion by leveraging the structure imposed by the free states and operations within a given theory. The resource-theoretic approach has recently garnered renewed attention in the field of quantum information (see [3] for a recent review) and allows us to quantify notions like entanglement [4], coherence [5][6][7] and asymmetry [8,9]. In the study of entanglement, for example, local operations and classical communication constitute the free operations and separable states are treated as free states. In an analogous way, a rigorous resource-theoretic formulation of quantum thermodynamics was provided in Refs. [10][11][12], with thermal Gibbs states being free and the laws of thermodynamics being captured by the restricted set of free operations known as thermal operations.
One of the main problems studied within the resource theory of thermodynamics is single-shot state interconversion, i.e., identifying when it is possible to convert a given state to another using only free operations (or, alternatively, identifying what extra resources are necessary to enable such transformations). Although for general quantum states only partial results are known [13][14][15] (see also Ref. [16] for the most recent progress), the full solution was found for the restricted case of transforming states with no coherence between distinct energy eigenspaces. For such energy-incoherent states the necessary and sufficient conditions for single-shot state interconversion are given by a thermomajorisation relation between the initial and final states [11]. As a result the allowed transformations are in general irreversible, which is captured by the fact that the amount of work needed to create a given state, the work of formation, is larger than the amount of work one can extract from it, the distillable work. We note that formally the thermomajorisation condition is strongly related to the majorisation condition appearing within the resource theory of entanglement when studying transformations between pure bipartite states [17].
* christopher.chubb@sydney.edu.au
An important variant of the interconversion problem, lying on the opposite extreme of the single-shot case, is asymptotic state interconversion. In this case one considers having access to arbitrarily many copies of the initial state, and asks for the maximal conversion rate at which it is possible to transform instances of one state to another with asymptotically vanishing error. It was found that this rate is given by the ratio of non-equilibrium free energies of the initial and target states [12]. Thus, the asymptotic interconversion rate is directly linked with the amount of useful work that can be extracted on average from a given state. Moreover, in this regime all transformations become fully reversible, as work of formation and distillable work asymptotically coincide. Again, this result closely resembles that obtained in pure state entanglement theory, where the optimal interconversion rate is given by the ratio of the entanglement entropies of the initial and target states [18].
In this work we study the interconversion problem in an intermediate regime, between the single-shot and asymptotic cases described above. We thus consider transformations of a finite number n of instances of the input state and tolerate a non-zero error, which affects the optimal conversion rate. By developing the notions of approximate majorisation [19] and thermomajorisation, we extend a formal relationship between the resource theories of pure state entanglement and thermodynamics of incoherent states to the approximate case. This allows us to adapt recently developed tools for approximate entanglement transformations in Ref. [20] to study corrections to the asymptotic rates of thermodynamic transformations, the so-called second-order asymptotics (see, e.g., Refs. [21][22][23][24][25][26][27][28] for other recent studies of second-order asymptotics in quantum information). The crucial technical difference between Ref. [20] and our work is the reversed direction of the majorisation relation, resulting in free states being given by uniform states rather than pure states. The second-order corrections were known to scale as 1/√n [12], and our main technical contribution is identifying the exact constant (including its dependence on the error) and interpreting its thermodynamic meaning.
arXiv:1711.01193v2 [quant-ph] 16 Mar 2018
Motivation. Probably the most famous thermodynamic result concerns the irreversible nature of thermodynamic transformations, and is often captured by the oversimplified statement "entropy has to grow". The dynamics of a system interacting with a thermal bath is irreversible since transformations performed at finite speed lead to heat dissipation, resulting in a loss of information about the system. One thus often studies idealised scenarios, when the system undergoes changes so slowly that it stays approximately in thermal equilibrium at all times. In this quasi-static limit one recovers reversible dynamics. However, thermodynamic reversibility actually requires one more assumption that is usually made implicitly. Namely, the thermodynamic description is only valid when applied to systems whose energy fluctuations are much smaller than their average energy. This is true for macroscopic systems composed of n → ∞ particles, in the so-called thermodynamic limit, and is reflected by the reversibility of the interconversion problem in the asymptotic limit. However, for finite n the macroscopic results do not hold anymore, leading to another source of irreversibility.
In the emerging field of quantum thermodynamics (see Ref. [29] and references therein) a focus is placed on possible transformations of small quantum systems interacting with a thermal environment. The necessity to go beyond classical thermodynamics is motivated by the fact that at the nanoscale quantum effects, like coherence [13,14,30-33] and entanglement [34][35][36], start playing an important role. However, beyond these phenomena, in the quantum regime one also deals with systems composed of a finite number n of particles. Hence, thermodynamic transformations of such systems are affected by the effective irreversibility discussed in the previous paragraph. The results we present in this paper provide a mathematical framework to rigorously address this problem. We thus provide a bridge between the extreme case of single-shot thermodynamics with n = 1 and the asymptotic limit of n → ∞, allowing us to study the irreversibility of thermodynamic processes in the intermediate regime of large but finite n.
Main Results. In order to state our main results we first need to introduce some concepts that will be defined more formally in Sections II and III. Let us consider a finite-dimensional quantum system, characterised by its Hamiltonian H, in the presence of a thermal bath at fixed temperature T . The initial state of the system is in general out of thermal equilibrium, and the bath can be governed by an arbitrary Hamiltonian. Energy-conserving operations that interact the system with a bath in thermal equilibrium are then known as thermal operations. A simple example of a thermal operation just swaps the system with a bath (governed by the same Hamiltonian), replacing the initial state with a thermal Gibbs state. Gibbs states for H, denoted by γ, are thus free states in the resource-theoretic formulation of thermodynamics.
In the following we focus on a system comprised of a finite number n of non-interacting subsystems, each governed by the Hamiltonian H. Let us consider a pair of subsystem states ρ and σ that both commute with H. Our results will be expressed in terms of two information quantities: the relative entropy [37] with the Gibbs state, D(·‖γ), and the relative entropy variance [21,22] with the Gibbs state, V(·‖γ). These quantities can also be interpreted thermodynamically: k_B T · D(ρ‖γ) is the difference between the generalised free energies of ρ and γ (with k_B denoting the Boltzmann constant); and V(ρ‖γ) is proportional to a generalised heat capacity of the system. The latter interpretation is justified since for ρ = γ′ being a Gibbs state at temperature T′ ≠ T, the quantity V(γ′‖γ) is proportional to the heat capacity at T′. We also note that D(ρ‖γ) vanishes if and only if ρ = γ, whereas V(ρ‖γ) vanishes whenever ρ is proportional to the Gibbs state on the support of ρ, e.g., when ρ is pure.
Let us now consider the problem of thermodynamic state interconversion between a finite number of instances of a state and for a fixed inverse temperature β of the background bath. Formally, we are looking for the maximal rate R for which there exists a thermal operation $\mathcal{E}^\beta$ such that $\mathcal{E}^\beta(\rho^{\otimes n}) = \tilde{\sigma}$ for some state $\tilde{\sigma}$ on Rn subsystems that is sufficiently close to $\sigma^{\otimes Rn}$. To measure the proximity of two quantum states we will use infidelity, i.e., we require that $F(\sigma^{\otimes Rn}, \tilde{\sigma}) \geq 1 - \epsilon$ for some accuracy parameter ε ∈ (0, 1), where F(·, ·) denotes Uhlmann's fidelity [38]. The maximal conversion rate, denoted by $R^*(n, \epsilon)$, depends on both the number of subsystems n and the accuracy ε. We can assume that neither the initial state ρ nor the target state σ is the thermal state γ, as otherwise the interconversion problem is trivial. We then find the following expansions of $R^*(n, \epsilon)$ in n:
$$R^*(n,\epsilon) \simeq \frac{D(\rho\|\gamma)}{D(\sigma\|\gamma)} \left(1 + \sqrt{\frac{V(\rho\|\gamma)}{n\, D(\rho\|\gamma)^2}}\; Z^{-1}_{1/\nu}(\epsilon)\right), \tag{1a}$$
$$R^*(n,\epsilon) \simeq \frac{D(\rho\|\gamma)}{D(\sigma\|\gamma)} \left(1 + \sqrt{\frac{V(\sigma\|\gamma)}{n\, D(\rho\|\gamma)\, D(\sigma\|\gamma)}}\; Z^{-1}_{\nu}(\epsilon)\right), \tag{1b}$$
where $Z^{-1}_\nu$ is the inverse of the cumulative function of the Rayleigh-normal distribution $Z_\nu$ introduced in Ref. [20] with ν given by
$$\nu = \frac{V(\rho\|\gamma)/D(\rho\|\gamma)}{V(\sigma\|\gamma)/D(\sigma\|\gamma)},$$
and ≃ denotes equality up to terms of order o(1/√n). We note that $Z_0 = \Phi$ is the cumulative normal distribution function and $Z_1$ is the cumulative Rayleigh distribution function. The inverse of the cumulative Rayleigh-normal distribution is typically negative for small values of ε (unless ν = 1), and thus the finite-size correction term that scales as 1/√n is generally negative. For the special case V(ρ‖γ) = V(σ‖γ) = 0 (when ν is undefined) we provide an exact formula for $R^*(n, \epsilon)$, up to all orders in n.
In deriving our results we also prove an important relation between two different notions of approximate majorisation [19]. More precisely, we show that pre- and post-majorisation, which hold when the majorisation relation holds up to the smoothing of the majorising or majorised distribution, respectively, are equivalent. We further extend these concepts to thermomajorisation, which allows us to rigorously address the problem of approximate thermodynamic transformations.
Discussion. One of the main applications of our result is to the study of thermodynamic irreversibility. In the asymptotic limit, n → ∞, the optimal conversion rate $R^*$ from ρ to σ is equal to the inverse of the conversion rate from σ to ρ [12]. We can thus transform $\rho^{\otimes n}$ through $\sigma^{\otimes R^* n}$ back to $\rho^{\otimes n}$, so that the rate of concatenated transformations $R^*_r$ is equal to 1 and the process can be performed reversibly. However, using Eq. (1a) twice, one finds the correction term to the reversibility rate $R^*_r$, which is proportional to 1/√n. Moreover, if ν ≠ 1 this correction term is negative for small errors. In fact, it diverges when the error approaches zero, preventing a perfect reversible cycle. However, pairs of states with equal ratios of relative entropy and relative entropy variance with respect to the thermal state (such that ν = 1) are reversibly interconvertible up to second-order asymptotic corrections, mirroring a recent result in entanglement theory [39]. Thus, ν can be interpreted as the irreversibility parameter that quantifies the amount of infidelity of an approximate cyclic process.
One particular consequence of the discussed irreversibility is the difference between the distillable work, $W_D$, and the work of formation, $W_F$, for a given state ρ [11]. The former is defined as the maximal amount of free energy in the form of pure energy eigenstates ψ that can be obtained per copy of ρ; the latter as the minimal amount of free energy in the form of pure energy eigenstates ψ needed per copy to create the target state ρ. We note that in the special case when ψ is chosen to be the ground state, the distillation process can be considered as Landauer erasure (resetting to zero energy pure states), whereas the formation process can be seen as the action of a Szilard engine (creating states out of information). In single-shot thermodynamics $W_D$ and $W_F$ were shown to be proportional to max- and min-relative entropies with respect to the thermal state [11]; while in the asymptotic scenario they are both equal to W = k_B T · D(ρ‖γ), the non-equilibrium free energy of a state [12]. Here, using appropriately modified Eqs. (1a) and (1b), we show that for large n the values of distillable work and work of formation per particle lie symmetrically around the asymptotic value, $W_D \simeq W - \Delta W$ and $W_F \simeq W + \Delta W$, and provide the exact expression for the gap ΔW. Moreover, in the special case when the investigated state ρ is itself a thermal state at some temperature different from the background temperature, ΔW can be directly related to the relative strength of energy fluctuations of the system. Finally, we also explain how one can investigate the performance of heat engines with finite baths [40][41][42][43] using an appropriately chosen interconversion scenario. This allows us to study finite-size corrections to the efficiency of a heat engine and the quality of work it performs. More precisely, we consider a heat engine operating between an infinite background bath at temperature $T_h$ and a finite colder bath composed of n particles at temperature $T_c$.
In particular, we show that, unless the irreversibility parameter ν = 1, near-perfect work can be performed only with efficiency lower than the Carnot efficiency $\eta_C$. However, permitting imperfect work allows one to achieve and even surpass $\eta_C$ [44]. Moreover, we find that it is possible for a finite bath to have two thermal states at different temperatures, $T_c$ and $T_c'$, such that the irreversibility parameter for them is equal to 1. Thus, in a particularly engineered setting, it is possible to achieve Carnot efficiency and perform perfect work, while the finite bath changes temperature from $T_c$ to $T_c'$.
Overview. The remainder of this paper is organised as follows. We first describe the resource-theoretic approach to thermodynamics in Section II and introduce necessary mathematical concepts used within the paper in Section III. In Section IV we state our main result concerning state interconversion under thermal operations, and discuss its thermodynamic interpretation and possible applications. We then proceed to Section V, where we present auxiliary results concerning approximate majorisation and thermomajorisation, which we believe may be of independent interest. The technical proof of the main result can be found in Section VI. We conclude with an outlook in Section VII.

A. Thermal operations
We begin by describing the resource-theoretic approach to the thermodynamics of finite-dimensional quantum systems in the presence of a single heat bath at temperature T [11,45]. The investigated system is described by a Hamiltonian $H = \sum_i E_i |E_i\rangle\langle E_i|$ and prepared in a general state ρ, whereas the bath, with a Hamiltonian $H_B$, is in a thermal equilibrium state,
$$\gamma_B = \frac{e^{-\beta H_B}}{\mathrm{Tr}\left(e^{-\beta H_B}\right)}, \tag{3}$$
where β = 1/(k_B T) is the inverse temperature with k_B denoting the Boltzmann constant. The evolution of the joint system is assumed to be closed, so that it is described by a unitary operator U, which additionally conserves the total energy,
$$\left[U,\; H \otimes \mathbb{1}_B + \mathbb{1} \otimes H_B\right] = 0. \tag{4}$$
The central question now is: what are the possible final states that a given initial state ρ can be transformed into? More formally, one defines the set of thermal operations [10], which describes the free operations of the resource theory of thermodynamics, i.e., all possible transformations of the system that can be performed without the use of additional resources (beyond the single heat bath). These are defined as follows:
Definition 1 (Thermal operations). Given a fixed inverse temperature β, the set of thermal operations $\{\mathcal{E}^\beta\}$ consists of completely positive trace-preserving (CPTP) maps that act on a system ρ with Hamiltonian H as
$$\mathcal{E}^\beta(\rho) = \mathrm{Tr}_B\left(U\left(\rho \otimes \gamma_B\right)U^\dagger\right),$$
with U satisfying Eq. (4), $\gamma_B$ given by Eq. (3), and $H_B$ being arbitrary.
Note that the energy conservation condition, Eq. (4), can be interpreted as encoding the first law of thermodynamics; whereas the fact that the bath is in thermal equilibrium leads to $\mathcal{E}^\beta(\gamma) = \gamma$, with γ being the thermal Gibbs state of the system (i.e., given by Eq. (3) with $H_B$ replaced by H), thus encoding the second law.

B. Thermodynamic state interconversion
The thermodynamic interconversion problem is stated as follows: given a system (i.e., fixing H) together with initial and target states, ρ and σ, does there exist a thermal operation $\mathcal{E}^\beta$ (for a fixed β) such that $\mathcal{E}^\beta(\rho) = \sigma$? The general answer to this question is not known beyond the simplest qubit case [13,14] (however, we note that the problem has very recently been solved for a larger class of free operations given by generalised thermal processes [16] for coherent state interconversion). Nevertheless, for a restricted problem involving only energy-incoherent states, i.e., ρ and σ commuting with H, the set of necessary and sufficient conditions was found [11]. First, note that within this incoherent subtheory a quantum state can be equivalently represented by a probability distribution. For a non-degenerate Hamiltonian H, the initial and target states, ρ and σ, that commute with H are diagonal in the energy eigenbasis, so we can identify them with probability distributions p and q,
$$p_i = \langle E_i|\rho|E_i\rangle, \qquad q_i = \langle E_i|\sigma|E_i\rangle.$$
For degenerate Hamiltonians we note that unitaries within a degenerate energy subspace are thermal operations, so one can always diagonalise a state within such a subspace for free. Therefore, in the general case the components of p and q representing ρ and σ are simply given by the eigenvalues of ρ and σ. Next, in Ref. [10] (see also Refs. [11,46] for an expanded discussion) the existence of a thermal operation between incoherent states was linked to the existence of a particular stochastic map via the following theorem.
Theorem 2 (Theorem 5 of Ref. [10]). Let ρ and σ be quantum states commuting with the system Hamiltonian H, and γ its thermal equilibrium state. Denote their eigenvalues by p, q and γ, respectively. Then there exists a thermal operation $\mathcal{E}^\beta$ such that $\mathcal{E}^\beta(\rho) = \sigma$ if and only if there exists a stochastic map $\Lambda^\beta$ such that
$$\Lambda^\beta \gamma = \gamma \quad\text{and}\quad \Lambda^\beta p = q.$$
As a result, when studying the thermodynamic interconversion problem between energy-incoherent states, one can replace CPTP maps and density matrices with stochastic matrices and probability vectors. We will fully address this simplified problem in Section III.
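As a small illustration of the stochastic-map picture, the following Python sketch builds one simple Gibbs-preserving stochastic matrix, a partial thermalisation that mixes the identity with full replacement by γ. The three-level spectrum, the value of β, the mixing weight and the helper names (`gibbs`, `partial_thermalisation`, `apply_map`) are hypothetical choices made only for this example; they are not taken from the references.

```python
import math

def gibbs(energies, beta):
    """Thermal distribution gamma_i = exp(-beta * E_i) / Z."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def partial_thermalisation(gamma, lam):
    """Column-stochastic matrix (1 - lam) * Identity + lam * (gamma 1^T).

    With probability lam the input is replaced by gamma,
    otherwise it is left untouched.
    """
    d = len(gamma)
    return [[(1 - lam) * (1.0 if i == j else 0.0) + lam * gamma[i]
             for j in range(d)] for i in range(d)]

def apply_map(matrix, p):
    """Matrix-vector product: (Lambda p)_i = sum_j Lambda_ij p_j."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in matrix]

# Hypothetical three-level system (energies in units where k_B = 1).
energies = [0.0, 1.0, 2.5]
beta = 0.7
gamma = gibbs(energies, beta)

Lam = partial_thermalisation(gamma, 0.3)
p = [0.1, 0.2, 0.7]          # an energy-incoherent input state
q = apply_map(Lam, p)        # the output, here 0.7 * p + 0.3 * gamma

print("gamma      =", gamma)
print("Lam gamma  =", apply_map(Lam, gamma))   # equals gamma: Gibbs-preserving
print("q = Lam p  =", q)
```

Since this matrix is stochastic and has γ as a fixed point, Theorem 2 implies that the transformation from p to q can be realised by a thermal operation on the corresponding energy-incoherent states.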
In this paper we study a particular variant of the general interconversion problem: the limit of asymptotically many copies of input and output states. Informally, we want to find the optimal rate $R^*$ allowing one to transform n copies of an energy-incoherent state ρ into $R^* n$ copies of another energy-incoherent state σ, as n becomes large. Since the dimension of the input and output spaces must be the same, we note that one can append any number of states in thermal equilibrium, $\gamma^{\otimes m}$, to both the initial state $\rho^{\otimes n}$ and the target state $\sigma^{\otimes R^* n}$. Physically, this is motivated by the fact that thermal states are free resources; mathematically, it comes from the fact that the bath Hamiltonian $H_B$ is arbitrary (so, in particular, it may contain m copies of the system, effectively adding $\gamma^{\otimes m}$ to the initial state) and that transforming any copy of the system into γ is a thermal operation (so that the part of the final state beyond $\sigma^{\otimes R^* n}$ can always be replaced by $\gamma^{\otimes m}$). Therefore, we ask for the maximal value of $R^*$ (in the limit n → ∞) for which there exists a thermal operation $\mathcal{E}^\beta$ satisfying
$$\mathcal{E}^\beta\left(\rho^{\otimes n} \otimes \gamma^{\otimes R^* n}\right) \approx \sigma^{\otimes R^* n} \otimes \gamma^{\otimes n},$$
where ≈ denotes closeness in some distance measure, e.g., infidelity or trace norm. As already mentioned in the Introduction, our focus here is on $R^*$ for large but finite n, i.e., we look for corrections of order 1/√n to the optimal conversion rate coming from the finite number of systems involved in the thermodynamic process.
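For orientation, the leading-order (n → ∞) rate recalled in the Introduction, the ratio of the non-equilibrium free energies of the initial and target states, can be evaluated numerically. Since the free energy difference is k_B T · D(·‖γ), the ratio reduces to D(p‖γ)/D(q‖γ). The sketch below assumes a hypothetical qubit with unit energy gap and β = 1; the distributions and function names are illustrative only.

```python
import math

def gibbs(energies, beta):
    """Thermal distribution gamma_i = exp(-beta * E_i) / Z."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def relative_entropy(p, q):
    """D(p||q) = sum_i p_i log(p_i / q_i), with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical qubit: energies 0 and 1, inverse temperature beta = 1.
gamma = gibbs([0.0, 1.0], beta=1.0)

p = [0.1, 0.9]   # initial energy-incoherent state (population-inverted)
q = [0.3, 0.7]   # target energy-incoherent state

# Asymptotic conversion rates in both directions.
rate_forward = relative_entropy(p, gamma) / relative_entropy(q, gamma)
rate_backward = relative_entropy(q, gamma) / relative_entropy(p, gamma)

print("asymptotic rate p -> q:", rate_forward)
print("asymptotic rate q -> p:", rate_backward)
print("product (reversibility):", rate_forward * rate_backward)
```

The product of the two rates equals one, which is the asymptotic reversibility discussed in the Introduction; the 1/√n corrections studied in this paper quantify how this fails at finite n.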

A. Majorisation and embedding
Unless otherwise stated we consider d-dimensional probability distributions and their products. We define the uniform state η and the thermal Gibbs state γ at inverse temperature β as
$$\eta := \frac{1}{d}\,[1, \dots, 1], \qquad \gamma := \frac{1}{Z}\left[e^{-\beta E_1}, \dots, e^{-\beta E_d}\right],$$
with $E_i$ denoting the eigenvalues of H and $Z = \sum_i e^{-\beta E_i}$ being the partition function of the system. Moreover, we call a distribution f flat if all its non-zero entries are equal. Note that in the infinite and zero temperature limits, β → 0 and β → ∞ respectively, the thermal state γ becomes flat. Specifically, in the former case γ → η, and in the latter γ → s := [1, 0, . . . , 0] for Hamiltonians with non-degenerate ground spaces. The most general transformation between two probability distributions is given by a stochastic matrix Λ satisfying $\Lambda_{ij} \geq 0$ and $\sum_i \Lambda_{ij} = 1$. We denote by $\Lambda^\beta$ a Gibbs-preserving stochastic matrix with a thermal fixed point, i.e., $\Lambda^\beta \gamma = \gamma$. In particular, a matrix $\Lambda^0$ that preserves the uniform distribution η is also called bistochastic. A probability vector p is said to majorise q, denoted by p ≻ q, if and only if
$$\sum_{i=1}^{k} p^\downarrow_i \geq \sum_{i=1}^{k} q^\downarrow_i$$
for all k ∈ {1, . . . , d}, with $p^\downarrow$ denoting the probability vector with the entries of p arranged in non-increasing order. We then have the following central result that is used in the study of state interconversion:
Theorem 3 (Theorem II.1.10 of Ref. [47]). There exists a bistochastic matrix mapping from p to q if and only if p ≻ q, i.e.
$$\exists \Lambda^0 : \; \Lambda^0 \eta = \eta \;\text{ and }\; \Lambda^0 p = q \iff p \succ q.$$
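The majorisation condition can be checked directly by comparing partial sums of the sorted entries. The following Python function is a minimal sketch of such a check; the function name, the numerical tolerance and the example vectors are choices made for this illustration.

```python
def majorises(p, q, tol=1e-12):
    """Return True if p majorises q.

    Both vectors are sorted in non-increasing order and every partial
    sum of p must be at least the corresponding partial sum of q
    (up to a small numerical tolerance).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same dimension")
    p_sorted = sorted(p, reverse=True)
    q_sorted = sorted(q, reverse=True)
    sum_p = sum_q = 0.0
    for a, b in zip(p_sorted, q_sorted):
        sum_p += a
        sum_q += b
        if sum_p < sum_q - tol:
            return False
    return True

sharp = [1.0, 0.0, 0.0]
uniform = [1 / 3, 1 / 3, 1 / 3]
a = [0.6, 0.2, 0.2]
b = [0.5, 0.45, 0.05]

print(majorises(sharp, a))     # the sharp state majorises every distribution
print(majorises(a, uniform))   # every distribution majorises the uniform one
print(majorises(a, b), majorises(b, a))   # a and b are incomparable
```

The last line illustrates that majorisation is only a partial order: neither of the two example distributions can be mapped to the other by a bistochastic matrix.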
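Definition 4 (Embedding map). Given a thermal distribution γ with rational entries, $\gamma_i = D_i/D$ and $D_i, D \in \mathbb{N}$, the embedding map $\Gamma^\beta$ sends a d-dimensional probability distribution p to a D-dimensional probability distribution $\hat{p} := \Gamma^\beta(p)$ as follows:
$$\hat{p} = \Big[\underbrace{\tfrac{p_1}{D_1}, \dots, \tfrac{p_1}{D_1}}_{D_1}, \;\dots,\; \underbrace{\tfrac{p_d}{D_d}, \dots, \tfrac{p_d}{D_d}}_{D_d}\Big]. \tag{11}$$
The potentially irrational values of $\gamma_i$ can be approached with arbitrarily high accuracy by choosing D large enough.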
Note that $\hat{\gamma} = \eta_D$, i.e., embedding maps a thermal distribution into a uniform distribution over D entries, and that $\Gamma^\beta$ is injective, implying the existence of a left inverse $(\Gamma^\beta)^{-1}$. The action of $(\Gamma^\beta)^{-1}$ on a D-dimensional vector r is given by summing up all the entries belonging to the same block of $D_i$ entries. Moreover, $\Gamma^\beta (\Gamma^\beta)^{-1}$ is a bistochastic map that transforms each block of $D_i$ entries into a uniform distribution, i.e., given an index j belonging to a block $D_i$, denoted by $j \in [D_i]$, we have
$$\left[\Gamma^\beta (\Gamma^\beta)^{-1} r\right]_j = \frac{1}{D_i} \sum_{k \in [D_i]} r_k.$$
We can also introduce the embedded version of a matrix $\Lambda^\beta$,
$$\hat{\Lambda}^\beta := \Gamma^\beta \Lambda^\beta (\Gamma^\beta)^{-1}.$$
Notice that $\hat{\Lambda}^\beta$ is a bistochastic matrix, as it clearly maps the set of D-dimensional probability distributions into itself, and it preserves the uniform state $\eta_D$,
$$\hat{\Lambda}^\beta \eta_D = \Gamma^\beta \Lambda^\beta (\Gamma^\beta)^{-1} \eta_D = \Gamma^\beta \Lambda^\beta \gamma = \Gamma^\beta \gamma = \eta_D.$$
Using the notion of embedding we can define the thermomajorisation relation [11] (originally introduced in Ref. [48] as d-majorisation). A probability vector p is said to thermomajorise q, denoted by $p \succ^\beta q$, if and only if the majorisation relation holds between the embedded versions of p and q, i.e.,
$$p \succ^\beta q \iff \hat{p} \succ \hat{q}. \tag{15}$$
Note that for β = 0 thermomajorisation becomes standard majorisation, as the embedding map is the identity matrix. Now, we have the following equivalence
$$\Lambda^\beta p = q \iff \hat{\Lambda}^\beta \hat{p} = \hat{q},$$
which, since $\hat{\Lambda}^\beta$ is bistochastic, allows us to use Theorem 3 to obtain the following.

Corollary 5 (Thermodynamic interconversion). There exists a Gibbs-preserving matrix mapping from p to q if and only if $p \succ^\beta q$, i.e.
$$\exists \Lambda^\beta : \; \Lambda^\beta \gamma = \gamma \;\text{ and }\; \Lambda^\beta p = q \iff p \succ^\beta q. \tag{17}$$
Due to Theorem 2, Corollary 5 specifies the necessary and sufficient conditions for energy-incoherent state interconversion under thermal operations. In Fig. 1 we present the chain of equivalence relations leading to this result.

FIG. 1. Interconversion equivalence. Quantum states ρ and σ are energy-incoherent, and their eigenvalues are given by p and q, respectively. The arrow between states (distributions) symbolises the existence of a given map.
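The embedding construction translates into a short exact-arithmetic check of thermomajorisation. The sketch below assumes a hypothetical qubit whose thermal distribution is γ = [3/4, 1/4], so that the block sizes are D_1 = 3 and D_2 = 1 with D = 4; the function names are illustrative and rational arithmetic (`fractions.Fraction`) is used to avoid rounding issues.

```python
from fractions import Fraction

def embed(p, block_sizes):
    """Embedding map: spread each p_i uniformly over a block of D_i entries."""
    embedded = []
    for p_i, d_i in zip(p, block_sizes):
        embedded.extend([Fraction(p_i) / d_i] * d_i)
    return embedded

def majorises(p, q):
    """Exact majorisation check via partial sums of sorted entries."""
    p_sorted = sorted(p, reverse=True)
    q_sorted = sorted(q, reverse=True)
    sum_p = sum_q = Fraction(0)
    for a, b in zip(p_sorted, q_sorted):
        sum_p += a
        sum_q += b
        if sum_p < sum_q:
            return False
    return True

def thermomajorises(p, q, block_sizes):
    """p thermomajorises q iff embed(p) majorises embed(q)."""
    return majorises(embed(p, block_sizes), embed(q, block_sizes))

# Hypothetical qubit with gamma = [3/4, 1/4], i.e. D_1 = 3, D_2 = 1, D = 4.
block_sizes = [3, 1]
gamma = [Fraction(3, 4), Fraction(1, 4)]
ground = [Fraction(1), Fraction(0)]
excited = [Fraction(0), Fraction(1)]

print(embed(gamma, block_sizes))                       # uniform over 4 entries
print(thermomajorises(excited, ground, block_sizes))   # True
print(thermomajorises(ground, excited, block_sizes))   # False
print(thermomajorises(ground, gamma, block_sizes))     # True
```

In this example the excited state can be thermally converted into the ground state but not vice versa, and every state can be converted into γ, in line with γ being the free state.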

B. Information-theoretic notions and their thermodynamic interpretation
The relative entropy or Kullback-Leibler divergence of a probability distribution p with q is defined as
$$D(p\|q) := \sum_i p_i \log\frac{p_i}{q_i},$$
whenever the support of q contains the support of p (otherwise the divergence is set to +∞). Denoting the average of a random variable X in a state p by
$$\langle X \rangle_p := \sum_i p_i X_i,$$
we can introduce a random variable L with
$$L_i := \log\frac{p_i}{q_i},$$
so that the divergence can be interpreted as the expectation value of the log-likelihood ratio, $D(p\|q) = \langle L \rangle_p$. Similarly, we define the corresponding variance, the relative entropy variance, as
$$V(p\|q) := \mathrm{Var}_p(L),$$
where
$$\mathrm{Var}_p(X) := \left\langle \left(X - \langle X \rangle_p\right)^2 \right\rangle_p.$$
The following equalities are an immediate consequence of the embedding map introduced in Eq. (11):
$$D(\hat{p}\|\hat{q}) = D(p\|q) \quad\text{and}\quad V(\hat{p}\|\hat{q}) = V(p\|q). \tag{23}$$
In this work we will mostly encounter these quantities in the special case when q = γ is the thermal distribution corresponding to inverse temperature β,
$$\gamma_i = \frac{e^{-\beta E_i}}{Z}.$$
Then, both D(p‖γ) and V(p‖γ) can be interpreted thermodynamically. First note that
$$D(p\|\gamma) = \beta \langle E \rangle_p - H(p) + \log Z,$$
with $\langle E \rangle_p$ being the average energy and
$$H(p) := -\sum_i p_i \log p_i$$
denoting the Shannon entropy of p (as a function of a distribution p it should not be confused with the Hamiltonian H). Now, recall that the classical expression for free energy reads U − TS, with U being the average energy of the system, T the background temperature and S the thermodynamic entropy; and that the free energy of the thermal state is −k_B T log Z. We thus see that D(p‖γ)/β can be interpreted as a non-equilibrium generalisation of the free energy difference between an incoherent state ρ (represented by a probability distribution p) and a thermal state γ. Now, to interpret V(p‖γ) let us first introduce a covariance matrix M for the log-likelihood log p and energy in the units of temperature βE:
$$M := \begin{pmatrix} \mathrm{Cov}_p(\log p, \log p) & \mathrm{Cov}_p(\log p, \beta E) \\ \mathrm{Cov}_p(\beta E, \log p) & \mathrm{Cov}_p(\beta E, \beta E) \end{pmatrix},$$
where $\mathrm{Cov}_p(X, Y) = \langle XY \rangle_p - \langle X \rangle_p \langle Y \rangle_p$. The relative entropy variance can then be expressed as
$$V(p\|\gamma) = \sum_{i,j} M_{ij}.$$
In the particular case when the distribution p is a thermal distribution γ′ at some different temperature T′ ≠ T, the expression becomes
$$V(\gamma'\|\gamma) = \left(1 - \frac{T'}{T}\right)^2 \frac{c_{T'}}{k_B},$$
where
$$c_{T'} = \frac{\partial \langle E \rangle_{\gamma'}}{\partial T'}$$
is the specific heat capacity of the system in a thermal state at temperature T′.
Finally, let us note that one can define quantum generalisations of both the relative entropy, D(ρ‖σ) [37], and the relative entropy variance, V(ρ‖σ) [21,22]. Moreover, for a quantum state ρ commuting with the Hamiltonian, these relative quantities with respect to the thermal state γ coincide with the classical expressions,
$$D(\rho\|\gamma) = D(p\|\gamma) \quad\text{and}\quad V(\rho\|\gamma) = V(p\|\gamma),$$
where p denotes the vector of eigenvalues of ρ.
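The two thermodynamic interpretations can be verified numerically. The sketch below assumes the standard forms D(p‖γ) = β⟨E⟩_p − H(p) + log Z and V(γ′‖γ) = (1 − T′/T)² c_{T′}/k_B, with the heat capacity estimated by a finite-difference derivative of the mean energy. The three-level spectrum, the temperatures, the units with k_B = 1 and all function names are hypothetical choices for this illustration.

```python
import math

K_B = 1.0  # Boltzmann constant set to 1 (assumed units)

def gibbs(energies, temperature):
    """Return the thermal distribution and the partition function Z."""
    weights = [math.exp(-e / (K_B * temperature)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def mean(values, p):
    """Expectation value <X>_p = sum_i p_i X_i."""
    return sum(pi * v for pi, v in zip(p, values))

def relative_entropy(p, q):
    """D(p||q) with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def relative_entropy_variance(p, q):
    """V(p||q): variance of the log-likelihood ratio under p."""
    d = relative_entropy(p, q)
    return sum(pi * (math.log(pi / qi) - d) ** 2
               for pi, qi in zip(p, q) if pi > 0)

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

energies = [0.0, 1.0, 3.0]       # hypothetical spectrum
T, T_prime = 2.0, 1.0            # background and system temperatures
beta = 1.0 / (K_B * T)
gamma, Z = gibbs(energies, T)

# Check 1: D(p||gamma) = beta <E>_p - H(p) + log Z.
p = [0.2, 0.5, 0.3]
d_direct = relative_entropy(p, gamma)
d_thermo = beta * mean(energies, p) - shannon_entropy(p) + math.log(Z)

# Check 2: V(gamma'||gamma) = (1 - T'/T)^2 c_{T'} / k_B, with the heat
# capacity c_{T'} = d<E>/dT' estimated by a central finite difference.
gamma_prime, _ = gibbs(energies, T_prime)
h = 1e-5
heat_capacity = (mean(energies, gibbs(energies, T_prime + h)[0])
                 - mean(energies, gibbs(energies, T_prime - h)[0])) / (2 * h)
v_direct = relative_entropy_variance(gamma_prime, gamma)
v_thermo = (1 - T_prime / T) ** 2 * heat_capacity / K_B

print(d_direct, d_thermo)
print(v_direct, v_thermo)
```

The sketch also reflects the remarks in the Introduction: D(γ‖γ) = 0, and V(p‖γ) = 0 for a distribution concentrated on a single energy level.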

C. Approximate interconversion
As already mentioned in Section II we will focus on approximate interconversions, allowing the final state $\tilde{q}$ to differ from the target state q, as long as it is close enough. We measure distance between states using the infidelity,
$$\delta(p, q) := 1 - F(p, q),$$
where the fidelity (or Bhattacharyya coefficient) is
$$F(p, q) := \Big(\sum_i \sqrt{p_i q_i}\Big)^2.$$
We will also use the fidelity F between two (continuous) probability density functions f(x) and g(x), defined as
$$F(f, g) := \Big(\int \sqrt{f(x)\, g(x)}\; dx\Big)^2.$$
The two important properties of the infidelity that we will use throughout the paper are as follows. First, since fidelity is non-decreasing under stochastic maps we have
$$\delta(\Lambda p, \Lambda q) \leq \delta(p, q).$$
Second, the distance δ between two probability vectors is the same as between their embedded versions, i.e.,
$$\delta(p, q) = \delta(\hat{p}, \hat{q}),$$
which can be verified by direct calculation. Although we will be mainly concerned with "smoothing" the final distribution (allowing it to differ from the desired target one), it is useful to introduce two dual definitions of approximate majorisation and thermomajorisation.
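Definition 6 (Pre- and post-thermomajorisation). A distribution p ε-pre-thermomajorises a distribution q, which we denote $p \;{}_\epsilon\!\succ^\beta q$, if there exists a $\tilde{p}$ such that
$$\tilde{p} \succ^\beta q \quad\text{and}\quad \delta(p, \tilde{p}) \leq \epsilon.$$
A distribution p ε-post-thermomajorises a distribution q, which we denote $p \succ^\beta_\epsilon q$, if there exists a $\tilde{q}$ such that
$$p \succ^\beta \tilde{q} \quad\text{and}\quad \delta(q, \tilde{q}) \leq \epsilon.$$
In the particular case of β = 0, when thermomajorisation coincides with majorisation, we will speak of pre- and post-majorisation, denoted by ${}_\epsilon\!\succ$ and $\succ_\epsilon$, respectively.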
FIG. 2. Pre- and post-thermomajorisation. Arrows depict the existence of Gibbs-preserving maps between corresponding distributions, whereas ε-circles represent sets of probability distributions whose distance δ from p and from q is smaller than ε.
Let us make a few comments about the above definition. First, notice that due to Corollary 5, $p \;{}_\epsilon\!\succ^\beta q$ means that in the vicinity of p there exists a state $\tilde{p}$ and a map $\Lambda^\beta$ that sends it to q. Similarly, $p \succ^\beta_\epsilon q$ means that there exists $\Lambda^\beta$ that maps p to $\tilde{q}$, which lies in the vicinity of q. We illustrate this in Fig. 2. Next note that both ${}_0\!\succ^\beta$ and $\succ^\beta_0$ reduce to thermomajorisation $\succ^\beta$; specifically, ${}_0\!\succ^0$ and $\succ^0_0$ are equivalent to the standard majorisation relation ≻. Let us also mention that the concept of majorisation between smoothed distributions has been recently studied in Refs. [19,49,50]. Moreover, as with exact thermomajorisation, approximate thermomajorisation specifies the existence of a thermal operation between two energy-incoherent states. More precisely, due to Theorem 2, Corollary 5 and Definition 6, we have the following:
Corollary 7. Let ρ and σ be quantum states commuting with the system Hamiltonian H. Denote their eigenvalues by p and q, respectively. Then there exists a thermal operation $\mathcal{E}^\beta$ such that $\delta(\mathcal{E}^\beta(\rho), \sigma) \leq \epsilon$ if and only if $p \succ^\beta_\epsilon q$.
Finally, let us make an important comment concerning approximate majorisation. Consider two distributions, p and q, such that $p \succ_\epsilon q$. By definition there exists $\tilde{q}$ close to q that is majorised by p. As majorisation is invariant under permutations, p also majorises any distribution $\Pi\tilde{q}$, where Π is an arbitrary permutation. However, the fidelity between q and $\Pi\tilde{q}$ is, in general, permutation-dependent. It is largest when the i-th largest entries of q and $\Pi\tilde{q}$ occupy the same position for all i, in which case it is equal to $F(q^\downarrow, \tilde{q}^\downarrow)$. Therefore, for a given $\tilde{q}$ satisfying some majorisation relation, we know that for every state q there exists $\Pi\tilde{q}$ satisfying the same relation, and with
$$\delta(q, \Pi\tilde{q}) \leq \delta(q, \tilde{q}).$$
Thus, in the context of approximate majorisation, while calculating fidelities between any two states we will assume, without loss of generality, that they are ordered.
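The ordering argument can be checked by brute force on a small example. The sketch below assumes the squared Bhattacharyya form of the fidelity, F(p, q) = (Σ_i √(p_i q_i))², and uses two hypothetical three-dimensional distributions; the conclusion does not depend on whether the fidelity is squared.

```python
import math
from itertools import permutations

def fidelity(p, q):
    """Fidelity in the assumed squared form: (sum_i sqrt(p_i q_i))^2."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q)) ** 2

q = [0.5, 0.1, 0.4]            # hypothetical target distribution
q_tilde = [0.2, 0.45, 0.35]    # hypothetical smoothed distribution

# Largest fidelity over all rearrangements of q_tilde.
best_over_permutations = max(
    fidelity(q, list(perm)) for perm in permutations(q_tilde)
)

# Fidelity between the two distributions sorted in non-increasing order.
ordered_value = fidelity(sorted(q, reverse=True),
                         sorted(q_tilde, reverse=True))

print("unordered        :", fidelity(q, q_tilde))
print("best permutation :", best_over_permutations)
print("ordered          :", ordered_value)
```

The best permutation reproduces the ordered value and is never smaller than the unordered one, which is why fidelities can be computed between ordered distributions without loss of generality.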

D. Asymptotic notation
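As we will be interested in approximating the optimal rate up to terms of order O(1/√n), we will adopt the following asymptotic notation for sequences $\{a_n\}_n$ and $\{b_n\}_n$ in n ∈ N:
$$a_n \simeq b_n \iff a_n - b_n = o(1/\sqrt{n}),$$
$$a_n \lesssim b_n \iff \exists\, f_n = o(1/\sqrt{n}) : \; a_n \leq b_n + f_n,$$
$$a_n \gtrsim b_n \iff \exists\, g_n = o(1/\sqrt{n}) : \; a_n \geq b_n + g_n,$$
where $\{f_n\}_n$ and $\{g_n\}_n$ are auxiliary sequences that we usually do not introduce explicitly.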

E. Rayleigh-normal distributions
The dependence of the finite-size corrections to the optimal interconversion rate on the infidelity is given by generalisations of the Gaussian distribution known as the Rayleigh-normal distributions. This family of distributions was first introduced in Ref. [20] in the context of LOCC entanglement conversion. In order to define it, let us first denote the Gaussian cumulative distribution function, with mean value $\mu$ and variance $\nu$, by $\Phi_{\mu,\nu}$,
$$\Phi_{\mu,\nu}(x) := \frac{1}{\sqrt{2\pi\nu}} \int_{-\infty}^{x} \exp\left(-\frac{(t-\mu)^2}{2\nu}\right) \mathrm{d}t.$$
As a shorthand notation we will also use $\Phi$ to denote $\Phi_{0,1}$. Following Ref. [20] we can now define
Definition 8 (Rayleigh-normal distributions). For any $\nu > 0$ the Rayleigh-normal distribution $Z_\nu$ is a distribution on $\mathbb{R}$, whose cumulative function is given by
$$Z_\nu(\mu) := 1 - \sup_{A \geq \Phi} F\left(A', \Phi'_{\mu,\nu}\right),$$
where the supremum is taken over all monotone increasing and continuously differentiable $A : \mathbb{R} \to [0, 1]$ such that $A \geq \Phi$ pointwise; and $f'(x)$ denotes the derivative of $f(x)$.
We now present some relevant properties of the Rayleigh-normal distributions.
Lemma 9 (Section 2 of Ref. [20]). The Rayleigh-normal distributions have the following properties:
• The $\nu \to 0$ case converges in distribution to the normal Gaussian, $\lim_{\nu \to 0} Z_\nu(\mu) = \Phi(\mu)$.
• The $\nu = 1$ case reduces to the Rayleigh distribution of scale parameter $\sigma = \sqrt{2}$, i.e., $Z_1(\mu) = 1 - e^{-\mu^2/4}$ for $\mu > 0$ and $Z_1(\mu) = 0$ for $\mu \leq 0$.
• The Rayleigh-normal distributions possess a duality under inversion of the parameter $\nu$ of the form $Z_\nu(\mu) = Z_{1/\nu}(\mu/\sqrt{\nu})$. (45)
As well as these properties, an explicit form for the Rayleigh-normal distribution can be given for $\nu > 1$ [20, Theorem 4], in terms of a parameter $\alpha_{\mu,\nu}$ defined as the unique solution of an auxiliary equation [20, Lemma 3]. Using the duality property, Eq. (45), a similar expression can be given for $\nu \in (0, 1)$. We present plots of Rayleigh-normal distributions for a few selected values of $\nu$ in Fig. 3.
FIG. 3. Plots of Rayleigh-normal cumulative probability distributions introduced in Ref. [20]. Note the difference between the above graphs and those presented in Fig. 1 of Ref. [20]. The parameter is chosen in the ranges (a) $\nu \in [0, 1]$ and (b) $\nu \in [1, \infty]$. Due to the duality property, Eq. (45), the plots in panel (b) can be directly related to the ones presented in panel (a).
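The two closed-form cases of Lemma 9 are easy to evaluate numerically. The sketch below assumes the standard Gaussian cumulative distribution function with variance $\nu$ and the standard Rayleigh cumulative distribution function $1 - e^{-x^2/(2\sigma^2)}$; it covers only $\nu = 0$ (understood as the limit) and $\nu = 1$, since general $\nu$ requires the explicit form from Ref. [20], which is not reproduced here. All function names are illustrative.

```python
import math


def gaussian_cdf(x, mu=0.0, nu=1.0):
    """Phi_{mu,nu}(x): Gaussian CDF with mean mu and variance nu > 0."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * nu)))


def rayleigh_cdf(x, sigma):
    """Rayleigh CDF with scale parameter sigma; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x * x / (2.0 * sigma * sigma))


def rayleigh_normal_special(mu, nu):
    """Z_nu(mu) in the two closed-form cases of Lemma 9 (nu -> 0 and nu = 1)."""
    if nu == 0:
        return gaussian_cdf(mu)  # limiting case: standard normal CDF
    if nu == 1:
        return rayleigh_cdf(mu, math.sqrt(2.0))  # Rayleigh with sigma = sqrt(2)
    raise ValueError("this sketch only covers nu = 0 and nu = 1")
```

Evaluating both cases at $\mu = 0$ gives the values $1/2$ and $0$, respectively.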

A. Statement of the main result
We are now ready to state our main result concerning the second-order analysis of the approximate interconversion rates between independent and identically distributed (i.i.d.) states under thermal operations. We focus on initial and target states, $\rho$ and $\sigma$, that commute with the Hamiltonian $H$, so that we can represent them as probability distributions, $p$ and $q$, over their eigenvalues. For two fixed distributions $p$ and $q$ we will be interested in the trade-off between three parameters in the asymptotic $n \to \infty$ regime: the rate of conversion $R$, the infidelity $\epsilon$, and the inverse temperature of the bath $\beta$.⁴ Specifically, we will be interested in the triples $(\beta, \epsilon, R)$ for which there exist Gibbs-preserving maps $\Lambda^\beta$ such that where $\gamma$ denotes the Gibbs state at inverse temperature $\beta$. By Corollary 5 and Definition 6 this condition is equivalent to approximate post-thermomajorisation, and, by Corollary 7, there exists a thermal operation transforming $\rho^{\otimes n}$ into a state $\epsilon$ away in the infidelity measure from $\sigma^{\otimes Rn}$. We then define the optimal interconversion rate $R^*_\beta(n, \epsilon; p, q)$ and the optimal infidelity of interconversion $\epsilon^*_\beta(n, R; p, q)$ as When it is clear from context we will drop the explicit dependence on $p$ and $q$. Our main result is then given by the following theorem.
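Theorem 10 (Second-order asymptotic interconversion rates). Let $\rho$ and $\sigma$ be energy-incoherent initial and target states with eigenvalues given by $p$ and $q$, respectively. Then, for inverse temperature $\beta$ and infidelity $\epsilon \in (0, 1)$, the optimal interconversion rate has the following second-order expansions
$$R^*_\beta(n, \epsilon) \simeq \frac{D(p\|\gamma)}{D(q\|\gamma)} \left(1 + \sqrt{\frac{V(p\|\gamma)}{n\, D(p\|\gamma)^2}}\; Z^{-1}_{1/\nu}(\epsilon)\right), \quad (52a)$$
$$R^*_\beta(n, \epsilon) \simeq \frac{D(p\|\gamma)}{D(q\|\gamma)} \left(1 + \sqrt{\frac{V(q\|\gamma)}{n\, D(p\|\gamma)\, D(q\|\gamma)}}\; Z^{-1}_{\nu}(\epsilon)\right), \quad (52b)$$
where
$$\nu = \frac{V(p\|\gamma)/D(p\|\gamma)}{V(q\|\gamma)/D(q\|\gamma)} \quad (53)$$
is the irreversibility parameter.
⁴ Note that for fixed $H$ the inverse temperature $\beta$ fully specifies the thermal Gibbs distribution $\gamma$.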
The full proof of Theorem 10 can be found in Section VI. Before presenting it, we will discuss some of its consequences and applications in Section IV B, and prove auxiliary results concerning approximate majorisation in Section V. But first, let us make a few technical remarks about the above theorem. Note that Eqs. (52a)-(52b) are simply related by the duality property of the Rayleigh-normal distribution, Eq. (45). The reason to state both formulas is that this way one covers each of the special cases, $V(p\|\gamma) = 0$ and $V(q\|\gamma) = 0$, avoiding the use of $Z^{-1}_{\infty}$, which is undefined. The special case when both relative entropy variances vanish is covered separately in Section VI B, where an exact expression for $R^*(n, \epsilon)$ is provided (the asymptotic expansion of which coincides with the appropriate limit of Eq. (52a)).
Furthermore, since all the involved states are energy-incoherent, one can replace the probability distributions $p$, $q$, $\gamma$ in Theorem 10 with density matrices $\rho$, $\sigma$, $\gamma$. This way one can study interconversion between non-commuting states $\rho$ and $\sigma$, as long as they both commute with $H$. For example, if the Hamiltonian is trivial, $H \propto \mathbb{1}$, $\rho$ and $\sigma$ may be arbitrary states. Thus, Theorem 10 yields a complete second-order analysis of interconversion under noisy operations [51], as for trivial Hamiltonians thermal operations coincide with noisy operations.
Finally, using results originally derived in Ref. [52] (see Appendix B), one can numerically evaluate the optimal interconversion rates. In Appendix C we show that this algorithm can be executed with a runtime that is efficient in the system size. Using this, in Figures 4 and 10 we can compare our second-order expansion to the exact interconversion rates. We find that even for relatively small system sizes, the second-order asymptotic expansion gives a remarkably good approximation to the optimal interconversion rates, especially when compared to the first-order asymptotics.

B. Discussion and applications
Although general state interconversion may seem to be a rather abstract problem, we will now show how the formalism can be applied to study more familiar thermodynamic scenarios. Since asymptotic conversion rates allow for reversible interconversion cycles and the results presented in the previous section describe finite-size corrections to these rates, our considerations will mainly revolve around irreversibility. We will first quantify it directly, by calculating the rate at which $n$ copies of a system can be transformed from initial state $\rho$, through $\sigma$, and back to $\rho$. We will then discuss the gap between work of formation and distillable work that opens when one processes a finite number $n$ of systems. Finally, we will apply our results to study the performance of heat engines in the presence of finite-size baths.
FIG. 4. Comparison between the second-order approximation $R_2$ and exact thermal interconversion rates $R^*$, when converting from $\rho = \frac{7}{10}|0\rangle\langle 0| + \frac{3}{10}|1\rangle\langle 1|$ to $\sigma = \frac{8}{10}|0\rangle\langle 0| + \frac{2}{10}|1\rangle\langle 1|$, with Hamiltonian $H = |1\rangle\langle 1|$ and access to a thermal bath at temperature $1/\beta = 3$. The circles indicate exact conversion rates (cf. Appendix C), and the lines the second-order approximation given by Eq. (1). As the exact interconversion rate is always a multiple of $1/n$, we have also indicated the rounding of the second-order approximation to the nearest multiples of $1/n$ with error bars. The colours indicate the infidelity tolerance, with $\epsilon = 5 \times 10^{-2}$ for red and $\epsilon = 10^{-5}$ for blue. The dotted line indicates the asymptotic interconversion rate $R_1$. We plot the results for $n \leq 20$ in Figure 10.
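For the qubit example used in the comparison figure, the first-order quantities can be computed directly. The sketch below assumes the standard definitions $D(p\|\gamma) = \sum_i p_i \ln(p_i/\gamma_i)$ and $V(p\|\gamma) = \sum_i p_i (\ln(p_i/\gamma_i) - D(p\|\gamma))^2$, the asymptotic rate $R_1 = D(p\|\gamma)/D(q\|\gamma)$, and an irreversibility parameter of the form $\nu = [V(p\|\gamma)/D(p\|\gamma)]/[V(q\|\gamma)/D(q\|\gamma)]$; the last expression is an assumption about the form of Eq. (53). The second-order term itself needs the inverse Rayleigh-normal distribution and is not attempted here.

```python
import math


def gibbs(energies, beta):
    """Thermal Gibbs distribution at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def rel_entropy(p, g):
    """Relative entropy D(p || g) in nats."""
    return sum(a * math.log(a / b) for a, b in zip(p, g) if a > 0)


def rel_entropy_variance(p, g):
    """Relative entropy variance V(p || g)."""
    d = rel_entropy(p, g)
    return sum(a * (math.log(a / b) - d) ** 2 for a, b in zip(p, g) if a > 0)


# Example data taken from the figure caption: H = |1><1|, 1/beta = 3.
energies = [0.0, 1.0]
gamma = gibbs(energies, beta=1.0 / 3.0)
p = [0.7, 0.3]  # eigenvalues of the initial state rho
q = [0.8, 0.2]  # eigenvalues of the target state sigma

d_p, d_q = rel_entropy(p, gamma), rel_entropy(q, gamma)
v_p, v_q = rel_entropy_variance(p, gamma), rel_entropy_variance(q, gamma)

r1 = d_p / d_q                  # asymptotic (first-order) conversion rate
nu = (v_p / d_p) / (v_q / d_q)  # assumed form of the irreversibility parameter
```

Both `r1` and `nu` are ratios, so they do not depend on the base of the logarithm.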

Finite-size reversibility
We start by considering the following thermodynamic process with optimal interconversion rates given by where $\rho$ and $\sigma$ commute with the Hamiltonian and their eigenvalues are given by $p$ and $q$, respectively. Without the second-order asymptotic corrections derived in this work, the reversibility rate $R^*_r := RR'$ is equal to 1, and Eq. (54) describes a perfect cyclic process illustrated in Fig. 5a. However, including finite-size corrections, from Eq. (52a) we get with the irreversibility parameter $\nu$ given by Eq. (53). Now, using the duality of the Rayleigh-normal distribution, Eq. (45), and ignoring the terms of order $o(1/\sqrt{n})$ we obtain The error is accumulated during both transformations appearing in Eq. (54). However, since the infidelity $\delta$ is not a metric, we cannot simply add the errors. Instead, we note that $\sqrt{\delta}$ is a metric, and so it satisfies the triangle inequality. Thus, the total error $\epsilon$, i.e., the infidelity between the final state and the target state $\rho^{\otimes R^*_r n}$, satisfies $\sqrt{\epsilon} \leq \sqrt{\epsilon_1} + \sqrt{\epsilon_2}$. Actually, for $\epsilon_1 + \epsilon_2 < 1$, one can obtain a tighter upper bound [53], Let us now introduce a threshold amount of infidelity,
$$\epsilon_0 := Z_{1/\nu}(0) = Z_{\nu}(0), \quad (59)$$
where the equality comes from the duality of the Rayleigh-normal distribution. Note that, if $\nu = 1$, resulting in $\epsilon_0 = 0$, then for any finite error one can eventually achieve $R^*_r > 1$, and a perfect transformation with $R^*_r = 1$ and arbitrarily small error can be achieved. Thus, pairs of states satisfying $\nu = 1$ are reversibly interconvertible up to second-order asymptotic corrections, analogously to a recent result in entanglement theory [39]. The use of such states in thermodynamic transformations is favourable, as it minimises the dissipation of free energy to the environment.
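The way errors compose under the triangle inequality for $\sqrt{\delta}$ can be illustrated with a short numerical check. This is a sketch under the assumption that the infidelity is $\delta(p, q) = 1 - (\sum_i \sqrt{p_i q_i})^2$; the helper names and the use of random test distributions are illustrative choices. It implements only the simple bound $\sqrt{\epsilon} \leq \sqrt{\epsilon_1} + \sqrt{\epsilon_2}$, not the tighter bound of Ref. [53].

```python
import math
import random


def infidelity(p, q):
    """delta(p, q) = 1 - F(p, q), with F the classical fidelity."""
    f = sum(math.sqrt(a * b) for a, b in zip(p, q)) ** 2
    return max(0.0, 1.0 - f)  # guard against rounding slightly above 1


def composed_error_bound(eps1, eps2):
    """Upper bound on the total infidelity implied by the triangle inequality."""
    return min(1.0, (math.sqrt(eps1) + math.sqrt(eps2)) ** 2)


def random_distribution(d, rng):
    """Random probability vector of dimension d (for testing only)."""
    weights = [rng.random() for _ in range(d)]
    total = sum(weights)
    return [w / total for w in weights]
```

For instance, two steps with infidelities 0.01 and 0.04 give a total bound of $(0.1 + 0.2)^2 = 0.09$, larger than the naive sum 0.05.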
We will now show that the irreversibility parameter $\nu$ quantifies the incompatibility of two states (in that transformation from one state to the other leads to irreversibility) also beyond the special $\nu = 1$ case. Consider a process in which one requires that the number of systems $n$ stays constant at all times. In other words, we require $R = R' = 1$, which implies Now, since $Z_\nu(0) \leq 1/2$, with equality achieved only for $\nu = 0$ and $\nu \to \infty$, the error rates satisfy Eq. (58) and the total error can be bounded by We present the above bound as a function of $\nu$ in Fig. 6a. We see that the closer $\nu$ is to 1, the less error will be induced while performing a thermodynamic transformation $\rho^{\otimes n} \to \sigma^{\otimes n} \to \rho^{\otimes n}$ or, in other words, the more reversible the process will be.
FIG. 5. Finite-size irreversibility. (a) In the asymptotic limit, $n \to \infty$, the optimal conversion rate from $\rho$ to $\sigma$ is equal to the inverse of the conversion rate from $\sigma$ to $\rho$. Therefore, reversible cycles can be performed. (b) In general, finite $n$ corrections to conversion rates for near-perfect interconversion are negative, leading to irreversibility with $RR' < 1$.

Distillable work and work of formation gap
One particularly important consequence of irreversibility is the difference between distillable work and work of formation [11]. These quantify the amount of thermodynamically relevant resources that can be distilled from, or are needed to form, a given state. Similarly to the resource theory of entanglement, where Bell states act as standard units of entanglement resource [4], also within the resource theory of thermodynamics there are states acting as "gold standards" for measuring the amount of resources present in a state. These are given by pure energy eigenstates which, having zero entropy, have a clear energetic interpretation. The transformation requiring a change of an ancillary battery state |w , with energy w, into a state |0 , with zero energy, is thus interpreted as performing work w; and a transformation allowing for an opposite change corresponds to extracting work w. Hence, in order to assess the thermodynamic resourcefulness of n copies of a given energy-incoherent state, ρ ⊗n , we will now investigate how much the energy of a pure battery system has to decrease per copy of ρ to construct ρ ⊗n , and how much can it increase per copy of ρ while transforming ρ ⊗n to a thermal state?
More formally, to calculate work of distillation $W_D$ we want to find the maximal value $w$ allowing for the thermodynamic transformation where the second subsystem is a battery described by a Hamiltonian $H_B = w|w\rangle\langle w|$. Similarly, to calculate work of formation $W_F$ we want to find the minimal value $w$ allowing for the thermodynamic transformation Using Theorem 10 we can obtain the optimal rate for the transformation described by Eq. (62) as a function of $w$, set it to 1 and solve for $w$, thus arriving at the approximate expression for the work of distillation: with $p$ denoting the eigenvalues of $\rho$. One can obtain the expression for the work of formation in an analogous way, this time looking for the optimal rate for the transformation given in Eq. (63), resulting in
First of all, let us briefly comment on the effect that imperfect transformations (characterised by infidelity $\epsilon$) have on the interpretation of distillable work and work of formation derived above. In the case of distillable work the non-zero infidelity means that the final battery state may differ from the pure state $|w\rangle\langle w|^{\otimes n}$, so one may actually distil less than $W_D$ work per particle. Similarly, for the work of formation the final state of the battery may differ from the pure state $|0\rangle\langle 0|^{\otimes n}$, so one may actually use more than $W_F$ work per particle (by using the purity of the battery). To overcome such problems, one may employ the idea of $\epsilon$-deterministic work extraction [54] in the following way. After the distillation process (the argument for the formation process is analogous) one can simply measure the battery in its energy eigenbasis. With probability larger than or equal to $1 - \epsilon$ the battery state will collapse on $|w\rangle\langle w|^{\otimes n}$ (and so $n \cdot W_D$ work will be distilled), and with probability at most $\epsilon$ the work gain will differ from the derived value. Additionally, one also has to subtract the thermodynamic cost of measurement (erasing memory), proportional to the binary entropy of $\epsilon$ (note however that this cost is constant and so the cost per particle vanishes as $1/n$). Crucially, by choosing $\epsilon$ to be arbitrarily small, one can approach deterministic work distillation arbitrarily well, i.e., distil $n \cdot W_D$ work with probability arbitrarily close to 1.
FIG. 6. (a) The upper bound on the total error accumulated in a cyclic process $\rho^{\otimes n} \to \sigma^{\otimes n} \to \rho^{\otimes n}$ as a function of the irreversibility parameter $\nu$. (b) Infidelity generated by a heat engine working at the Carnot efficiency during a process that heats up the finite cold bath from $T_c$ to $T_{c'}$ as a function of the irreversibility parameter $\nu$ (that depends on both $T_c$ and $T_{c'}$, as well as on the hot bath temperature through Eq. (53)). The optimal achievable infidelity during the process is plotted in solid line, while the bound on the infidelity generated during a continuous process (when the finite heat bath evolves through thermal states at all intermediate temperatures) is plotted in dashed line.
FIG. 7. Distillable work and work of formation gap. The behaviour of distillable work $W_D$ and work of formation $W_F$ varies in different regimes. In single-shot scenarios they are proportional to min- and max-relative entropies [11]. In the intermediate regime of large but finite $n$ studied in this work, the values of $W_D$ and $W_F$ lie symmetrically around the value achieved in the asymptotic limit, where $W_D$ and $W_F$ coincide and are equal to the non-equilibrium generalisation of free energy. Note that the y axis above is in the units of $k_B T$.
Secondly, let us note that with the second-order asymptotic correction $W_D$ and $W_F$ lie symmetrically around the asymptotic value $W = k_B T \cdot D(p\|\gamma)$, with Notice that the above correction term is positive for small values of infidelity $\epsilon$, so that the resource cost of near-perfect formation of a state is always larger than the amount of resources that can be distilled from it. This symmetric gap that opens for finite $n$ is illustrated in Fig. 7, where we also compare it with the values of $W_D$ and $W_F$ for the single-shot scenario $n = 1$ (where $W_D$ and $W_F$ generally lie asymmetrically around the asymptotic value $W$). Furthermore, our second-order correction for distillable work exactly coincides with the one derived in Ref. [54] within an alternative thermodynamic framework, where state transformations are modelled by a sequence of energy level transformations (changes of Hamiltonian eigenvalues interpreted as performing/extracting work) and full thermalisations (replacing a state with the thermal state), rather than by thermal operations. This might have been expected, as the recent result [55] showed that any transformation between energy-incoherent states which can be achieved via a thermal operation can also be achieved by a sequence of level transformations and partial level thermalisations.
Finally, let us analyse the special case when the state under scrutiny is itself a thermal equilibrium state $\gamma'$, at some temperature $T'$ different from the background temperature $T$. In the asymptotic limit $n \to \infty$, both distillable work $W_D$ and work of formation $W_F$ coincide with the standard thermodynamic result: the maximal (minimal) amount of work that can be extracted (needs to be invested) while changing the temperature of the system from $T'$ to $T$ (from $T$ to $T'$) is given by its free energy change. However, we also obtain the second-order asymptotic correction to $W_D$ and $W_F$, given by where we have used Eq. (29) to relate the relative entropy variance with $c_{T'}$, the heat capacity of the system at temperature $T'$. In order to interpret this correction term, we first note that standard thermodynamic results apply when fluctuations of energy are much smaller than the average energy of the system. Now, to quantify the relative strength of fluctuations, we introduce a fluctuation parameter $f$ as a ratio of the total energy variation and the total energy itself, We then see that the correction term $\Delta W$ can be expressed as so that $\Delta W$ is directly related to the relative strength of fluctuations $f$, and disappears when the standard thermodynamic assumption, $f = 0$, holds. Note that $\Phi^{-1}(\epsilon)$ is negative for $\epsilon < 1/2$. Moreover, $w$ is the amount of work performed by an engine operating at Carnot efficiency between two heat baths at temperatures $T$ and $T'$, when the amount of heat equal to $\langle E \rangle_{\gamma'}$ flows into, or out of, the bath at temperature $T'$ (the former for $T > T'$, the latter for $T' > T$).
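The link between the relative entropy variance of two thermal states and the heat capacity can be verified numerically. Eq. (29) is not reproduced in this section, so the sketch below assumes it takes the form $V(\gamma'\|\gamma) = (c_{T'}/k_B)(1 - T'/T)^2$. This follows from $\ln(\gamma'_i/\gamma_i) = -(1/k_B T' - 1/k_B T) E_i + \mathrm{const}$ together with $c_{T'} = \mathrm{Var}_{\gamma'}(E)/(k_B T'^2)$. Units with $k_B = 1$ and the example spectrum are illustrative assumptions.

```python
import math


def gibbs(energies, temperature):
    """Thermal distribution at the given temperature (units with k_B = 1)."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def heat_capacity(energies, temperature):
    """c_T = Var(E) / T^2 for a thermal state (k_B = 1)."""
    probs = gibbs(energies, temperature)
    mean = sum(p * e for p, e in zip(probs, energies))
    var = sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))
    return var / temperature ** 2


def rel_entropy_variance(p, g):
    """V(p || g) = sum_i p_i (ln(p_i / g_i) - D(p || g))^2."""
    logs = [math.log(a / b) for a, b in zip(p, g)]
    d = sum(a * l for a, l in zip(p, logs))
    return sum(a * (l - d) ** 2 for a, l in zip(p, logs))


# Hypothetical four-level spectrum; T' is the system temperature, T the background.
energies = [0.0, 0.3, 1.0, 1.7]
t_prime, t = 0.8, 2.0

direct = rel_entropy_variance(gibbs(energies, t_prime), gibbs(energies, t))
via_heat_capacity = heat_capacity(energies, t_prime) * (1.0 - t_prime / t) ** 2
```

The two quantities `direct` and `via_heat_capacity` agree up to rounding for any spectrum and any pair of temperatures.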

Corrections to efficiency of heat engines
One of the consequences of studying thermodynamics in the quantum regime is that it may not always be plausible to assume that thermal reservoirs are infinite. Thus, recent studies focus on the effects finite-size baths have on standard thermodynamic results like fluctuation theorems [40], Landauer's principle [41], second [43] and third law of thermodynamics [42]. Here, we will show how our results can be employed to investigate the performance of heat engines in the presence of finite baths [56,57]. This will be achieved by studying the appropriately chosen interconversion problem. As we will discuss systems in equilibrium at different temperatures, we will indicate the (inverse) temperature in the subscript. More precisely, a system at temperature T x (at inverse temperature β x ) will be denoted by γ x , and the corresponding partition function by Z x . Also, note that equilibrium states are diagonal in the energy eigenbasis, so our results are applicable.
We consider an infinite hot bath at temperature T h , and a finite bath composed of n particles at a colder temperature T c < T h (analogous considerations hold for the infinite bath being colder than the finite one). As in the previous subsection, we also include a battery system comprised of n two-level systems, each described by Hamiltonian H B = w |w w|, initially in a zero energy eigenstate |0 0| ⊗n . We now couple the cold bath (which can also be considered as a finite-size working body at temperature T c ) and the battery to the hot bath, allowing us to perform a thermal operation with respect to temperature T h . In particular, we consider the following transformation This transformation can be understood as a result of heat Q in flowing from the hotter background bath into an engine; part of it, Q out , then heats up the finite bath composed of n particles from T c to T c , while the remaining energy is used to perform work n · w on n particles comprising the battery. We schematically present this thermodynamic process in Fig. 8. The heat Q out flowing into the cold bath is given by the change of the finite bath energy, while the optimal amount of performed work W = n · w can be calculated similarly as in the previous subsection (by setting the rate from Eq. (52a) for the transformation given by Eq. (71) to 1 and solving for w), yielding where we have introduced the following shorthand notation and ν = V (γ c ||γ h )/V (γ c ||γ h ). Now, using energy conservation, we can calculate the efficiency of the considered process to be with Q out and W given by Eqs. (72) and (73), respectively.
In order to interpret the obtained expression let us first analyse the limiting case. Ignoring the second-order asymptotic correction (sending n → ∞), the extracted work is just equal to the change of the free energy of the finite bath. In Appendix A we show that this is exactly the amount of work that would be extracted by an engine operating at Carnot efficiency, between an infinite bath at fixed temperature T h and a colder finite bath that heats up during the process from T x = T c to T x = T c . In other words, without the 1/ √ n correction we obtain an integrated Carnot efficiency η int C that arises from an instantaneous Carnot efficiency η C at all times, The relation becomes even more evident when we con-sider the limit ∆T → 0. Then with c Tc denoting the heat capacity of the system at temperature T c , and so Now, the finite-size correction leads to a modified expression for integrated efficiency, given by where we have used Eq. (29) again to relate the relative entropy variance with the heat capacity of the system. We first note that there exists a threshold amount of infidelity 0 , given by Eq. (59), below which the correction term is negative. Since the infidelity between final and target states can be interpreted as performing imperfect work, near-perfect work can be performed only with efficiency strictly smaller than η int C . On the other hand, accepting infidelity ≥ 0 allows one to achieve and even go beyond the integrated efficiency corresponding to instantaneous Carnot efficiency. This is in accordance with a recent result showing that the Carnot efficiency can be surpassed by extracting imperfect work [44].
As in the asymptotic limit, we also want to investigate the instantaneous efficiency, when T c is very close to T c . In particular, we will focus on the quality of performed work when the engine works at instantaneous Carnot efficiency. We thus require that the error ∆ accumulated during an infinitesimal step that changes the temperature by ∆T → 0 is equal to the threshold error 0 . Since then the correction term vanishes, we have η int ≈ η int C and we know that for small ∆T this yields η C . Because the two considered thermal states are close, ν is close to unity, and as such the infidelity of the process can be expanded as where α ≈ 0.0545 can be numerically evaluated. The expansion of ∆ν in terms of ∆T is given by with As discussed in Section IV B 1, it is not infidelity, but its square root that satisfies the triangle inequality. We thus have that the instantaneous rate of accumulating square root infidelity is given by and so one can achieve the instantaneous Carnot efficiency by paying the price of an instantaneous rate of accumulating error. This can be then translated into the bound on the total accumulated error in the following way It is straightforward to show that this upper bound is larger than α log 2 ν, which in turn is larger than Z ν (0). This shows that the error accumulated in a continuous process (with the finite bath continuously passing through all intermediate temperatures) is in general larger than that of an optimal "one-step" process. We illustrate this in Fig. 6b.
Finally, let us comment on a special case when $\nu = 1$. For initial and target states being thermal equilibrium states at distinct temperatures (and different from the background temperature $T_h$), the value of $\nu$ depends on the Hamiltonian of the investigated system. If it happens that for a given Hamiltonian there exist $T_x$ and $T_{x'}$ such that $\nu = 1$, then it is possible to achieve perfect work extraction at integrated Carnot efficiency $\eta^{\mathrm{int}}_C(T_x \to T_{x'})$. Interestingly, for any Hamiltonian there always exist such pairs of temperatures. To see this note that for both $T_x = 0$ and $T_x = T_h$ the relative entropy variance vanishes, $V(\gamma_x\|\gamma_h) = 0$. Since it is a continuous function of temperature, we get that for any $T_x$ in the interval $(0, T_h)$ there exists at least one other temperature $T_{x'}$ such that $V(\gamma_x\|\gamma_h) = V(\gamma_{x'}\|\gamma_h)$, resulting in $\nu = 1$. This shows that by appropriately choosing the temperatures between which the heat engine operates, one may decrease or even avoid irreversible losses.
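The continuity argument above can be turned into a small numerical search. The sketch below assumes units with $k_B = 1$, a spectrum with non-negative energies, and a relative entropy variance $V(\gamma_x\|\gamma_h)$ with a single peak on $(0, T_h)$, so that bisection on the far side of the peak finds the partner temperature. The single-peak assumption holds for the two-level example used here but is not claimed in general. The function names and parameters are hypothetical.

```python
import math


def rel_entropy_variance_thermal(energies, t_x, t_h):
    """V(gamma_x || gamma_h) = (1/t_x - 1/t_h)^2 * Var_{gamma_x}(E), with k_B = 1."""
    weights = [math.exp(-e / t_x) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean = sum(p * e for p, e in zip(probs, energies))
    var = sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))
    return (1.0 / t_x - 1.0 / t_h) ** 2 * var


def partner_temperature(energies, t_x, t_h, grid=2000, iters=100):
    """Find t' != t_x in (0, t_h) with the same V, assuming V has a single peak."""
    def v(t):
        return rel_entropy_variance_thermal(energies, t, t_h)

    target = v(t_x)
    ts = [t_h * (k + 0.5) / grid for k in range(grid)]
    t_peak = max(ts, key=v)  # coarse location of the maximum of V
    # Bisect on the side of the peak opposite to t_x, keeping
    # v(below) <= target <= v(above) throughout.
    if t_x < t_peak:
        below, above = t_h, t_peak
    else:
        below, above = ts[0], t_peak
    for _ in range(iters):
        mid = 0.5 * (below + above)
        if v(mid) < target:
            below = mid
        else:
            above = mid
    return 0.5 * (below + above)
```

For a two-level system with unit gap and $T_h = 1$, calling `partner_temperature([0.0, 1.0], 0.2, 1.0)` returns a second temperature on the other side of the peak at which the relative entropy variance matches its value at $T_x = 0.2$.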

V. RESULTS ON APPROXIMATE MAJORISATION
We now proceed to the presentation of a few technical lemmas that may be of independent interest. These concern relations between different notions of approximate majorisation and thermomajorisation introduced in Section III C. We first need the following auxiliary result.
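Lemma 11. For fixed probability vectors $p$ and $q$ denote by $\tilde{p}$ any distribution that majorises $q$, and by $\tilde{q}$ any distribution that is majorised by $p$. Then the maximum fidelity between $\tilde{p}$ and $p$ over all such $\tilde{p}$ is equal to the maximum fidelity between $q$ and $\tilde{q}$ over all such $\tilde{q}$, i.e.,
$$\max_{\tilde{p}:\, \tilde{p} \succ q} F(p, \tilde{p}) = \max_{\tilde{q}:\, p \succ \tilde{q}} F(q, \tilde{q}). \quad (87)$$
The proof of the above lemma is based on the results first derived in Ref. [52] and can be found in Appendix B. Moreover, the proof includes an explicit construction of the state $\tilde{p}$ maximising the left hand side of Eq. (87), so that one can calculate the value of the optimal achievable fidelities appearing in Lemma 11. Now we can prove the following crucial result concerning pre- and post-majorisation, i.e., approximate thermomajorisation for $\beta = 0$.
Lemma 12. Pre- and post-majorisation are equivalent, i.e., $p$ $\epsilon$-pre-majorises $q$ if and only if $p$ $\epsilon$-post-majorises $q$.
Proof. First, assume that $p$ $\epsilon$-pre-majorises $q$. This means that there exists $\tilde{p}$ such that $\tilde{p} \succ q$ and $\delta(p, \tilde{p}) \leq \epsilon$. By Theorem 3 this implies that there exists a bistochastic $\Lambda^0$ such that $\Lambda^0 \tilde{p} = q$. Let $\tilde{q} := \Lambda^0 p$, so that $p \succ \tilde{q}$. Using the fact that fidelity is non-decreasing under stochastic maps, we then have
$$\delta(q, \tilde{q}) = \delta(\Lambda^0 \tilde{p}, \Lambda^0 p) \leq \delta(\tilde{p}, p) \leq \epsilon,$$
which means that pre-majorisation implies post-majorisation. Now, assume that $p$ $\epsilon$-post-majorises $q$. This means that there exists $\tilde{q}$ such that $p \succ \tilde{q}$ and $\delta(q, \tilde{q}) \leq \epsilon$. Let
$$\tilde{p}^* := \arg\max_{\tilde{p}:\, \tilde{p} \succ q} F(p, \tilde{p}).$$
By definition $\tilde{p}^* \succ q$ and, by Lemma 11, we have
$$F(p, \tilde{p}^*) = \max_{\tilde{q}':\, p \succ \tilde{q}'} F(q, \tilde{q}') \geq F(q, \tilde{q}) \geq 1 - \epsilon,$$
so that $\delta(p, \tilde{p}^*) \leq \epsilon$. Thus post-majorisation implies pre-majorisation.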
The next lemma links post-majorisation of embedded vectors with post-thermomajorisation for β = 0.
Lemma 13. Post-majorisation between embedded vectors is equivalent to post-thermomajorisation between the original vectors, i.e., Proof. First, assumep q. This means that there exists a bistochastic matrix Λ 0 such that Λ 0p =q with δ(q,q) ≤ . This, in turn, means that with Λ β = Γ −1 Λ 0 Γ being a Gibbs-preserving matrix and Γ a shorthand notation of the embedding map Γ β . We thus conclude that p β Γ −1q . It remains to show that δ(q, Γ −1q ) ≤ . To achieve this we use the facts that embedding is fidelity-preserving, ΓΓ −1 is bistochastic and fidelity is non-decreasing under stochastic maps, so that Therefore δ(q, Γ −1q ) ≤ , which results in p β q. Now, assume that p β q. This means that there exists a Gibbs-preserving matrix Λ β such that Λ β p =q with δ(q,q) ≤ . Through embedding this is equivalent to the existence of a bistochasticΛ β such that Λ βp =q, resulting inp q . It remains to show that δ(q,q) ≤ . This, however, follows directly from the fact that the embedding map is fidelity-preserving, as δ(q,q) = δ(q,q) ≤ . We thus conclude thatp q.p Relations between approximate pre-and postthermomajorisation relations.
The statement of Lemma 13 can be rephrased as ∃q :p q , δ(q,q) ≤ ⇐⇒ ∃q :p q , δ(q,q) ≤ , so that it can be interpreted as the fact that embedding (denoted by hat) and smoothing (denoted by tilde) commute when applied to the target state. Finally, we present a result that links pre-majorisation of embedded vectors with pre-thermomajorisation for β = 0.
The results of this section are collectively presented in Fig. 9. We also would like to make a couple of remarks. First, using the equivalence between $\epsilon$-post-majorisation of the embedded vectors $\hat{p}$ and $\hat{q}$ and $\epsilon$-post-thermomajorisation of $p$ and $q$ one can, in principle, calculate the optimal fidelity (equivalently: minimal distance $\delta$) between the final and target state under thermodynamic interconversion. More precisely, given initial distribution $p$ and final $q$, the optimal fidelity $F(q, \tilde{q})$ among $\tilde{q}$ that are thermomajorised by $p$ is equal to the optimal fidelity $F(\hat{q}, \tilde{q})$ among $\tilde{q}$ that are majorised by $\hat{p}$. This in turn, via Lemma 11, is equal to the optimal fidelity $F(\hat{p}, \tilde{p})$ among $\tilde{p}$ that majorise $\hat{q}$. But such an optimal state $\tilde{p}^*$ is given by the explicit construction presented in Appendix B, and thus we can directly calculate
$$\max_{\tilde{q}:\, p \text{ thermomajorises } \tilde{q}} F(q, \tilde{q}) = F(\hat{p}, \tilde{p}^*). \quad (98)$$
In Appendix C we discuss applying these very concepts to numerically compare our approximations of the optimal conversion rate to the true optimum for small system sizes. We give examples of such numerics in Figures 4 and 10. Second, we want to point out that Lemmas 12 through 14 still hold if one applies them to the concept of approximate thermomajorisation based on total variation distance, i.e., if one replaces $\delta(p, q)$ with $\frac{1}{2}\|p - q\|$ in Definition 6. The required modifications of the proofs are rather straightforward (with the exception of Lemma 12, which requires more care), and we discuss them in Appendix D.
FIG. 10. Comparison between the second-order approximation $R_2$ and exact thermal interconversion rates $R^*$ for small system sizes, converting from $\rho = \frac{7}{10}|0\rangle\langle 0| + \frac{3}{10}|1\rangle\langle 1|$ to $\sigma = \frac{8}{10}|0\rangle\langle 0| + \frac{2}{10}|1\rangle\langle 1|$, with Hamiltonian $H = |1\rangle\langle 1|$ and access to a thermal bath at temperature $1/\beta = 3$, as in Figure 4. The circles indicate the number of states produced (cf. Appendix C), and the lines those given by the second-order expansion from Eq. (1). As the exact number of states produced is always an integer, we have also indicated the rounding of the second-order approximation both up and down with error bars. The colours indicate the infidelity tolerance, with $\epsilon = 5 \times 10^{-2}$ for red and $\epsilon = 10^{-5}$ for blue. The dotted line indicates the number of produced states predicted by the asymptotic interconversion rate $R_1$.

VI. PROOFS OF THE MAIN RESULT
We will now present a proof of our main result, Theorem 10. We will do this by first showing a reduction to the special case of bistochastic interconversion, which corresponds to infinite temperature. We recall that as we are considering energy-incoherent initial and target states, $\rho$ and $\sigma$, we only need to consider their eigenvalues, denoted by $p$ and $q$, with the embedded versions of these given by $\hat{p}$ and $\hat{q}$. Also note that the embedded thermal state $\hat{\gamma}$ simply corresponds to the uniform state $\eta$. We can thus use the equivalence between approximate post-thermomajorisation and embedded majorisation, Lemma 13, to obtain:
$$R^*_\beta(n, \epsilon; p, q) = R^*_0(n, \epsilon; \hat{p}, \hat{q}), \quad (99a)$$
$$\epsilon^*_\beta(n, R; p, q) = \epsilon^*_0(n, R; \hat{p}, \hat{q}). \quad (99b)$$
We will split the infinite-temperature proof into four parts, based on whether the relative entropy variances of the initial and target states are non-zero.
We start with the case where both initial and target states have zero relative entropy variance. We refer to this as flat-to-flat interconversion, since both the initial and target state are flat. Recalling that states with embedded distributions being flat are proportional to the thermal state on their support, we note that this case contains the conversion between sharp energy states as a special case. We will then consider the cases of distillation and formation, in which either the target or initial states are flat respectively. These are so-named because they contain both the distillation of, and formation from, sharp energy states. Finally we will consider the general interconversion problem, in which neither state is flat. We refer to the non-flat distribution case as general, because it in fact implies the three other results by using the limiting behaviours of the Rayleigh-normal distributions given in Eq. (43).

A. Central limit theorem
Before we present our proofs, we first formulate the main mathematical tool needed for such a small deviation analysis: a central limit theorem. Specifically, we want to give tail bounds on i.i.d. product distributions. Considering the standard central limit theorem, one can derive the following tail bound. For completeness we provide the proof of the above known result in Appendix E. We will also rely on an alternate form of central limit theorem.
We now want to convert the above result into the specific form of a bound we will use. For some rate R(n), we define the total initial and target states as We note that generally R depends on n, but will henceforth omit this explicit dependence. We also introduce a quantity analogous to $k_n(x)$, that will be crucial in all our proofs. The central limit theorem for these distributions is given by the following result.
Lemma 17 (Central limit theorem for $P^n$ and $Q^n$). If V(q) > 0 and R is bounded away from zero, then $Q^n$ has the tail bound Moreover, if we consider a rate of the form for some µ ∈ ℝ, then $P^n$ also has a corresponding tail bound with ν given by Eq. (101).
Proof. We start by noticing that Lemma 16 remains true if the product distribution is "smeared out". Specifically, for any flat state f we have where Using this, if we make the substitutions $a^{\otimes n} \leftarrow q^{\otimes Rn}$ and $f \leftarrow \eta^{\otimes n}$, recalling that R is bounded away from zero, we arrive at Applying the same argument to $P^n$ gives where $K_{P^n}(y) := \exp\left(H(P^n) + y\sqrt{n}\right)$. Note that we did not include the variance in the definition of $K_{P^n}$ (as we did in $K_n$), because we have not assumed V(p) > 0. Indeed, if we interpret Φ(y/V(p)) as the cumulative distribution function of the zero-mean Dirac distribution for V(p) = 0, then this also holds for V(p) = 0.
We now want to express Eq. (112) as a summation up to $K_n(x)$ for some x.
Noticing that $R_\mu = D(p)/D(q) + o(1)$, our choice of rate $R_\mu$ can be rearranged to give Therefore, $K_{P^n}(y) = K_n(x)$ is equivalent to Finally, using the continuity of Φ gives the desired tail bound Recalling that ν is proportional to V(p), we note that all of the above expressions are still well-defined if V(p) = 0, where we understand $\Phi_{\mu,0}$ to be the cumulative distribution function of the Dirac distribution with mean µ.

B. Bistochastic flat-to-flat
For the case of flat-to-flat conversion we can in fact give an exact, single-shot expression.
Proposition 18 (Bistochastic flat-to-flat). For any initial state p and target state q such that V(p) = V(q) = 0, and infidelity ε ∈ [0, 1), the optimal interconversion rate is given by Before proving this, we first consider the optimal majorising distribution in the case where all distributions involved are flat.
Lemma 19 (Single-shot flat-to-flat). If we let a and b be distributions such that V(a) = V(b) = 0, then Proof. First, from Lemma 12 we know that Now, the right hand side of the above equation is minimised by a state ã, whose explicit construction (found in Ref. [52]) we present in Appendix B while proving Lemma 11. Using it one finds that Since a and b are flat, in the first case we have which leads to Eq. (117).
We can now apply this to give an exact expression for the optimal rate.
Proof of Proposition 18. Applying Lemma 19 to $a = P^n$ and $b = Q^n$ we find that $\epsilon^*_0(n, R)$ vanishes for any R ≤ D(p)/D(q), and for any R ≥ D(p)/D(q). Converting the expression for the optimal infidelity into an expression for the optimal rate gives Proposition 18 as required.

C. Bistochastic distillation
We now consider distillation, in which the target state is flat.
Proposition 20 (Bistochastic distillation). For any initial state p and target state q such that V(q) = 0, and infidelity ε ∈ (0, 1), the optimal interconversion rate has the second-order expansion The proof of Proposition 20 is relatively straightforward, and similar to the flat-to-flat case. Indeed, as in the flat-to-flat case, we will start with a single-shot result in terms of tail bounds.
Proof. First, from Lemma 12 we know that Now, to find a state minimising the right hand side of the above equation, consider a distribution ã such that $\tilde{a} \succ b$. Since b is flat, this is equivalent to the statement that ã has a support which is no larger than that of b. This condition is clearly necessary; it is sufficient as any distribution with d or fewer non-zero entries majorises the flat distribution over d entries. Using the Schwarz inequality one can then show that the distribution ã which contains at most exp H(b) non-zero elements, and is closest to a, is simply the truncated-and-rescaled distribution, It is a straightforward calculation to show that the infidelity of such a smoothed state is given by the mass of the truncated tail We can now use Lemma 17 to bound this tail, giving a second-order expansion for the asymptotic case.
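The truncate-and-rescale step can be checked numerically. The following is a minimal sketch, assuming the fidelity is $F(a, b) = (\sum_i \sqrt{a_i b_i})^2$ and the infidelity is 1 − F; the function names and the example vectors are hypothetical and chosen only for illustration.

```python
import numpy as np

def fidelity(a, b):
    """Classical fidelity, assumed to be F(a, b) = (sum_i sqrt(a_i * b_i))**2."""
    return float(np.sum(np.sqrt(np.asarray(a) * np.asarray(b))) ** 2)

def truncate_and_rescale(a, support_size):
    """Sort a non-increasingly, keep its `support_size` largest entries, renormalise."""
    a_sorted = np.sort(np.asarray(a, dtype=float))[::-1]
    kept = a_sorted.copy()
    kept[support_size:] = 0.0          # drop the tail
    return a_sorted, kept / kept.sum()  # rescale the surviving entries

# Hypothetical example: a generic source and a flat target over 3 entries,
# so the majorising distribution may have at most 3 non-zero entries.
a = [0.4, 0.3, 0.15, 0.1, 0.05]
a_sorted, a_tilde = truncate_and_rescale(a, support_size=3)

infidelity = 1.0 - fidelity(a_sorted, a_tilde)
tail_mass = float(a_sorted[3:].sum())   # mass of the truncated tail
```

In this example the infidelity and the tail mass agree (both are 0.15 up to rounding), as the calculation described above suggests. Applied to a flat source over 8 entries and a flat target over 2 entries, the same routine gives an infidelity of 1 − 2/8.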
Proof of Proposition 20. Applying Lemma 21 to $a = P^n$ and $b = Q^n$ we find that the optimal infidelity $\epsilon^*_0(n, R)$ is equal to the mass of the tail of $P^{n\downarrow}$ lying beyond its $\exp H(Q^n)$ largest entries. Now consider a rate of the form for some µ ∈ ℝ. We then have and so if we apply the first bound of Lemma 17 (with $P^n$ in place of $Q^n$), we arrive at Reversing the relationship between infidelity and rate, this implies as required.

D. Bistochastic formation
For our final special case, we consider flat initial states.

Proposition 22 (Bistochastic formation). For any initial state p and target state q such that V(p) = 0, and infidelity ε ∈ (0, 1), the optimal interconversion rate has the second-order expansion This proof is more involved than distillation, and will involve some of the techniques used in the proof of the general interconversion problem. As with distillation, we will attempt to bound the rate by bounding the optimal infidelity between the total initial state $P^n$ and a state $\tilde{P}^n$ that majorises the total target state $Q^n$. We will thus fix our rate $R_\mu(n)$ to be given by Eq. (107) from Lemma 17, and look for bounds on the infidelity between $\tilde{P}^n$ and $P^n$. More precisely, our proof is split into two parts: achievability (upper bound on optimal error/lower bound on optimal rate) and optimality (lower bound on optimal error/upper bound on optimal rate).

Achievability
Sketch of construction. The general idea here is to construct a distribution $\tilde{P}^n$ which is close to the total initial state $P^n$ and majorises the total target state $Q^n$. By the equivalence of pre- and post-majorisation, Lemma 12, this will prove that there exists a distribution $\tilde{Q}^n$ that is majorised by the total initial state $P^n$ and is close to the total target state $Q^n$. We will start by defining two bins (sets) of indices, B and B′. We will then construct a scaled distribution $S^n$ such that for indices belonging to B it has the same shape as $P^n$ (i.e., it is flat), but has as much mass as $Q^n$ has over the indices belonging to B′. The first property will guarantee that $S^n$ lies close to $P^n$, and the second that it lies close to a majorising distribution $\tilde{P}^n \succ Q^n$. We will then analyse $\delta(P^n, \tilde{P}^n)$, giving an upper bound on the optimal error.
Binning. For some small ζ > 0, define two bins of indices where ∞ is shorthand for the largest index, and µ fixes the value of the rate $R_\mu(n)$. We will consider B as a bin on the indices of $P^n$ and B′ on those of $Q^n$. We denote the complements of these bins as $\bar{B}$ and $\bar{B}'$, respectively.
Scaled distribution $S^n$. For any j ∈ B define and $S^n_l := 1 - \sum_{k \in B'} Q^{n\downarrow}_k$ for some arbitrary $l \notin B$ such that $S^n$ is normalised.
By construction the mass of $S^n$ on B is equal to that of $Q^n$ on B′, i.e. $\sum_{j \in B} S^n_j = \sum_{k \in B'} Q^{n\downarrow}_k$.
We now want to show that this implies the existence of a nearby majorising distribution $\tilde{P}^n \succ Q^n$.
Majorising distribution $\tilde{P}^n$. Instead of giving an explicit construction of $\tilde{P}^n$, we present an existence proof. Specifically we will leverage the following lemma:
Lemma 23. For non-negative vectors a and b such that $\sum_i a_i = \sum_i b_i$, there exists a vector ã such that $\tilde{a} \succ b$ and $\|\tilde{a} - a\|_\infty \le \|b\|_\infty$.
Proof. Consider a function over indices, f : ℕ → ℕ, and the vector ã given by the following action of f on b, $\tilde{a}_i := \sum_{j : f(j) = i} b_j$. Clearly such a mapping can only concentrate a distribution, and so $\tilde{a} \succ b$. Now, among ã of the above form we choose that which is closest to a in $l_\infty$-norm. Let i be an index at which the $l_\infty$-norm of ã − a, denoted by ∆, is achieved, $|\tilde{a}_i - a_i| = \|\tilde{a} - a\|_\infty =: \Delta$. We are going to assume that $\Delta > \|b\|_\infty$ and show that this would imply that ã cannot be optimal, proving $\|\tilde{a} - a\|_\infty \le \|b\|_\infty$ by way of contradiction. There are two cases to consider: $\tilde{a}_i > a_i$ and $\tilde{a}_i < a_i$. We start with $\tilde{a}_i > a_i$. As $\tilde{a}_i > 0$, there must exist some α such that f(α) = i and $b_\alpha > 0$. As $a \neq \tilde{a}$ and $\sum_k a_k = \sum_k \tilde{a}_k$, there must exist a k such that $\tilde{a}_k < a_k$. Consider changing the map from f(α) = i to f(α) = k ≠ i. This has the effect of lowering $\tilde{a}_i$ by $b_\alpha$ and raising $\tilde{a}_k$ by the same amount. Given that $b_\alpha < \Delta$ by assumption, this means that $\tilde{a}_i - a_i$ changes from ∆ to a value in (0, ∆), and $\tilde{a}_k - a_k$ changes from a value in [−∆, 0) to a value in (−∆, ∆). As such, $|\tilde{a}_i - a_i|$ and $|\tilde{a}_k - a_k|$ are now both strictly smaller than ∆. Since all other entries of ã are unchanged, we have reduced the number of indices j such that $|\tilde{a}_j - a_j| = \Delta$ by at least one. Similarly, for the case $\tilde{a}_i < a_i$ one can make an analogous argument by changing f(α) = k ≠ i to f(α) = i for some k such that $\tilde{a}_k > a_k$. Iterating this we can keep decreasing the number of indices at which the norm is achieved, eventually giving us $|\tilde{a}_j - a_j| < \Delta$ for all j, i.e. $\|\tilde{a} - a\|_\infty < \Delta$. This shows that the original choice of ã was not optimal as assumed, proving $\Delta \le \|b\|_\infty$ by contradiction.
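Although the argument above is an existence proof, the same kind of vector can be illustrated constructively. The sketch below uses a greedy assignment, which is not the procedure used in the proof: each entry of b is moved onto the index of a whose remaining deficit is currently largest. It assumes that a and b are non-negative with equal total mass; the function names and random test vectors are hypothetical.

```python
import numpy as np

def greedy_concentrate(a, b):
    """Build a_tilde by moving each entry of b onto a single index of a.

    Every entry of b is assigned to the index whose deficit a_k - a_tilde_k
    is currently largest. Merging entries of b can only concentrate it,
    so the result majorises b.
    """
    a = np.asarray(a, dtype=float)
    a_tilde = np.zeros_like(a)
    for mass in b:
        k = int(np.argmax(a - a_tilde))   # index with the largest deficit
        a_tilde[k] += mass
    return a_tilde

def majorises(x, y, tol=1e-12):
    """True if the sorted partial sums of x dominate those of y (zero-padded)."""
    n = max(len(x), len(y))
    xs = np.sort(np.pad(np.asarray(x, dtype=float), (0, n - len(x))))[::-1]
    ys = np.sort(np.pad(np.asarray(y, dtype=float), (0, n - len(y))))[::-1]
    return bool(np.all(np.cumsum(xs) >= np.cumsum(ys) - tol))

# Hypothetical example: a coarse vector a and a much finer vector b.
rng = np.random.default_rng(0)
a = rng.dirichlet(np.ones(5))
b = rng.dirichlet(np.ones(40))
a_tilde = greedy_concentrate(a, b)
max_deviation = float(np.max(np.abs(a_tilde - a)))
```

For this greedy rule an index is only ever topped up while it still has the largest deficit, so no entry can overshoot or undershoot its target by more than the largest entry of b, which is the bound stated in Lemma 23.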
With the use of the above lemma we can now get the desired majorising distribution $\tilde{P}^n \succ Q^n$.
Lemma 24. There exists a distribution $\tilde{P}^n$ such that $\tilde{P}^n \succ Q^n$ and $|\tilde{P}^n_j - S^n_j| \le 1/K_n(\mu + \zeta)$ for all j ∈ B.
Proof. The idea here is to apply Lemma 23 to the restriction of each distribution to its corresponding bin. Specifically, if we take $a := S^n|_B$ and $b := Q^{n\downarrow}|_{B'}$, then Lemma 23 gives us a vector ã such that $\tilde{a} \succ Q^{n\downarrow}|_{B'}$ and
$$\|\tilde{a} - S^n|_B\|_\infty \le \|Q^{n\downarrow}|_{B'}\|_\infty \le 1/K_n(\mu + \zeta),$$
where the final inequality follows from normalisation of $Q^n$. We now define our majorising distribution within bin B as $\tilde{P}^n|_B := \tilde{a}$, so that $|\tilde{P}^n_j - S^n_j| \le 1/K_n(\mu + \zeta)$ for any j ∈ B as desired; and, once again, in order to normalise the distribution we also define $\tilde{P}^n_l = 1 - \sum_i \tilde{a}_i$ for some arbitrary $l \notin B$. The fact that ã majorises the restriction of $Q^{n\downarrow}$ to B′, together with the sharpness of $\tilde{P}^n$ outside of B, gives Next, we note that majorisation spreads over direct sums, i.e. $\alpha_1 \succ \beta_1$ and $\alpha_2 \succ \beta_2$ implies $\alpha_1 \oplus \alpha_2 \succ \beta_1 \oplus \beta_2$. This can be seen by using Theorem 3 (i.e., the equivalence of the majorisation relation between two distributions with the existence of a bistochastic map between them), and noticing that bistochasticity is preserved under direct sums. Applying this to $\tilde{P}^n$ gives the desired majorisation property: $\tilde{P}^n \succ Q^n$.
Infidelity. By now we have proven the existence of a majorising distribution $\tilde{P}^n \succ Q^n$ and bounded its distance from the scaled distribution $S^n$ on the restriction to B. The final step involves bounding the infidelity $\delta(P^n, \tilde{P}^n)$. To achieve this we will first show that the closeness of $S^n$ and $\tilde{P}^n$ on B, as given in Lemma 24, allows us to bound $F(P^n, \tilde{P}^n)$ in terms of $S^n$.
Lemma 25. Asymptotically, the fidelity between the majorising distribution $\tilde{P}^n$ and the total initial distribution $P^n$ can be bounded as follows Proof. First, we apply Lemma 24 and the fact that $\sqrt{x - y} \ge \sqrt{x} - \sqrt{y}$ for all x ≥ y ≥ 0 to break the fidelity into the desired expression and an error term We can bound the second error term as $\sum_{j \in B} \sqrt{P^{n\downarrow}_j / K_n(\mu + \zeta)} \le \sqrt{|B| / K_n(\mu + \zeta)}$. Given that ζ > 0 is a constant and V(q) > 0, we have that $K_n(\mu + \zeta/2)/K_n(\mu + \zeta)$ is decaying exponentially as n → ∞. Taking the limit inferior of Eq. (142) therefore gives the required bound.
Using the above result on fidelity, we can now prove achievability.

Proof of Proposition 22 (achievability). Substituting the definition of $S^n$ into Lemma 25 and applying Lemma 17 we obtain $\liminf_{n\to\infty} F(P^n, \tilde{P}^n) \ge 1 - \Phi(\mu + \zeta)$, and therefore $\limsup_{n\to\infty} \delta(P^n, \tilde{P}^n) \le \Phi(\mu + \zeta)$.
Due to the equivalence between pre- and post-majorisation, Lemma 12, the above means that there exists a distribution $\tilde{Q}^n$ that is majorised by the total initial state $P^n$ and such that $\limsup_{n\to\infty} \delta(\tilde{Q}^n, Q^n) \le \Phi(\mu + \zeta)$.
As this is true for any ζ > 0 we can take ζ ↘ 0 and conclude that the optimal infidelity is upper bounded, $\limsup_{n\to\infty} \epsilon^*_0(n, R_\mu) \le \Phi(\mu)$, which implies a corresponding lower bound on the optimal rate as required.

Optimality
We now turn our attention to a corresponding second-order upper bound on the optimal rate. We will make use of the following lemma from Ref. [20], which upper bounds the fidelity between a flat state and a majorising distribution.
We can now use the above lemma to obtain an optimality bound which matches that given for achievability.
Proof of Proposition 22 (optimality). Consider any distribution $\tilde{P}^n \succ Q^n$. Now choose $\tilde{a} = \tilde{P}^n$, $a = P^n$ and $b = Q^n$. Also, notice that $M := K_n(\mu - \zeta)$ satisfies $M \le \exp H(a)$. Hence, we can apply Lemma 26 to upper bound the fidelity, where $N := |\{i \,|\, Q^n_i \ge 1/K_n(\mu - \zeta)\}|$. By the standard central limit theorem, Lemma 15, The normalisation of $Q^n$ gives us that $N \le K_n(\mu - \zeta)$, and so we can apply Lemma 17 to obtain Applying these limits to Eq. (151) yields for any $\tilde{P}^n \succ Q^n$. Due to the equivalence between pre- and post-majorisation (cf. Lemma 12) the above means that for any distribution $\tilde{Q}^n$ that is majorised by the total initial state $P^n$ we have Taking ζ ↘ 0 this gives a lower bound on the optimal infidelity, $\liminf_{n\to\infty} \epsilon^*_0(n, R_\mu) \ge \Phi(\mu)$, which implies a corresponding upper bound on the optimal rate.

E. Bistochastic interconversion
Finally we turn to the general case in which neither relative entropy variance is vanishing.
Proposition 27 (Bistochastic interconversion). For any initial state p and target state q such that V(p), V(q) > 0, and infidelity ε ∈ (0, 1), the optimal interconversion rate has the second-order expansion where ν is given in Eq. (101).
The proof is similar to that of formation, so it will also utilise many of the ideas inspired by Ref. [20]. The main complication is that for the general interconversion problem the binning of indices is more elaborate: we now have two sets of bins instead of two individual bins, and we need to introduce a function A which controls the relative placement of these bins. Once again we will break the proof into achievability and optimality bounds.

Achievability
Sketch of proof. As with formation, the idea here will be to give an explicit construction of $\tilde{P}^n \succ Q^n$ which is close to $P^n$. Again, due to the equivalence of pre- and post-majorisation, this will prove that there exists a distribution $\tilde{Q}^n$ that is majorised by the total initial state $P^n$ and is close to the total target state $Q^n$. We will start by introducing two sets of bins for each distribution. Using these bins, we will once again construct a scaled distribution $S^n$ which reflects the fine-grained features of $P^n$ (same shape within corresponding bins) and coarse-grained features of $Q^n$ (same mass within corresponding bins). We will then show that $S^n$ necessarily lies close to a majorising distribution $\tilde{P}^n$. Finally we will analyse the infidelity of this distribution with respect to the total initial distribution $P^n$ and, by taking the appropriate limits of the parameters in our construction, prove the desired achievability bound of Proposition 27.
In Section VI D our construction was parameterised by a single slack parameter ζ > 0. Here, we will have three parameters: λ > 0, I ∈ N, and a monotone continuously differentiable function 1 ≥ A ≥ Φ pointwise. The parameter λ will control the width of our bins, I the number of bins, and A the relative placements of the two sets of bins.
Binning. For −I ≤ i < I we define our two sets of bins as where the two sequences are defined by for −I ≤ i ≤ I. We will consider $B_i$ as bins on the indices of $P^n$ and $B'_i$ on those of $Q^n$. We note that A ≥ Φ implies $y_i \ge x_{i+1} + \lambda/I$, resulting in $B'_i$ being gapped away from $B_i$, i.e., all indices belonging to $B'_i$ are much larger than those belonging to $B_i$. This choice plays a role analogous to that of the slack parameter ζ for the bins in Section VI D. For convenience we also define to be the union of bins, and $\bar{B}$ and $\bar{B}'$ to be the corresponding complements.
Scaled distribution $S^n$. As in Section VI D, we now define $S^n$ in each bin $B_i$ to have the shape of $P^n$ within $B_i$, but the mass of $Q^n$ within $B'_i$. As the bins $B_i$ are all disjoint, for any j ∈ B there exists a unique −I ≤ i < I such that $j \in B_i$. For such indices we define $S^n$ as We normalise $S^n$ by taking $S^n_l := 1 - \sum_{j \in B} S^n_j$ for some arbitrary $l \notin B$.
Majorising distribution $\tilde{P}^n$. We now want to prove a result analogous to Lemma 24: the existence of a distribution that simultaneously majorises the total target distribution $Q^n$ and is close to $S^n$ (within each bin).

Lemma 28 (Existence of a majorising distribution). There exists a distribution $\tilde{P}^n$ such that $\tilde{P}^n \succ Q^n$ and for all $j \in B_i$ and −I ≤ i < I.
Proof. The proof is analogous to that of Lemma 24, with an application of Lemma 23 for each pair $S^n|_{B_i}$ and $Q^n|_{B'_i}$, with −I ≤ i < I. This gives us $\tilde{P}^n$ such that for all −I ≤ i < I and $j \in B_i$ it is close to $S^n$, $|\tilde{P}^n_j - S^n_j| \le \max_{k \in B'_i} Q^{n\downarrow}_k$, and possesses the majorisation properties Splitting the majorisation across the direct sum, as explained in the proof of Lemma 24, gives us the desired overall majorisation $\tilde{P}^n \succ Q^n$.
Infidelity. The next step involves bounding the fidelity between the total initial state $P^n$ and the majorising distribution $\tilde{P}^n$ given by the above construction. We will start by bounding the fidelity for a fixed set of parameters A, λ and I.
Lemma 29. For any monotone continuously differentiable function 1 ≥ A ≥ Φ, λ ≥ 0, and I ∈ ℕ there exists a sequence of distributions $\tilde{P}^n \succ Q^n$ such that with the prime superscript in $A'$ and $\Phi'_{\mu,\nu}$ denoting a derivative.
Proof. The first part of the proof is analogous to the proof of Lemma 25. More precisely, using Lemma 28 (in place of Lemma 24) and employing the fact that $B'_i$ is gapped away from $B_i$, we can apply the argument presented there to obtain Inserting the definition of the scaled distribution $S^n$ yields a lower bound on the limit inferior of the fidelity. Recalling that $\Phi(y_i) = A(x_{i+2})$ and applying Lemma 17 gives the limiting mass of each bin. Substituting these into our lower bound on the fidelity yields a bound on $\liminf_{n\to\infty} F(\tilde{P}^n, P^{n\downarrow})$ in terms of finite differences. Using the differentiability of A and $\Phi_{\mu,\nu}$ we can express these finite differences as integrals. Finally, we apply the Schwarz inequality to arrive at the desired bound. Now, by taking the appropriate limits of our parameters A, λ and I, we will get the desired achievability bound on the optimal infidelity, and therefore also on the optimal rate.
Proof of Proposition 27 (achievability). By Lemma 29 we know that there exists a family of distributions $\tilde{P}^n$ majorising $Q^n$ and such that $\liminf_{n\to\infty} F(\tilde{P}^n, P^{n\downarrow})$ is lower-bounded by Due to the equivalence between pre- and post-majorisation, Lemma 12, this means that there exists a family of distributions $\tilde{Q}^n$ that is majorised by the total initial state $P^n$ and such that their fidelity with the total target state $Q^n$ is also lower bounded by the above expression, which implies a lower bound on the asymptotic ideal fidelity As the left hand side is independent of I, λ and A, we can now take the desired limits. Note that the order of limits will be important: first we will take I → ∞, then λ → ∞, followed by a supremum over A.
Firstly, we take the limit inferior I → ∞. As a consequence of the fact that λ is still finite, together with the continuous differentiability of A and $\Phi_{\mu,\nu}$, we have the point-wise limit Using the compactness of [−2λ, 2λ] we can apply the dominated convergence theorem to move this limit inside the integral, which gives a bound in terms of an integral over [−2λ, 2λ]. Secondly, we want to take λ → ∞. The existence of this limit follows from the monotone convergence theorem, which we can apply due to the monotonicity of A and $\Phi_{\mu,\nu}$, together with the boundedness of the continuous fidelity. Taking the limit gives us a bound in terms of the continuous fidelity. Lastly, we want to take a supremum over all continuously differentiable monotone functions 1 ≥ A ≥ Φ, which gives us the Rayleigh-normal distribution. Using the above and the duality property of Rayleigh-normal distributions, Eq. (45), we obtain the lower bound on the optimal rate.

Optimality
We now proceed to the proof of the optimality of Proposition 27. To this end, we will employ two lemmas originally proved in Ref. [20]. The idea is to start by showing that, after a particular coarse-graining, the fidelity between Φ and $\Phi_{\mu,\nu}$ is close to the optimal fidelity $\sup_{A \ge \Phi} \mathcal{F}(A', \Phi'_{\mu,\nu}) = 1 - Z_\nu(\mu)$.
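To get a feeling for the continuous fidelity appearing in this supremum, one can evaluate it for the particular admissible choice A = Φ. The sketch below assumes that the continuous fidelity is $\mathcal{F}(A', \Phi'_{\mu,\nu}) = \left(\int \sqrt{A'(x)\,\Phi'_{\mu,\nu}(x)}\,dx\right)^2$ and that $\Phi_{\mu,\nu}$ is the cumulative distribution function of a Gaussian with mean µ and variance ν; both are assumptions about definitions given earlier in the paper. The closed form used for comparison is the standard squared Bhattacharyya coefficient of two Gaussians, and the function names are hypothetical.

```python
import math

def gaussian_pdf(x, mean=0.0, var=1.0):
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def continuous_fidelity_numeric(mu, nu, half_width=20.0, steps=40_000):
    """(integral of sqrt(Phi'(x) * Phi'_{mu,nu}(x)) dx)**2 by the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.sqrt(gaussian_pdf(x) * gaussian_pdf(x, mu, nu))
    return (total * h) ** 2

def continuous_fidelity_closed_form(mu, nu):
    """Squared Bhattacharyya coefficient of N(0, 1) and N(mu, nu)."""
    return (2.0 * math.sqrt(nu) / (1.0 + nu)) * math.exp(-mu ** 2 / (2.0 * (1.0 + nu)))

# Hypothetical parameter values, chosen only for illustration.
mu, nu = 0.7, 2.0
numeric = continuous_fidelity_numeric(mu, nu)
closed = continuous_fidelity_closed_form(mu, nu)
```

Since A = Φ is only one admissible function, the value computed here is a lower bound on the supremum, and hence 1 minus this value is an upper bound on $Z_\nu(\mu)$ under the stated assumptions.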
Lemma 30 (Lemma 17 of Ref. [20]). For any ζ > 0, there exist real numbers s ≤ t ≤ t′ ≤ s′ such that $\Phi'(x)/\Phi'_{\mu,\nu}(x)$ is strictly monotone decreasing for x ∈ (s, s′) and Moreover, if we define $\mathcal{F}_{t,t'}(\cdot, \cdot)$ to be the fidelity of distributions which have been coarse-grained on x ≤ t and x ≥ t′, specifically then this coarse-grained fidelity has an upper bound Notice that Φ and $\Phi_{\mu,\nu}$ are exactly the distributions that appear in the central limit theorem, Lemma 17. Thus, we would like to relate the fidelity $F(P^n, \tilde{P}^n)$ back to the Rayleigh-normal distribution via this coarse-grained fidelity between Gaussians. To be able to argue this for any $\tilde{P}^n \succ Q^n$, we will first give a sufficient condition for a distribution a to have the highest possible fidelity with respect to a second distribution b among all distributions satisfying a majorisation-like condition.
Lemma 31 (Lemma 15 of Ref. [20]). Let a and b be probability distributions such that $a_i/b_i$ is strictly decreasing for all i. Then, for any distribution c such that we have with equality if and only if c = a.
We are now ready for the optimality proof.
Proof of Proposition 27 (optimality). To prove optimality we need to show that for any $\tilde{P}^n \succ Q^n$, the infidelity between $\tilde{P}^n$ and $P^n$ can be lower bounded by the Rayleigh-normal distribution. This, through Lemma 12, will yield a lower bound on the infidelity between any final state $\tilde{Q}^n$ (i.e., any distribution majorised by the total initial state $P^n$) and the total target state $Q^n$. We will start by using the monotonicity of fidelity under coarse-graining to bound the fidelity between $\tilde{P}^n$ and $P^n$ by the fidelity between their coarse-grained versions (with coarse-graining over particularly chosen bins of indices). We will then use Lemma 31 (along with the monotonicity properties of Lemma 30) to bound the fidelity between coarse-grained versions of $P^n$ and $\tilde{P}^n$ by the fidelity between coarse-grained versions of $P^n$ and $Q^n$. Next, by applying Lemma 17, we will show that this coarse-grained fidelity asymptotes to a fidelity between Gaussians. We will conclude by returning to the last part of Lemma 30, which will allow us to give a final bound in terms of the Rayleigh-normal distribution.
Fix ζ > 0 and I ∈ ℕ. Let t, t′ ∈ ℝ be those given by Lemma 30 and introduce for 0 ≤ i ≤ I. Define a set of bins for 0 ≤ i < I, as well as two end bins We now define the limiting coarse-grained versions of the total initial and target distributions for −1 ≤ i ≤ I. We note that the above limits all exist by Lemma 17, specifically We would also like to analogously define a distribution c that would be a coarse-grained version of $\tilde{P}^n$, but we have no guarantees that the corresponding limits exist. In lieu of this, we will use the vectorial Bolzano-Weierstrass theorem, which gives that there exists a strictly increasing sequence of indices $\{m_l\}_l \subset \mathbb{N}$ such that $\{r_n\}$ limits to its limit superior, $\lim_{l\to\infty} r_{m_l} = \limsup_{n\to\infty} r_n$, and that all the limits exist, for all −1 ≤ i ≤ I. We now want to apply Lemma 31 to bound the fidelity F(c, b) with F(a, b), but first we must show that $a_i/b_i$ is strictly decreasing. Lemma 30 allows us to relate this ratio at the ends, i = −1 and i = I, to the ratio of Gaussian derivatives, For 0 ≤ i < I we can apply Cauchy's mean value theorem, which gives that there exists some $s_i \in (z_i, z_{i+1})$ such that the ratio of finite differences is given by a ratio of derivatives Given that $\{s, s_0, \ldots, s_{I-1}, s'\}$ is a strictly increasing sequence, and that $\Phi'(x)/\Phi'_{\mu,\nu}(x)$ is strictly decreasing on (s, s′) by Lemma 30, we therefore have that $a_i/b_i$ is strictly decreasing as required. Now that we have shown that $a_i/b_i$ is strictly decreasing, we can apply Lemma 31.
This gives us Expanding this out, we obtain an upper bound on $\limsup_{n\to\infty} F(\tilde{P}^n, P^n)$ in terms of finite differences. If we take I → ∞, these finite differences approach derivatives, and we get a bound on $\limsup_{n\to\infty} F(\tilde{P}^n, P^n)$ in terms of a coarse-grained fidelity. Finally, we apply the last part of Lemma 30, which allows us to bound this in terms of the Rayleigh-normal distribution, giving Taking ζ ↘ 0, we can once more use the equivalence between pre- and post-majorisation, Lemma 12, to obtain a lower bound on the limit inferior of the optimal infidelity. Using the above together with the duality property of Rayleigh-normal distributions, Eq. (45), we obtain the upper bound on the optimal rate $R^*_0(n, \epsilon)$.

VII. OUTLOOK
In this paper we have derived the exact second-order asymptotics of state interconversion under thermal operations between any two energy-incoherent states. It is then natural to ask whether such a characterisation is also possible for general, not necessarily energy-incoherent, states. Due to the fact that thermal operations are time-translation covariant, such that coherence and athermality form independent resources [13,31], it seems unlikely that the current approach can be easily generalised. Instead, one would need to rely on the full power of Gibbs-preserving maps [58,59] that form a superset of the thermal operations. For such maps, we believe that a reasonable conjecture is in fact given by Eqs. (1a) and (1b), with the relative entropy and relative entropy variance replaced by their fully quantum analogues given in Refs. [37] and [21,22], respectively.
We also provided a physical interpretation of our main result by considering several thermodynamic scenarios and explaining how our work can be employed to rigorously address the problem of thermodynamic irreversibility. We derived optimal values of distillable work and work of formation, and related them to the infidelity of these processes. This could potentially be used to clarify the notion of imperfect work [44,54,60], and to construct a comparison platform allowing one to continuously distinguish between work-like and heat-like forms of energy. We also discussed thermodynamic processes with finite-size baths, focusing particularly on the optimal performance of heat engines. We have shown that there are non-trivial conditions under which an engine can operate at Carnot efficiency and extract perfect work. This opens the possibility of engineering finite heat baths in order to minimise undesirable dissipation of free energy. Moreover, our formalism is general enough to address other interesting problems involving finite-size baths, like fluctuation theorems, Landauer's principle or the third law of thermodynamics [40][41][42].
A number of natural technical extensions to our result suggest themselves. We have used the infidelity as our error measure, and conjecture that Theorem 10 will also hold when ε is a bound on the total variational distance. Our second-order expansion falls into a larger class of results known as small deviation bounds, in which we consider a fixed error threshold ε. Two natural extensions are to the regime of large deviations [61], in which a fixed rate is considered, and moderate deviations [62,63], in which the rate approaches its optimum and the error vanishes. Last, but not least, we expect that our treatment of approximate majorisation can be extended to cover other distance measures.
Thus, the work produced by an engine working at maximum allowed efficiency is given by We now want to calculate the total work W extracted by such an optimal engine while the temperature of the finite bath changes from $T_c$ to $T'_c$, $W = \int_{T_c}^{T'_c} dW$. (A6) By simply integrating by parts we get The second term can be calculated by switching from temperature $T_x$ to inverse temperature $\beta_x$ and recalling that the average energy is the negative derivative of $\log Z_x$ with respect to $\beta_x$: By noting that the entropy of a thermal equilibrium state is given by we thus have Finally, comparing the above with Eq. (25) we arrive at which is equal to the change of free energy of the finite bath.
Appendix B: Proof of Lemma 11

Preliminaries
In Ref. [52] an explicit construction of the solution $\tilde{p}^*$ to the following maximisation problem, $\tilde{p}^* = \arg\max_{\tilde{p}:\, \tilde{p} \succ q} F(p, \tilde{p})$, was given. We will now describe the construction of this optimal distribution, as it is crucial for our proof, the second part of which will very closely follow the reasoning presented in Ref. [52]. As explained in Section III C, without loss of generality we will assume that all the distributions are non-increasingly ordered. First, for any distribution a define Note that $p \succ q$ is equivalent to $E^p_k \le E^q_k$ for all k. Now, for a given p and q the construction of $\tilde{p}^*$ is given by the following iterative procedure. Set $l_0 = d + 1$ and define $l_j$ := arg min . (B3) If the minimisation defining $l_j$ does not have a unique solution then $l_j$ is chosen to be the smallest possible. We will also denote by N an index for which $l_N = 1$. The i-th entry of the optimal vector for $i \in \{l_j, \ldots, l_{j-1} - 1\}$ is then given by $\tilde{p}^*_i$ It is straightforward to verify that $\tilde{p}^*$ is normalised, and $\tilde{p}^* \succ q$ as the construction guarantees that $E^{\tilde{p}^*}_k \le E^q_k$ for all k. Moreover, the optimal fidelity between p and a distribution that majorises q is given by (B5) The crucial observation in proving the optimality of the above, which we will also need in our proof, is that for all j we have This follows from the definition of $l_j$, $r_j$ and the fact that for a, b, c, d > 0 one has (see Ref. [52] for details) (B7)
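The iterative procedure can be sketched in code. Since the defining formulas are not reproduced in the text above, the sketch below rests on explicit assumptions: $E^a_k$ is taken to be the tail sum $\sum_{i \ge k} a_i$ of a non-increasingly ordered vector (consistent with the stated equivalence between majorisation and $E^p_k \le E^q_k$), $l_j$ is taken to minimise the ratio of tail-sum increments of q and p, with $r_j$ the minimal ratio, and the optimal entries are taken to be $r_j p_i$ on each block. It further assumes that p has strictly positive entries and that $F(a, b) = (\sum_i \sqrt{a_i b_i})^2$. The function names are hypothetical.

```python
import numpy as np

def optimal_majoriser(p, q, tol=1e-12):
    """Sketch of the iterative construction of the distribution closest to p
    (in fidelity) among those that majorise q, under the assumptions above."""
    p = np.sort(np.asarray(p, dtype=float))[::-1]   # non-increasing order
    q = np.sort(np.asarray(q, dtype=float))[::-1]
    d = len(p)
    # Tail sums, with a trailing zero playing the role of E_{d+1} = 0.
    # The 0-based position k here stands for the 1-based index k + 1.
    Ep = np.append(np.cumsum(p[::-1])[::-1], 0.0)
    Eq = np.append(np.cumsum(q[::-1])[::-1], 0.0)
    p_opt = np.zeros(d)
    prev = d                                        # plays the role of l_0 = d + 1
    while prev > 0:
        ratios = (Eq[:prev] - Eq[prev]) / (Ep[:prev] - Ep[prev])
        r = ratios.min()
        l = int(np.argmax(ratios <= r + tol))       # smallest minimising index
        p_opt[l:prev] = r * p[l:prev]               # rescale p on this block
        prev = l
    return p, p_opt

def fidelity(a, b):
    """Assumed fidelity F(a, b) = (sum_i sqrt(a_i * b_i))**2."""
    return float(np.sum(np.sqrt(a * b)) ** 2)

# Hypothetical example: a uniform bit against a sharp target.
p_sorted, p_opt = optimal_majoriser([0.5, 0.5], [1.0, 0.0])
best_fidelity = fidelity(p_sorted, p_opt)
```

In this example the only distribution majorising the sharp target is the sharp distribution itself, so the routine returns (1, 0) with fidelity 1/2. When p already majorises q the first pivot is at index 1 with ratio 1, and the routine returns p unchanged.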

Proper proof
Proof. We will prove the equality in Eq. (87) by showing that the following two inequalities hold We start with the easier part, Eq. (B8a). It is enough to show that the inequality holds for any $\tilde{p}$ within the constraints. Let us then take any $\tilde{p}$ such that it majorises q. Due to Theorem 3 this is equivalent to the existence of a bistochastic matrix $B_0$ such that $B_0 \tilde{p} = q$. This implies that $\max_{\tilde{q}:\, p \succ \tilde{q}} F(q, \tilde{q})$ = max where the maximisation on the right hand side is over all bistochastic matrices B. We now observe that where in the last step we used the fact that fidelity obeys the data processing inequality. We thus have $\max_{\tilde{q}:\, p \succ \tilde{q}} F(q, \tilde{q}) \ge F(p, \tilde{p})$, for any $\tilde{p}$ majorising q, and so Eq. (B8a) holds. We now proceed to proving Eq. (B8b). It is again enough to show that the inequality holds for any $\tilde{q}$ within the constraints. Let us then take any $\tilde{q}$ such that it is majorised by p. We now have where $l_j$ and $\Delta^{k_2}_{k_1}$ are defined as in Eqs. (B2)-(B3). We now introduce where the inequality holds for every j because $p \succ \tilde{q}$. Observing that (B15) We will now show that f(x) achieves its maximum within the positive orthant $x_j \ge 0$ when x = 0, which will finish the proof. This is because then $F(q, \tilde{q}) \le F(p, \tilde{p}^*) = \max_{\tilde{p}:\, \tilde{p} \succ q} F(p, \tilde{p})$ (B16) for any $\tilde{q}$ majorised by p, which implies Eq. (B8b). First, by direct calculation one can find a matrix M of second derivatives of f(x), i.e., $M_{ij} = \partial^2 f(x) / \partial x_i \partial x_j$. Then, using the Gershgorin circle theorem, one can verify that M is negative definite in the allowed region of x, so that there are no local extrema and the maximal value must be obtained at the boundary. Finally, which means that the maximal value is obtained for x = [0, . . . , 0], which finishes the proof.

Appendix C: Efficient algorithm for calculating interconversion infidelities
The construction given in Appendix B gives a natural algorithm for calculating $\max_{\tilde{p}:\, \tilde{p} \succ q} F(p, \tilde{p})$. The runtime of this algorithm is $O(d^2)$, where d is the size of the input distributions. We now want to argue that this algorithm can be adapted for states described by $p^{\otimes n}$, such that the optimal interconversion rates can be numerically calculated in a time which is efficient in n, as is done in Figures 4 and 10.
The key property we will leverage is that whilst distributions such as $p^{\otimes n}$ have an exponential number of entries, they only possess a polynomial number of distinct entries (in this case $O(n^{d-1})$). As majorisation is invariant under permutations, it is only the distinct entries (and their degeneracies) that are relevant to our calculation.
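The counting behind this observation can be made concrete. The sketch below is only an illustration and is not the ancillary code: it enumerates the type classes of $p^{\otimes n}$, i.e. the ways of choosing how many of the n factors take each of the d values, and records for each the common entry value and its degeneracy (a multinomial coefficient). The function names and the example distribution are hypothetical.

```python
from math import comb, factorial, prod

def compositions(n, d):
    """Yield all tuples of d non-negative integers summing to n."""
    if d == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, d - 1):
            yield (first,) + rest

def distinct_entries(p, n):
    """Return (value, multiplicity) pairs describing the n-fold product of p.

    Each pair corresponds to one type class: `value` is prod_i p_i**k_i and
    `multiplicity` is the multinomial coefficient n! / prod_i k_i!.
    Pairs are sorted by value in non-increasing order.
    """
    entries = []
    for counts in compositions(n, len(p)):
        value = prod(pi ** k for pi, k in zip(p, counts))
        multiplicity = factorial(n) // prod(factorial(k) for k in counts)
        entries.append((value, multiplicity))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return entries

# Hypothetical example: d = 3 outcomes and n = 20 copies.
p = [0.5, 0.3, 0.2]
n = 20
entries = distinct_entries(p, n)
num_classes = len(entries)                    # number of type classes
total_entries = sum(m for _, m in entries)    # number of entries of the product
total_mass = sum(v * m for v, m in entries)   # should be 1 up to rounding
```

Here the product distribution has 3^20 entries but only comb(22, 2) = 231 type classes. Different type classes may happen to share the same value, so the number of classes is an upper bound on the number of distinct entries.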
The main step in the algorithm, and the bottleneck giving an exponential run-time, is calculating the pivot indices $\{l_j\}_j$ used to construct $\tilde{P}$. Specifically these take the form (C1) Using the constancy of $P^{\downarrow}_i$ and $Q^{\downarrow}_i$ on each of the intervals we can see that, for a fixed s, the function being optimised takes the form for any $k \in \{i_s, \ldots, i_{s+1} - 1\}$. As this function is monotonic in the index on each such interval, we conclude that the indices $l_j$ must lie on the edges of these intervals, i.e. $\{l_j\}_j \subseteq \{i_s\}_s$.
This means that we can restrict our attention to these 'edge indices' without loss of generality, lowering the algorithm's run-time from $O(D^2)$ to $O(t^2)$. Using the above argument, we now have an efficient algorithm for computing $\epsilon^*_0(n, R; p, q)$. Utilising the idea of embedding, this also allows us to calculate the thermal variant of this, $\epsilon^*_\beta(n, R; p, q)$. Finally, by sweeping over R, we can use this to calculate $R^*_\beta(n, \epsilon; p, q)$. Examples of this are shown in Figures 4 and 10.
We have included Python code corresponding to the above algorithm as an ancillary file to our arXiv submission.