Computable R\'enyi mutual information: Area laws and correlations

The mutual information is a measure of classical and quantum correlations of great interest in quantum information. It is also relevant in quantum many-body physics, by virtue of satisfying an area law for thermal states and bounding all correlation functions. However, calculating it exactly or approximately is often challenging in practice. Here, we consider alternative definitions based on R\'enyi divergences. Their main advantage over their von Neumann counterpart is that they can be expressed as a variational problem whose cost function can be efficiently evaluated for families of states like matrix product operators while preserving all desirable properties of a measure of correlations. In particular, we show that they obey a thermal area law in great generality, and that they upper bound all correlation functions. We also investigate their behavior on certain tensor network states and on classical thermal distributions.


Introduction
One of the most important features of quantum systems is the nature of their correlations, which differ from their classical counterparts, and lie behind the complexity of many-body quantum states. It is known, however, that when these correlations are weak and spatially localized, one can devise efficient methods to classically simulate complex quantum states via, for instance, tensor network methods [1,2]. This occurs at least in gapped ground states in 1D [3,4], Gibbs states of local Hamiltonians [5][6][7], and low-depth quantum circuits [8]. Characterizing and quantifying those correlations is hence a subject of wide interest in fields ranging from quantum computing to condensed matter physics.
The quantum mutual information is perhaps the most widely known measure of correlations in quantum systems. It has a number of desirable properties, such as positivity or being nonincreasing under local operations, and a well-defined information-theoretic meaning [9]. For a bipartite state ρ_AB, it is given by I(A : B) = S(ρ_A) + S(ρ_B) − S(ρ_AB), (1) where S(ρ) = −Tr[ρ log ρ] is the von Neumann entropy. Importantly, it is known to obey an area law for all quantum Gibbs states [7,10], which implies that the correlations between adjacent subsystems scale only like their mutual boundary and are thus spatially localized. Calculating the mutual information in quantum systems is hence an important task in many physically relevant scenarios. However, this is often impossible via known analytical methods and requires numerically diagonalizing the whole density matrix, and no efficient methods to calculate it for matrix product operators are available. It is thus highly desirable to find measures of correlations that share the appealing information-theoretic properties of I(A : B), but are simpler to compute in practice, for instance, through variational algorithms. This is often done by replacing the entropies in Eq.
(1) with the more general α-Rényi entropies. The resulting quantity can be computed via a variety of numerical and analytical means and has been shown to characterize phenomena such as quantum [11,12] and thermal [13] phase transitions, or the correlations in many-body localization [14]. However, it lacks a number of important properties, which prevent it from being a sensible measure of correlations. In particular, it can be negative [15] and can increase under local operations.
Motivated by this, we here explore alternative definitions of the Rényi mutual information, based on the notion of quantum Rényi divergences [16,17]. These are measures of distinguishability of quantum states, which play a pivotal role in information-theoretic tasks, such as single-shot communication protocols [18,19], channel coding [20][21][22][23][24][25] or hypothesis testing [26,27]. In principle, each of the many variants of quantum Rényi divergences [20,[28][29][30][31][32][33] allows us to define a mutual information, as we explain in Appendix A. Here, we focus on two particular cases and explain how to compute them in practice, at least when the input state is represented via tensor networks. We show that they satisfy desirable properties of the mutual information, including an area law for thermal states, which constitutes our main technical result. This area law holds (i) in one dimension, (ii) for high temperatures, (iii) for commuting Hamiltonians, or (iv) in classical states, where in the latter case it does not depend on the temperature. We also show that, like I(A : B), they bound all correlation functions, and that one of them yields by construction an area law for a broad class of tensor network states. Our results are summarized in Table 1.

[Table 1: Properties of I_∞(A : B) and I^M_α(A : B): nonnegativity; nonincrease under local operations; thermal area laws in one dimension, at high temperature, for commuting Hamiltonians, and for classical states (β-independent, α ≤ 2); PEPDO area law (α ≤ 2); bound on correlations.]
The manuscript is structured as follows. In Section 2, we discuss the definitions of Rényi mutual information and their relevant variational expressions. In Section 3, we show the thermal area law and explain the regimes in which it applies. Then, in Section 4 we bound its behavior for certain tensor network states, and in Section 5, we explain that any one of these measures upper bounds all correlation functions. The technical proofs, as well as further details, can be found in the Appendix.

Definitions of Rényi mutual information
We consider a quantum system on a finite lattice Λ ⊂ Z D with local Hilbert spaces C d . For a quantum state ρ, we denote its reduced density matrices on subsystems A, B ⊂ Λ as ρ A and ρ B respectively.
Apart from (1), the quantum mutual information is also given by the following equivalent expressions, I(A : B) = D(ρ_AB ‖ ρ_A ⊗ ρ_B) (2a) and I(A : B) = min_{σ_B} D(ρ_AB ‖ ρ_A ⊗ σ_B), (2b) with the Umegaki relative entropy D(ρ ‖ σ) = Tr[ρ log ρ − ρ log σ]. This quantity is nonnegative and cannot increase under local operations on both A and B. These properties follow, respectively, from the positivity of the relative entropy, and its contractivity under CPTP maps, i.e., the data-processing inequality.
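The equivalence of these expressions can be checked numerically on a small example (a dense-matrix sketch, not part of the paper's algorithms; the helper names are ours):

```python
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(0)

def rand_state(d):
    # random full-rank density matrix
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real

def ptrace(rho, dA, dB, keep):
    # partial trace of a (dA*dB)-dimensional density matrix
    r = rho.reshape(dA, dB, dA, dB)
    return np.einsum('ijkj->ik', r) if keep == 'A' else np.einsum('ijil->jl', r)

def vn_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log(ev)))

dA = dB = 2
rho = rand_state(dA * dB)
rhoA, rhoB = ptrace(rho, dA, dB, 'A'), ptrace(rho, dA, dB, 'B')

# I(A:B) from entropies, Eq. (1)
I_ent = vn_entropy(rhoA) + vn_entropy(rhoB) - vn_entropy(rho)
# I(A:B) as the relative entropy D(rho || rhoA (x) rhoB), Eq. (2a)
I_div = np.trace(rho @ (logm(rho) - logm(np.kron(rhoA, rhoB)))).real
```

Both routes agree to numerical precision, and the common value is nonnegative, as guaranteed by the positivity of the relative entropy.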
To obtain Rényi versions of the mutual information, one can then generalize any one of Eqs. (1), (2a) and (2b). However, each of them yields a different definition; the resulting quantities are no longer equivalent.

Rényi entropies
Starting from (1) and replacing the von Neumann entropies in the definition of the mutual information by the Rényi entropies S_α(ρ) = (1 − α)^{−1} log Tr[ρ^α], one obtains I_α(A : B) = S_α(ρ_A) + S_α(ρ_B) − S_α(ρ_AB). (3) For integer values of α, this definition contains only integer powers and traces of the density matrices. This feature makes it easily computable in many physically relevant situations. Analytically, an important example is the replica trick [34,35], which has been used to calculate S_α in a conformal field theory for integer values of α (see [36][37][38] for calculations of I_α(A : B); the same method has been used to calculate certain Rényi entropies and trace distances [39,40]). It can also be calculated exactly for free fermions [41]. Numerically, it is efficiently computable when the state ρ is represented by a matrix product density operator (MPDO) [42], or by quantum Monte Carlo methods [43][44][45][46]. However, this definition lacks several of the important properties of I(A : B). For instance, it can be negative in physically relevant situations [15]. We show in Appendix B that this can, in fact, happen in a very simple scenario: the thermal state of a classical Ising chain with an external field. To do so, we calculate the mutual information arising from Eq. (3) analytically in the thermodynamic limit using transfer matrices and show that for a sufficiently weak antiferromagnetic coupling it takes negative values. Moreover, its negativity implies that it cannot be nonincreasing under local operations in general: tracing out the A system is a local operation that can increase I_α(A : B) from a negative value to zero.
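As an illustration of why only traces of matrix powers are needed (a minimal dense sketch with our own helper names), for a two-qubit Bell pair one finds I_2(A : B) = 2 log 2:

```python
import numpy as np

def ptrace(rho, dA, dB, keep):
    # partial trace of a (dA*dB)-dimensional density matrix
    r = rho.reshape(dA, dB, dA, dB)
    return np.einsum('ijkj->ik', r) if keep == 'A' else np.einsum('ijil->jl', r)

def renyi2_entropy(rho):
    # S_2(rho) = -log Tr[rho^2]: only a trace of a matrix power is needed
    return float(-np.log(np.trace(rho @ rho).real))

bell = np.zeros(4)
bell[0] = bell[3] = 1 / np.sqrt(2)       # (|00> + |11>)/sqrt(2)
rho = np.outer(bell, bell)

I2 = (renyi2_entropy(ptrace(rho, 2, 2, 'A'))
      + renyi2_entropy(ptrace(rho, 2, 2, 'B'))
      - renyi2_entropy(rho))             # = 2 log 2 for a Bell pair
```

For an MPDO, the same traces become tensor contractions of integer copies of the network, which is what makes the quantity efficiently computable.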

Maximal Rényi divergence
A possible strategy to obtain a Rényi mutual information which inherits the desirable properties of I(A : B) is to invoke one of its equivalent definitions in terms of the relative entropy (also known as divergence) and directly extend the latter to the Rényi case. For applications in quantum communication, it is common to generalize Eq. (2b), e.g., see [17]. However, let us here start from Eq. (2a). To do so, we introduce the important maximal Rényi divergence [47]. Given quantum states ρ and σ, it is defined as [31] D_∞(ρ ‖ σ) = log inf{λ : ρ ≤ λσ}, (4) which, in turn, yields a definition of a Rényi mutual information as I_∞(A : B) = D_∞(ρ_AB ‖ ρ_A ⊗ ρ_B). The latter is nonnegative and cannot increase under local operations, where the last property follows from the data-processing inequality of the divergence under CPTP maps E [17], i.e., D_∞(E(ρ) ‖ E(σ)) ≤ D_∞(ρ ‖ σ). This quantity has two important features. The first is that it can be approximated when the arguments are matrix product operators. We can rewrite Eq. (4) as D_∞(ρ ‖ σ) = log inf{λ : inf_{|ψ⟩} ⟨ψ| (λσ − ρ) |ψ⟩ ≥ 0}. (5) It is then possible to approximate the expectation value in Eq. (5) using the DMRG algorithm [48]. While convergence is not guaranteed, it can be checked whether it approaches a limit with increasing bond dimension of |ψ⟩, indicating that the infimum has been well approximated. The minimal λ can then be determined using a binary search. A potential difficulty is that determining whether an MPO is positive is an NP-hard problem [49], and calculating D_∞ involves determining the positivity of the MPO λρ_A ⊗ ρ_B − ρ. As this problem has additional structure, it might still be efficiently solvable in practice. This is supported by its similarity with the usual target of the DMRG algorithm, which is finding the lowest eigenvalue of a particular MPO H, i.e., inf_{|ψ⟩} ⟨ψ| H |ψ⟩.
This has been employed successfully in many cases, despite also solving an instance of the NP-hard problem of positivity, as a nonnegative smallest eigenvalue is equivalent to a nonnegative state [50][51][52][53]. In fact, the ground state problem is even QMA-hard [54].
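The binary-search procedure can be sketched as follows (our own function names; exact diagonalization stands in for the DMRG step, so this only scales to small systems):

```python
import numpy as np

def d_max(rho, sigma, tol=1e-8):
    """log inf{lam : lam*sigma - rho >= 0}; positivity is checked via the
    smallest eigenvalue (exact-diagonalization stand-in for the DMRG step)."""
    lo, hi = 0.0, 1.0
    # grow hi until lam*sigma - rho is positive semidefinite
    while np.linalg.eigvalsh(hi * sigma - rho).min() < -1e-14:
        hi *= 2.0
    # binary search for the smallest admissible lam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.linalg.eigvalsh(mid * sigma - rho).min() >= 0:
            hi = mid
        else:
            lo = mid
    return float(np.log(hi))

rng = np.random.default_rng(2)
m = rng.normal(size=(4, 4))
rho = m @ m.T
rho /= np.trace(rho)
```

Sanity checks: D_∞(ρ ‖ ρ) = 0, and against the maximally mixed state the result equals log(d λ_max(ρ)).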
The second important feature is that it upper bounds all Rényi divergences that fulfill the dataprocessing inequality [16]. This is relevant since, as we show below, it follows a thermal area law, which automatically extends to all known divergences.
We briefly comment on the generalization using Eq. (2b) instead. This yields another possible definition of Rényi mutual information, which in fact has an operational interpretation in communication theory as the communication cost of entanglement-assisted one-shot communication protocols [18,19]. However, the additional optimization makes the quantity more difficult to handle both computationally and analytically, and hence less suited for our present purpose. Nevertheless, our area laws also apply to this quantity as it is by definition smaller than I ∞ .

Measured Rényi divergence
We now present another quantity that can be efficiently computed numerically via variational algorithms. For classical states, i.e., probability distributions P and Q, the Rényi divergence is defined as [55] D_α(P ‖ Q) = (α − 1)^{−1} log Σ_{x∈χ} P(x)^α Q(x)^{1−α}. (7) From this, the measured Rényi divergence is defined as [56] D^M_α(ρ ‖ σ) = sup_M D_α(P_{ρ,M} ‖ P_{σ,M}), (8) with a supremum over all possible POVM measurements M, and P_{ρ,M}, P_{σ,M} the post-measurement states, i.e., the respective probability distributions over the measurement outcomes. Here, χ is the set of measurement outcomes, whose size can vary. We denote the corresponding mutual information by I^M_α(A : B) = D^M_α(ρ_AB ‖ ρ_A ⊗ ρ_B). (9) Again assuming the states to be given as MPDOs, this is an optimization in which the target function only contains products and traces of MPDOs, which can be efficiently computed for integer values of α > 1 with DMRG-type algorithms. In general, this can be done by optimizing over matrix product operators with purifications to enforce the positivity constraint.
For α = 2, we give an explicit expression for the optimizer. The positivity constraint can be dropped, as there cannot be an optimizer with negative eigenvalues. Using a vectorized notation |i⟩⟨j| → |i⟩|j⟩, we then obtain (see Appendix C.1) |ω⟩ = 2(σ ⊗ 1 + 1 ⊗ σ)^{−1} |ρ⟩. (10) A direct calculation of ω from this is inefficient due to the inverse involved, but one can instead determine ω variationally by minimizing ‖(σ ⊗ 1 + 1 ⊗ σ)|ω⟩ − 2|ρ⟩‖₂², which is a quadratic expression in |ω⟩ and can thus, again, be obtained using DMRG-like algorithms. For other values of α, the derivative of Eq. (9) is not linear in ω, and hence no such simple expression for the optimizer can be given.
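For small dense matrices, the stationarity condition 2ρ = σω + ωσ is a Sylvester equation and can be solved directly; the following sketch (function names are ours; the variational route above is the MPO-scalable alternative) evaluates D^M_2 at the optimizer:

```python
import numpy as np
from scipy.linalg import solve_sylvester

def measured_renyi2(rho, sigma):
    """D_2^M(rho||sigma) = log sup_w (2 Tr[rho w] - Tr[sigma w^2]),
    evaluated at the optimizer w solving sigma w + w sigma = 2 rho."""
    w = solve_sylvester(sigma, sigma, 2 * rho)
    w = 0.5 * (w + w.conj().T)  # symmetrize against round-off
    val = (2 * np.trace(rho @ w) - np.trace(sigma @ w @ w)).real
    return float(np.log(val))

rng = np.random.default_rng(3)

def rand_state(d):
    # random full-rank density matrix (real, for simplicity)
    m = rng.normal(size=(d, d))
    rho = m @ m.T
    return rho / np.trace(rho)

sigma = rand_state(3)
rho = rand_state(3)
```

Since the target function is concave in ω, the stationary point is the global maximum; for ρ = σ the optimizer is the identity and the divergence vanishes, and in general the result is nonnegative because ω = 1 is always admissible.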
As we will see in the following sections, the measured Rényi mutual information with α = 2 also fulfills all of the properties discussed below and may therefore be considered the most useful of the measured Rényi mutual informations.

Thermal area laws
In this section we present our main technical results: area laws for the Rényi mutual information in thermal states of local Hamiltonians. In the following, we consider a subset A of a lattice and its complement B. The Hamiltonian is split into three parts, H = H_A + H_B + H_I, with the interaction H_I supported on the boundary ∂A of A, which is chosen such that there are no terms with support in both A \ ∂A and B. Notice that in 1D, |∂A| is constant, independent of the system size.
We first introduce a technical lemma, which allows us to prove area laws in several special cases.
For any Hamiltonian defined as above, the maximal Rényi mutual information of a thermal state fulfills the bound stated in Lemma 1. The proof is in Appendix C.2. While the first term of the RHS directly scales with the boundary and corresponds to the result in [10], one still needs to bound the additional second term.
In the case of a commuting Hamiltonian, we have E_β = e^{βH_I} straightforwardly, and thus we find a corresponding area law. For Hamiltonians with non-commuting terms, it is no longer possible to cancel the bulk contributions in E_β directly, but the norm can still be bounded in many cases.
In 1D systems, we use a lemma from [57] (section 2.3.1 and Theorem 3.1) based on previous work by Araki [58]. The setting here is a finite bipartite chain A = [−N/2, a], B = [a + 1, N/2] with a cut after some site a.
This bound is no longer linear in β and the interaction strength, but it still proves the following area law, as it is independent of the size of A and B.
if 2βJk < 1, and the same bound holds for E_β^{−1}.
We give the proof in Appendix C.3. This results in our last area law based on Lemma 1: in the setting of the previous lemma and for βJk < 1, we have the corresponding bound. Both Lemmas 2 and 3 use an imaginary time-ordered integral to bound the norm of E_β, which follows from the Dyson series and the triangle inequality. The same bound holds for the inverse operator E_β^{−1}, with H_I replaced by −H_I on the right-hand side. The operator in the norm corresponds to an imaginary time evolution of H_I, which can be approximated with a Taylor series.
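The imaginary time-ordered integral invoked above can be sketched as follows (a reconstruction under the convention E_β = e^{βH} e^{−β(H_A+H_B)}, which we assume since it reduces to e^{βH_I} in the commuting case; signs and ordering may differ from the omitted display):

```latex
\begin{align}
  E_\beta &= \mathcal{T} \exp\!\left( \int_0^\beta e^{x H}\, H_I\, e^{-x H}\, \mathrm{d}x \right), \\
  \|E_\beta\| &\le \exp\!\left( \int_0^\beta \big\| e^{x H}\, H_I\, e^{-x H} \big\|\, \mathrm{d}x \right),
\end{align}
```

where the first line follows from differentiating E_β with respect to β and the second from the triangle inequality applied to the resulting Dyson series.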
This technique cannot be used to extend the area law to arbitrary dimensions and temperatures because there exists a 2D lattice Hamiltonian such that for sufficiently small temperatures the quantity e^{xH} A e^{−xH} / ‖A‖ diverges in the thermodynamic limit [63,64]. As pointed out in [64], this bound can be extended to the Bethe lattice, which can be seen as an intermediate case between one and higher dimensions.
As already mentioned, the previous theorems also extend to all measured Rényi mutual informations.
Let us now comment on the thermal area law for classical systems. For them, an area law for the mutual information independent of temperature and energy was shown in [10]. The idea behind it is the Markov property of classical thermal states, which reads P (x A |x B ) = P (x A |x ∂B ), where again the boundaries are defined as above such that there is no interaction between A and B \ ∂B and vice-versa. This means that all correlations between A and B are mediated through ∂B and also implies that the correlations cluster at the boundary, in the sense that I(A : B) = I(∂A : ∂B). This leads to a bound that only depends on the dimension of the boundary. This latter equality also holds for the Rényi mutual information defined using Rényi divergences, which allows for the following extension.
Theorem 4 (Classical temperature-independent thermal area law). For a classical system with local dimension d, we have for α ∈ (0, 1) ∪ (1,2]. Note that in the classical case the measured mutual information coincides with the analogous definition from Eq. (7). The proof, which uses the fact that every probability distribution majorizes the flat distribution, is given in Appendix C.4. A challenge in the Rényi case is that for fixed system size the mutual information is no longer bounded in general, but only for α ≤ 2. A simple example for two bits shows that it can be arbitrarily large for α > 2, and by extension for I ∞ (see Appendix C.4), in which case we can only give the temperature-dependent Theorem 1.

Rényi mutual information on tensor network states
Matrix product states and also their higher dimensional analog projected entangled pair states (PEPS) [65], have by construction a small bipartite entanglement entropy, bounded by the logarithm of the bond dimension D times the number of neighboring pairs across the boundary of the bipartition. The same holds for their mutual information, as for pure states it is equal to twice their entanglement entropy.
A natural question is whether this extends to the mutual information of projected entangled pair density operators (PEPDOs), their mixed state analog [2]. In [10], this question was answered positively for the mutual information I (A : B), using the additional assumption that the PEPDO has a local purification. This means that there exists a PEPS with a physical and an ancilla index of equal dimension on every site, whose partial trace over the ancillas equals the PEPDO 1 .
This result can be extended to the measured Rényi mutual information for a limited range of α (see Appendix C.5):

Theorem 5. For a PEPDO with local purification, bond dimension D, and |∂A| the number of bonds between A and B, it holds that I^M_α(A : B) ≤ 2|∂A| log D
for α ∈ (0, 1) ∪ (1, 2]. This can be proven by first noticing that the problem is equivalent to the pure state case if one considers the purification. Then, the trace over the ancillas of the PEPS, which yields the PEPDO, does not increase the mutual information, as it is a local operation. The remaining step is to compute the mutual information of a pure state ρ = |ψ⟩⟨ψ| and relate it to an entropy of the subsystems. This is similar to the von Neumann case, where I(A : B) = 2S(ρ_A). The entropy of the marginal is then bounded by the logarithm of the number of Schmidt values of the decomposition into A and B, which yields the desired bound. We give an analogous proof for the Rényi mutual information in Appendix C.5, valid for α ≤ 2. For α > 2, we present a simple counterexample on two qubits (D = |∂A| = 2) with arbitrarily large measured Rényi mutual information in the appendix.

¹ Notice that not all PEPDOs admit such a description [66,67].

Correlation functions
The mutual information quantifies both classical and quantum correlations [9]. Therefore, it seems intuitive that it should also impose a bound on correlation functions, and we show below that it does. The bound trivially extends to all other quantum Rényi divergences that fulfill the data-processing inequality.
The key technical result is a generalization of the quantum Pinsker's inequality, proven in Appendix C.6. The proof uses the same argument as for the relative entropy D(ρ ‖ σ), where the data-processing inequality is applied to a binary measurement [68]. The known equivalent classical result [69] can then be applied to the measurement outcome. The bound on the correlations, Eq. (21), follows using ‖X‖₁ ≥ Tr[XY]/‖Y‖, exactly as in [10].
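Both steps can be checked numerically on a random two-qubit state (a dense sketch; the helper names and the observables M_A, M_B are ours, and the Pinsker step is illustrated with the von Neumann mutual information):

```python
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(4)

def rand_state(d):
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real

def ptrace(rho, dA, dB, keep):
    r = rho.reshape(dA, dB, dA, dB)
    return np.einsum('ijkj->ik', r) if keep == 'A' else np.einsum('ijil->jl', r)

def rand_herm(d):
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return m + m.conj().T

op_norm = lambda M: np.abs(np.linalg.eigvalsh(M)).max()

rho = rand_state(4)
rhoA, rhoB = ptrace(rho, 2, 2, 'A'), ptrace(rho, 2, 2, 'B')
delta = rho - np.kron(rhoA, rhoB)
trace_norm = np.abs(np.linalg.eigvalsh(delta)).sum()  # ||delta||_1

MA, MB = rand_herm(2), rand_herm(2)
# connected correlator <M_A M_B> - <M_A><M_B>
corr = (np.trace(rho @ np.kron(MA, MB))
        - np.trace(rhoA @ MA) * np.trace(rhoB @ MB)).real
# mutual information I(A:B) = D(rho || rhoA (x) rhoB), in nats
I_AB = np.trace(rho @ (logm(rho) - logm(np.kron(rhoA, rhoB)))).real
```

The correlator is bounded by ‖ρ − ρ_A ⊗ ρ_B‖₁ ‖M_A‖ ‖M_B‖, and Pinsker's inequality bounds the trace norm by sqrt(2 I(A:B)).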

Conclusion
We have given alternative definitions of the mutual information. We have shown that, as a measure of bipartite correlations, they satisfy a number of desirable properties, including area laws for thermal states.
As a main advantage over the von Neumann mutual information, we provide DMRG-like algorithms to compute these quantities. This should help characterize correlations in mixed states in a more rigorous way than with the previously used I_α. It would be interesting for future numerical studies to evaluate the performance of these algorithms.
These Rényi mutual information measures differ from the widely used definition of Eq. (3), which has nonetheless found applications in, for instance, analyzing thermal phase transitions [13]. Our results do not rule out the possibility that I ∞ (A : B) has a wildly different behavior (such as a volume law) at thermal criticality, which occurs at low temperatures when D > 1. It would be interesting to study whether singularities in it appear at phase transition points or an area law holds with full generality.
These quantities may be a useful measure of correlations beyond thermal states, such as in dissipative dynamics, quantum quenches, and non-equilibrium steady states. We hope that our results motivate their study in the wider context of quantum many-body systems. An interesting future question is whether the smallness of any of these quantities guarantees an efficient approximation of mixed states via tensor networks, similar to how the Rényi entanglement entropy guarantees MPS approximations in the pure state case [3]. The Rényi entanglement of purification is known to play such a role [70], but is hard to calculate in practice.

A Quantum Rényi divergences
In this section we recall various definitions and facts for the different Rényi divergences, which will be useful for the proofs in the following sections.
The Petz Rényi divergence [28] is perhaps the most direct quantum analogue of the classical family of Rényi divergences, that is, for density operators that do not, in general, commute. It is defined for α ∈ (0, 1) ∪ (1, ∞) as D̄_α(ρ ‖ σ) = (α − 1)^{−1} log Tr[ρ^α σ^{1−α}] if supp(ρ) ⊆ supp(σ) and +∞ otherwise, where the inverse of σ is understood to be a pseudo-inverse. While this seems to be a straightforward generalization, several other definitions that collapse to the classical definition in the commuting case are possible. We will invoke two main alternatives for the purposes of this work. They are the sandwiched Rényi divergence [20,29], which for α ∈ [1/2, 1) ∪ (1, ∞) takes the form D̃_α(ρ ‖ σ) = (α − 1)^{−1} log Tr[(σ^{(1−α)/(2α)} ρ σ^{(1−α)/(2α)})^α], and the geometric Rényi divergence [30], which for α ∈ (0, 1) ∪ (1, ∞) reads D̂_α(ρ ‖ σ) = (α − 1)^{−1} log Tr[σ (σ^{−1/2} ρ σ^{−1/2})^α]. This geometric Rényi divergence is crucial in our proof of the thermal area law. These quantities share some important properties, such as being nondecreasing in α and satisfying a data-processing inequality under CPTP maps. While for the sandwiched and Petz Rényi divergences this data-processing inequality holds for the full range of α given above, for the geometric one, it is only known for α ≤ 2. Additionally, we have the inequalities D̃_α(ρ ‖ σ) ≤ D̄_α(ρ ‖ σ) ≤ D̂_α(ρ ‖ σ). (29) The sandwiched and Petz Rényi divergences converge to the Umegaki relative entropy in the limit α → 1, while the geometric one converges to the so-called Belavkin-Staszewski relative entropy [71]. For α → ∞, the sandwiched and geometric Rényi divergences converge to the maximal Rényi divergence D_∞. This is the main reason for using the geometric Rényi divergence as a tool in many of the following proofs: the statements for D_∞ are then derived by taking the limit α → ∞. For a more complete summary of the properties of Rényi divergences and their proofs, see [17] and the references therein.
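For α = 2 the ordering sandwiched ≤ Petz ≤ geometric ≤ maximal can be checked numerically on random full-rank qubit states (a dense sketch; helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(5)

def rand_state(d):
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real

def mpow(rho, p):
    # matrix power of a full-rank positive matrix via eigendecomposition
    ev, U = np.linalg.eigh(rho)
    return (U * ev**p) @ U.conj().T

def petz(rho, sigma, a):
    return float(np.log(np.trace(mpow(rho, a) @ mpow(sigma, 1 - a)).real) / (a - 1))

def sandwiched(rho, sigma, a):
    s = mpow(sigma, (1 - a) / (2 * a))
    return float(np.log(np.trace(mpow(s @ rho @ s, a)).real) / (a - 1))

def geometric(rho, sigma, a):
    si = mpow(sigma, -0.5)
    return float(np.log(np.trace(sigma @ mpow(si @ rho @ si, a)).real) / (a - 1))

def d_max(rho, sigma):
    # log of the largest eigenvalue of sigma^{-1/2} rho sigma^{-1/2}
    si = mpow(sigma, -0.5)
    return float(np.log(np.linalg.eigvalsh(si @ rho @ si).max()))

rho, sigma = rand_state(2), rand_state(2)
a = 2.0
vals = [sandwiched(rho, sigma, a), petz(rho, sigma, a),
        geometric(rho, sigma, a), d_max(rho, sigma)]
```

Note that at α = 2 the Petz and geometric values coincide by cyclicity of the trace, so the middle inequality is saturated there.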
In the main text we listed several expressions for the mutual information, Eqs. (1), (2a) and (2b), restated here as Eqs. (31a)-(31c). We can now introduce the Rényi mutual informations arising from Eq. (31b) by replacing the Umegaki relative entropy with the Rényi divergences above, denoted Ī_α and Ĩ_α, respectively. We note that all upper bounds for I_∞(A : B) are automatically also upper bounds to Ī_α and Ĩ_α due to the inequalities (29).
In the quantum information literature the mutual information is commonly defined in a different, inequivalent way: instead of starting from Eq. (31b), Eq. (31c) is taken, and the Umegaki relative entropy is replaced by a Rényi divergence. Again, a variety of definitions can be made by choosing the different Rényi divergences introduced above. There are operational interpretations in information theory for these quantities, for example in quantum hypothesis testing [27] or entanglement-assisted single-shot communication protocols [19]. Since they contain an additional optimization, these quantities are more difficult to compute from a practical standpoint, and we do not use them in this paper. However, all upper bounds (such as the area laws) trivially hold for them as well.

B Negative Rényi mutual information of a classical Ising chain
In this section, we present the calculation of the Rényi mutual information based on Rényi entropies for a classical Ising chain in its thermal state and show that it can become negative. We adapt the standard method of transfer matrices used to solve the classical Ising spin chain (see, e.g., [72]). The model consists of N spins z_i taking values ±1 and is placed on a ring, i.e., we use periodic boundary conditions. The Hamiltonian is given by −βH = Σ_{i=1}^{N} (J z_i z_{i+1} + h z_i), with the temperature included in the constants and the addition in the index understood modulo N. A general technical problem occurring in the calculation of mutual information is the breaking of translation symmetry due to the definition of distinct regions A and B. In order to deal with this problem, we put two subsystems A = {1, · · · , L} and B = {L + 1, · · · , 2L} on a chain of length N > 2L and calculate the mutual information in the limit of first taking N → ∞ and then L → ∞. We define the transfer matrix T with entries T_{z z′} = e^{J z z′ + h(z+z′)/2}, with λ_+ > λ_− its eigenvalues and |λ_+⟩, |λ_−⟩ the corresponding eigenvectors. Note that the Perron-Frobenius theorem guarantees that λ_+ is the unique largest eigenvalue (in norm) and also positive [73]. The computational basis is denoted by |±1⟩. We first calculate the partition function by rewriting it with a matrix power, Z = Tr[T^N] (37), and use this result to calculate probabilities as follows.
where in the above we used translation invariance and cyclicity of the trace. We now generalize this calculation to the probability of a configuration of several spins and apply it to a conditional probability.
In the second equality, the cancellations within the trace explicitly verify the Markov property of the periodic chain. In the limit, the dependency on σ_1 also vanishes, which can be understood from the perspective of correlations decaying with distance: in the large N limit they have to be mediated through an infinitely long region of the chain, and hence vanish.
We are now ready to calculate the Rényi entropies where we already used (39) to decompose the joint probability. Again, due to translation invariance, all the conditional probabilities are the same. We define where T α is no longer a stochastic matrix. With the same technique as before, (40) becomes and we obtain the Rényi mutual information To evaluate this expression, we use the diagonalization of T α , which reads for some invertible matrix D. The fact that T α is diagonalizable again results from the Perron-Frobenius theorem, which states that the largest eigenvalue has multiplicity one if the matrix has positive entries. Therefore, another eigenvalue exists, which proves diagonalizability. Additionally t + α , which we define to be the larger real eigenvalue, is also strictly larger in absolute value, which allows us to calculate the limit L → ∞ of the above expression.
Using the eigenvalues and eigenvectors of T_α (Eq. (41)) and T (Eq. (36)), one can easily derive an explicit analytic expression for the Rényi mutual information I_α(A : B). However, we omit the exact expression for simplicity, due to its length. Instead, we numerically evaluate the formula, with the resulting plots in Figure 1, which show the existence of a negative regime for antiferromagnetic coupling. Additionally, we find that the mutual information is not monotonic in α. As mentioned in the main text, the existence of a negative regime also proves that the mutual information violates nonincrease under local operations, because tracing out the A system is a local quantum operation that increases I_α(A : B) from a negative value to zero.
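For small rings, the same quantity can be cross-checked by brute-force enumeration; a sketch (our own sign convention, with β absorbed into J and h as in the text):

```python
import numpy as np
from itertools import product

def renyi(p, a):
    # classical Renyi entropy S_a of a probability vector
    p = p[p > 0]
    return float(np.log(np.sum(p ** a)) / (1 - a))

def ising_I_alpha(N, L, J, h, alpha):
    """I_alpha(A:B) for A = sites 0..L-1, B = sites L..2L-1 of a periodic
    classical Ising ring, by exact enumeration of all 2^N configurations."""
    joint, Z = {}, 0.0
    for s in product([-1, 1], repeat=N):
        # Boltzmann weight exp(sum_i J z_i z_{i+1} + h z_i), ring boundary
        w = np.exp(sum(J * s[i] * s[(i + 1) % N] + h * s[i] for i in range(N)))
        Z += w
        key = (s[:L], s[L:2 * L])
        joint[key] = joint.get(key, 0.0) + w
    pA, pB = {}, {}
    for (xA, xB), w in joint.items():
        pA[xA] = pA.get(xA, 0.0) + w
        pB[xB] = pB.get(xB, 0.0) + w
    SA = renyi(np.array(list(pA.values())) / Z, alpha)
    SB = renyi(np.array(list(pB.values())) / Z, alpha)
    SAB = renyi(np.array(list(joint.values())) / Z, alpha)
    return SA + SB - SAB
```

For J = 0 the spins are independent and I_α(A : B) vanishes exactly, which provides a convenient consistency check; scanning weak antiferromagnetic couplings at larger N then reproduces the negative regime of Figure 1.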

C.1 Optimizer for the measured Rényi-divergence with α = 2
For α = 2, the expression from the main text for the measured Rényi divergence becomes D^M_2(ρ ‖ σ) = log sup_{ω ≥ 0} (2Tr[ρω] − Tr[σω²]), with σ a state with full support. From [56], it is known that this expression has an optimizer. Here, we prove the explicit expression for the optimizer. We use a vectorized notation with the mapping |i⟩⟨j| → |i⟩|j⟩. We denote the target function by f(ω) = 2Tr[ρω] − Tr[σω²]. We can extend the supremum to all Hermitian operators, which does not change its value, because for any positive semidefinite operators ω_+, ω_− we have f(ω_+ − ω_−) ≤ f(ω_+). Given the optimizer ω, the linear term of f(ω + δω) in δω must vanish, as f is differentiable.
This vanishes for any δω if and only if 2ρ − σω − ωσ = 0. The solution to this linear equation can either be written using the invertible superoperator Φ(ω) = σω + ωσ as ω = 2Φ^{−1}(ρ) (49), or in vectorized notation as in Eq. (10), where also the invertibility of Φ for a σ with full rank becomes apparent.

C.2 Proof of Lemma 1
We restate the Lemma from the main text: for any Hamiltonian defined as in Section 3, the maximal Rényi mutual information of a thermal state fulfills the stated bound. The strategy we will follow in order to achieve an upper bound on D_∞ is to first prove the α-independent bound (51) for the geometric divergence. Then one can take the trivial limit α → ∞ and deduce that the same bound holds for D_∞. We start with the argument of the logarithm in D̂_α and assume α > 1 to be an integer. We use cyclicity of the trace, submultiplicativity of the operator norm, and Hölder's inequality for positive operators A and B. From (52), we continue with (54). The first norm is already of the desired form. For the second and third norms, we write (55), where we used the cyclicity of the partial trace on B with respect to operators supported only in B. The operator norm is just the inverse of the smallest eigenvalue of the partial trace. Let |ψ_A⟩ be the eigenvector on A corresponding to this eigenvalue and p_i, |φ_i⟩ an eigensystem of exp(−βH_B)/Z_B. Then we get (56) by bounding an expectation value of a positive operator by its minimum eigenvalue. Inserting this into (55) we get (57), and combining (54) with (57) yields (58), which completes the proof of (51) for any integer α > 1. The previous result extends to any α > 1 by rounding up to the next integer because of the monotonicity of D̂_α in α. The bound (51) also holds for D_α in the range α ∈ (0, 1) ∪ (1, 2] and for D_α for α ∈ (0, 1) ∪ (1, ∞), just by the inequalities between these Rényi divergences. Finally, to bound the ratio of partition functions Z/(Z_A Z_B), we repeat a simple proof from [74], Lemma 3.6. Together, (51) and (59) give the desired bound, which also directly proves the bound for D_∞, as the right-hand side does not depend on α and the limit α → ∞ can trivially be taken.

C.3 Proof of Lemma 3
We restate the lemma from the main text. It features a thermal Lieb-Robinson bound originally due to Ruelle [75], which has previously appeared in [59][60][61][62]: the stated bound holds if 2βJk < 1, and the same bound holds for E_β^{−1}.
As explained in the main text, we start from the expression (61) and get, using the Baker-Campbell-Hausdorff formula, Eq. (62), with the adjoint action ad_Y(X) = [Y, X], whose powers yield nested commutators. We bound them by Lemma 3 from Ref. [62], Eq. (63). The inequality ‖Σ_{i∈C} h_i‖ ≤ J|∂A| follows from the definitions. We insert (63) into (62) and evaluate the geometric series, Eq. (64), which converges if 2xJk < 1. Finally, inserting this into the integral (61), we obtain the claimed bound.
C.4 Proof of Theorem 4
In the classical case, all Rényi divergences, including the measured one, coincide, so it suffices to prove the statement for the classical Rényi divergence. The proof is split into two steps. First, we show that the mutual information between A and B equals the mutual information between the boundaries, and second, we give a dimension-dependent bound for the latter. We denote by X_A the random variable of configurations on the system A and use A° = A \ ∂A.
We prove the first identity where the second to last line comes from repeating all previous steps on the A side.
For the next step we introduce the short-hand notation p(x_A, x_B) = P(X_∂A = x_A, X_∂B = x_B) and use the dimension D_∂A∂B of the system supported over ∂A ∪ ∂B to find the bound (69). The last inequality uses the Schur concavity of Rényi entropies and holds for α ∈ (0, 1) ∪ (1, 2]. At this point, it might be natural to wonder if the above area law for classical thermal states can be extended to the case of α > 2. The following counterexample rules out a temperature-independent area law in this range. We take two bits x_A, x_B ∈ {0, 1} and choose the probability distribution diag(ε, 0, 0, 1 − ε) for a constant ε, written as a diagonal density matrix. The marginal probabilities are then p(0) = ε, p(1) = 1 − ε for both A and B. We find that the term for x_A = x_B = 0 in the sum in the first line of (69) reads ε^{2−α} and becomes arbitrarily large for sufficiently small ε. Strictly speaking, this example is not a thermal state, as it contains zero probabilities, but one can choose a sequence of thermal states for every fixed ε that converges to our example.
C.5 Rényi mutual information for pure states and proof of Theorem 5

Before proving Theorem 5 we give a technical lemma that might be of interest in its own right.
The following proof resembles a proof from [17], where analogous relations are given for I^opt_α(A : B) (see (33)), which even shows that Theorem 5 can be extended to I^opt_∞(A : B), as those relations hold for all α. We start with the sandwiched Rényi divergence and use the Schmidt decomposition of |ψ⟩, which also yields a representation of the reduced density matrix. Additionally, we define an auxiliary vector, which we can use to rewrite the divergence. This concludes the proof of (70). We prove the relation for the Petz Rényi divergence with a similar calculation using the same Schmidt decomposition of |ψ⟩ and |η⟩ = ρ