Sandwiched Rényi Convergence for Quantum Evolutions

We study the speed of convergence of a primitive quantum time evolution towards its fixed point in the distance of sandwiched Rényi divergences. For each of these distance measures the convergence is typically exponentially fast and the best exponent is given by a constant (similar to a logarithmic Sobolev constant) depending only on the generator of the time evolution. We establish relations between these constants and the logarithmic Sobolev constants as well as the spectral gap. An important consequence of these relations is the derivation of mixing time bounds for time evolutions directly from logarithmic Sobolev inequalities without relying on notions like $l_p$-regularity. We also derive strong converse bounds for the classical capacity of a quantum time evolution and apply these to obtain bounds on the classical capacity of some examples, including stabilizer Hamiltonians under thermal noise.


Introduction
Consider a quantum system affected by Markovian noise modeled by a quantum dynamical semigroup $T_t$ (with time parameter $t \in \mathbb{R}^+$) driving every initial state towards a unique full rank state $\sigma$. Using the framework of logarithmic Sobolev inequalities as introduced in [1,2], the speed of the convergence towards the fixed point can be studied. Specifically, the $\alpha_1$-logarithmic Sobolev constant (see [1,2]) is the optimal exponent $\alpha \in \mathbb{R}^+$ such that the inequality
$$D(T_t(\rho)\|\sigma) \le e^{-2\alpha t} D(\rho\|\sigma) \tag{1}$$
holds for the quantum Kullback-Leibler divergence, given by $D(\rho\|\sigma) = \mathrm{tr}[\rho(\ln(\rho) - \ln(\sigma))]$, for all $t \in \mathbb{R}^+$ and all states $\rho$.
The framework of logarithmic Sobolev constants is closely linked to properties of noncommutative $l_p$-norms, and specifically to hypercontractivity [1,2]. Noncommutative $l_p$-norms also appeared recently in the definition of generalized Rényi divergences (so-called "sandwiched Rényi divergences" [3,4]). It is therefore natural to study the relationship between logarithmic Sobolev inequalities and noncommutative $l_p$-norms more closely. The approach used here is to define constants (which we call $\beta_p$ for a parameter $p \in [1, \infty)$), which resemble the logarithmic Sobolev constants, but where the distance measure is a sandwiched Rényi divergence instead of the quantum Kullback-Leibler divergence. More specifically, the constants $\beta_p$ will be the optimal exponents such that inequalities of the form (1) hold for the sandwiched Rényi divergences $D_p$, given by
$$D_p(\rho\|\sigma) = \frac{1}{p-1}\ln\left(\mathrm{tr}\left[\left(\sigma^{\frac{1-p}{2p}}\rho\,\sigma^{\frac{1-p}{2p}}\right)^p\right]\right), \tag{2}$$
instead of the quantum Kullback-Leibler divergence $D$.
Our main results are two-fold:
• We derive inequalities between the new $\beta_p$ and other quantities such as logarithmic Sobolev constants and the spectral gap of the generator of the time evolution. These inequalities not only reveal basic properties of the $\beta_p$, but can also be used as a technical tool to strengthen results involving logarithmic Sobolev constants.
• We apply our framework to derive bounds on the mixing time of quantum dynamical semigroups. Using the interplay between the $\beta_p$ and the logarithmic Sobolev constants we show how to derive a mixing time bound with the same scaling as the one derived in [2] directly from a logarithmic Sobolev constant. Previously, this was only known under the additional assumption of $l_p$-regularity (see [2]) of the generator or for the $\alpha_1$-logarithmic Sobolev constant. It is still an open question whether $l_p$-regularity holds for all primitive generators.
As an additional application of our methods we derive time-dependent strong converse bounds on the classical capacity of a quantum dynamical semigroup. We apply these to some examples of systems under thermal noise. These include stabilizer Hamiltonians, such as the 2D toric code, and a truncated harmonic oscillator. To the best of our knowledge, these are the first bounds available on the classical capacity of these channels. We also apply our bound to depolarizing channels, whose classical capacity is known [5], to benchmark our findings.

Notation and Preliminaries
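Throughout this paper $M_d$ will denote the space of $d \times d$ complex matrices. We will denote by $D_d$ the set of $d$-dimensional quantum states, i.e. positive semi-definite matrices $\rho \in M_d$ with trace 1. By $M_d^+$ we denote the set of positive definite matrices and by $D_d^+ = M_d^+ \cap D_d$ the set of full rank states. In [3,4] the following definition of sandwiched quantum Rényi divergences was proposed:
Definition 2.1 (Sandwiched p-Rényi divergence). Let $\rho, \sigma \in D_d$. For $p \in (0, 1) \cup (1, \infty)$, the sandwiched $p$-Rényi divergence is defined as
$$D_p(\rho\|\sigma) = \begin{cases} \frac{1}{p-1}\ln\left(\mathrm{tr}\left[\left(\sigma^{\frac{1-p}{2p}}\rho\,\sigma^{\frac{1-p}{2p}}\right)^p\right]\right) & \text{if } \ker(\sigma) \subseteq \ker(\rho) \text{ or } p \in (0,1), \\ +\infty & \text{otherwise,} \end{cases} \tag{3}$$
where $\ker(\sigma)$ is the kernel of $\sigma$.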
Note that we are using a different normalization than in [3,4], which is more convenient for our purposes. The logarithm in our definition is in base e, while theirs is in base 2. When we write log in later sections we will mean the logarithm in base 2.
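The sandwiched Rényi divergences increase monotonically in the parameter $p \ge 1$ (see [7, Theorem 7]) and we have
$$D(\rho\|\sigma) \le D_p(\rho\|\sigma) \le D_q(\rho\|\sigma) \le D_\infty(\rho\|\sigma) \tag{4}$$
for any $q \ge p \ge 1$ and all $\rho, \sigma \in D_d$, where $D$ and $D_\infty$ denote the limits $p \to 1$ and $p \to \infty$ of $D_p$. Next we state two simple consequences of this ordering, which will be useful later.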
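Lemma 2.1. For any $\sigma \in D_d^+$ and $p \ge 1$ we have
$$\sup_{\rho \in D_d} D_p(\rho\|\sigma) = \ln(\|\sigma^{-1}\|_\infty). \tag{5}$$
Proof. Using (4) for $\rho \in D_d$ we have
$$D_p(\rho\|\sigma) \le D_\infty(\rho\|\sigma) = \ln\left(\|\sigma^{-1/2}\rho\,\sigma^{-1/2}\|_\infty\right) \le \ln(\|\sigma^{-1}\|_\infty).$$
Here we used that any quantum state $\rho \in D_d$ fulfills $\rho \le \mathbb{1}_d$. Clearly, choosing $\rho = |v_{\min}\rangle\langle v_{\min}|$ for an eigenvector $|v_{\min}\rangle \in \mathbb{C}^d$ corresponding to the eigenvalue $\|\sigma^{-1}\|_\infty$ of $\sigma^{-1}$ achieves equality in the previous bound.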
Using (4) together with the well-known Pinsker inequality [8, Theorem 3.1] for the quantum Kullback-Leibler divergence we have
$$\frac{1}{2}\|\rho - \sigma\|_1^2 \le D(\rho\|\sigma) \le D_p(\rho\|\sigma) \tag{6}$$
for any $p \ge 1$ and all $\rho, \sigma \in D_d$. The constant $\frac{1}{2}$ has been shown to be optimal in the classical case (see [9]), i.e. restricting to $\rho$ that commute with $\sigma$, and is therefore also optimal here.
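The ordering in the parameter p, the bound by ln of the largest eigenvalue of σ^{-1}, and the Pinsker-type bound can be checked numerically in the commuting case mentioned above. The following is a minimal sketch, assuming that ρ and σ are diagonal in a common basis, so that the sandwiched expression reduces to the classical sum over r_i^p s_i^(1-p). The function name `renyi_divergence` and the test states are illustrative choices, not taken from the paper.

```python
import math

def renyi_divergence(rho, sigma, p):
    """Sandwiched p-Renyi divergence (natural log) for commuting states.

    rho and sigma are lists of eigenvalues in a shared eigenbasis, so
    tr[(s^((1-p)/2p) r s^((1-p)/2p))^p] collapses to sum_i r_i^p s_i^(1-p).
    sigma is assumed to have full rank.
    """
    if p == 1:
        # Limit p -> 1: the Kullback-Leibler divergence.
        return sum(r * math.log(r / s) for r, s in zip(rho, sigma) if r > 0)
    if math.isinf(p):
        # Limit p -> infinity: ln of the largest ratio r_i / s_i.
        return math.log(max(r / s for r, s in zip(rho, sigma)))
    total = sum(r ** p * s ** (1 - p) for r, s in zip(rho, sigma))
    return math.log(total) / (p - 1)

sigma = [0.5, 0.3, 0.2]
rho = [0.7, 0.2, 0.1]
orders = [1, 1.5, 2, 5, math.inf]
values = [renyi_divergence(rho, sigma, p) for p in orders]

# ln ||sigma^{-1}||_inf, the largest value any D_p can take for this sigma.
upper = math.log(1 / min(sigma))

# Left-hand side of the Pinsker-type bound, (1/2) ||rho - sigma||_1^2.
pinsker = 0.5 * sum(abs(r - s) for r, s in zip(rho, sigma)) ** 2

# A pure state on the eigenvector of the smallest eigenvalue of sigma.
pure = renyi_divergence([0.0, 0.0, 1.0], sigma, 2)
```

In this example `values` is non-decreasing along `orders`, every entry stays below `upper`, and the pure state attains `upper`.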

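Noncommutative $l_p$-spaces
In the following $\sigma \in D_d^+$ will denote a full rank reference state. For $p \ge 1$ we define the noncommutative $p$-norm with respect to $\sigma$ as
$$\|X\|_{p,\sigma} = \left(\mathrm{tr}\left[\left|\sigma^{\frac{1}{2p}} X \sigma^{\frac{1}{2p}}\right|^p\right]\right)^{\frac{1}{p}}$$
for $X \in M_d$. We introduce the weighting operator $\Gamma_\sigma : M_d \to M_d$ as
$$\Gamma_\sigma(X) = \sigma^{\frac{1}{2}} X \sigma^{\frac{1}{2}}.$$
For powers of the weighting operator we set
$$\Gamma_\sigma^p(X) = \sigma^{\frac{p}{2}} X \sigma^{\frac{p}{2}}$$
for $p \in \mathbb{R}$ and $X \in M_d$. We define the so-called power operator $I_{p,q} : M_d \to M_d$ as
$$I_{p,q}(X) = \Gamma_\sigma^{-\frac{1}{p}}\left(\left|\Gamma_\sigma^{\frac{1}{q}}(X)\right|^{\frac{q}{p}}\right) \tag{8}$$
for $X \in M_d$. It can be verified that
$$\|I_{p,q}(X)\|_{p,\sigma}^p = \|X\|_{q,\sigma}^q$$
for any $X \in M_d$. As in the commutative theory, the noncommutative $l_2$-space turns out to be a Hilbert space, where the weighted scalar product is given by
$$\langle X, Y\rangle_\sigma = \mathrm{tr}\left[\Gamma_\sigma(X^\dagger)\, Y\right]$$
for $X, Y \in M_d$. With the above notions we can express the sandwiched $p$-Rényi divergence (3) for $p > 1$ in terms of a noncommutative $l_p$-norm as
$$D_p(\rho\|\sigma) = \frac{1}{p-1}\ln\left(\|\Gamma_\sigma^{-1}(\rho)\|_{p,\sigma}^p\right).$$
For a state $\rho \in D_d$ the positive matrix $\Gamma_\sigma^{-1}(\rho) \in M_d$ is called the relative density of $\rho$ with respect to $\sigma$. Note that any $X \ge 0$ with $\|X\|_{1,\sigma} = 1$ can be written as $X = \Gamma_\sigma^{-1}(\rho)$ for some state $\rho \in D_d$. We will simply call operators $X \ge 0$ that satisfy $\|X\|_{1,\sigma} = 1$ relative densities when the reference state is clear.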
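To make the objects of this subsection concrete, the sketch below evaluates them in the commuting case, where X and σ are diagonal in the same basis. Under that assumption the weighted p-norm becomes (Σ_i s_i |x_i|^p)^(1/p), the relative density of ρ has entries r_i / s_i, and the power operator acts entrywise as |x_i|^(q/p). The helper names are hypothetical and the formulas are the standard ones from the weighted-norm literature rather than a transcription of the paper's equations.

```python
import math

def weighted_norm(x, sigma, p):
    """Weighted p-norm ||X||_{p,sigma} of a diagonal X with entries x."""
    return sum(s * abs(v) ** p for v, s in zip(x, sigma)) ** (1 / p)

def relative_density(rho, sigma):
    """Entries of Gamma_sigma^{-1}(rho) when rho and sigma commute."""
    return [r / s for r, s in zip(rho, sigma)]

def power_operator(x, p, q):
    """Commuting-case action of I_{p,q}: entrywise |x|^(q/p)."""
    return [abs(v) ** (q / p) for v in x]

def renyi_from_norm(rho, sigma, p):
    """D_p written as ln(||X||_{p,sigma}^p) / (p - 1) for p > 1."""
    x = relative_density(rho, sigma)
    return math.log(weighted_norm(x, sigma, p) ** p) / (p - 1)

sigma = [0.5, 0.3, 0.2]
rho = [0.7, 0.2, 0.1]
x = relative_density(rho, sigma)

norm_1 = weighted_norm(x, sigma, 1)   # relative densities have 1-norm one
norm_2 = weighted_norm(x, sigma, 2)
d_2_direct = math.log(sum(r * r / s for r, s in zip(rho, sigma)))
d_2_norm = renyi_from_norm(rho, sigma, 2)

# Norm identity for the power operator: ||I_{p,q}(X)||_p^p = ||X||_q^q.
p, q = 3.0, 2.0
lhs = weighted_norm(power_operator(x, p, q), sigma, p) ** p
rhs = weighted_norm(x, sigma, q) ** q
```

The two ways of computing D_2 agree, and the power operator identity holds up to floating-point error.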
We refer to [1,2] and references therein for proofs and more details about the concepts introduced in this section.

Quantum dynamical semigroups
A family of quantum channels, i.e. trace-preserving completely positive maps, $\{T_t\}_{t \in \mathbb{R}_0^+}$, $T_t : M_d \to M_d$, parametrized by a non-negative parameter $t \in \mathbb{R}_0^+$ is called a quantum dynamical semigroup if $T_0 = \mathrm{id}_d$ (the identity map in $d$ dimensions), $T_{t+s} = T_t \circ T_s$ for any $s, t \in \mathbb{R}_0^+$ and $T_t$ depends continuously on $t$. Any quantum dynamical semigroup can be written as $T_t = e^{tL}$ (see [10,11]) for a Liouvillian $L : M_d \to M_d$ of the form
$$L(X) = S(X) - \kappa X - X \kappa^\dagger,$$
where $\kappa \in M_d$ and $S : M_d \to M_d$ is a completely positive map with $S^*(\mathbb{1}_d) = \kappa + \kappa^\dagger$. Here $S^*$ is the adjoint of $S$ with respect to the Hilbert-Schmidt scalar product. We will also deal with tensor powers of semigroups. For a quantum dynamical semigroup $\{T_t\}_{t \in \mathbb{R}^+}$ with Liouvillian $L$ we denote by $L^{(n)}$ the Liouvillian of the quantum dynamical semigroup $\{T_t^{\otimes n}\}_{t \in \mathbb{R}^+}$.
In the following we will consider quantum dynamical semigroups having a full rank fixed point $\sigma \in D_d^+$, i.e. the Liouvillian generating the semigroup fulfills $L(\sigma) = 0$ (implying that $e^{tL}(\sigma) = \sigma$ for any time $t \in \mathbb{R}_0^+$). We call a quantum dynamical semigroup (or the Liouvillian generator) primitive if it has a unique full rank fixed point $\sigma$. In this case for any initial state $\rho \in D_d$ we have $\rho_t = e^{tL}(\rho) \to \sigma$ as $t \to \infty$ (see [12, Theorem 14]).
The notion of primitivity can also be defined for discrete semigroups of quantum channels. For a quantum channel T : M d → M d we will sometimes consider the discrete semigroup {T n } n∈N . Similar to the continuous case we will call this semigroup (or the channel T ) primitive if there is a unique full rank state σ ∈ D + d with lim n→∞ T n (ρ) = σ for any ρ ∈ D d . We refer to [12] for other characterizations of primitive channels and sufficient conditions for primitivity.
To study the convergence of a primitive semigroup to its fixed point $\sigma$ we introduce the time evolution of the relative density $X_t = \Gamma_\sigma^{-1}(\rho_t)$. For any Liouvillian $L : M_d \to M_d$ with full rank fixed point $\sigma \in D_d^+$ we define
$$\hat{L} = \Gamma_\sigma^{-1} \circ L \circ \Gamma_\sigma \tag{11}$$
to be the generator of the time evolution of the relative density. Indeed it can be checked that
$$X_t = \Gamma_\sigma^{-1}\left(e^{tL}(\rho)\right) = e^{t\hat{L}}(X)$$
for any state $\rho \in D_d$ and relative density $X = \Gamma_\sigma^{-1}(\rho)$. Note that $\|X_t\|_{1,\sigma} = \|X\|_{1,\sigma} = 1$ for all $t \in \mathbb{R}_0^+$. Clearly the semigroup generated by $\hat{L}$ is completely positive and unital, but it is not trace-preserving in general. In the special case where
$$\Gamma_\sigma \circ L^* = L \circ \Gamma_\sigma \tag{12}$$
the map $\hat{L}$ generates the adjoint of the initial semigroup, i.e. the corresponding time evolution in the Heisenberg picture. A semigroup fulfilling (12) is called reversible (or said to fulfill detailed balance), and in this case the Liouvillian $\hat{L}$ is a Hermitian operator w.r.t. the $\sigma$-weighted scalar product. We again refer to [2,1] for more details on these topics. For discrete semigroups we similarly set $\hat{T} = \Gamma_\sigma^{-1} \circ T \circ \Gamma_\sigma$.
One important class of semigroups are Davies generators, which describe a system weakly coupled to a thermal bath under an appropriate approximation [13]. Describing them in detail goes beyond the scope of this article and here we will only review their most basic properties. We refer to [14,15,16] for more details.
Suppose that we have a system of dimension $d$ weakly coupled to a thermal bath of dimension $d_B$ at inverse temperature $\beta > 0$. Consider a Hamiltonian $H_{tot} \in M_d \otimes M_{d_B}$ of the system and the bath of the form
$$H_{tot} = H \otimes \mathbb{1}_B + \mathbb{1}_S \otimes H_B + H_I,$$
where $H \in M_d$ is the Hamiltonian of the system, $H_B \in M_{d_B}$ that of the bath and
$$H_I = \sum_\alpha S_\alpha \otimes B_\alpha \tag{13}$$
describes the interaction between the system and the bath. Here the operators $S_\alpha$ and $B_\alpha$ are self-adjoint. Let $\{\lambda_k\}_{k \in [d]}$ be the spectrum of the Hamiltonian $H$. We then define the Bohr frequencies $\omega_{i,j}$ to be given by the differences of eigenvalues of $H$, that is, $\omega_{i,j} = \lambda_i - \lambda_j$ for different values of $\lambda$. We will drop the indices on $\omega$ from now on to avoid cumbersome notation, as is usually done. Moreover, we introduce operators $S_\alpha(\omega)$ which are the Fourier components of the coupling operators $S_\alpha$ and satisfy
$$e^{iHt} S_\alpha e^{-iHt} = \sum_\omega S_\alpha(\omega) e^{-i\omega t}.$$
The canonical form of the Davies generator at inverse temperature $\beta > 0$ in the Heisenberg picture, $L_\beta^*$, is then given by
$$L_\beta^*(X) = i[H, X] + \sum_{\omega,\alpha} G_\alpha(\omega)\left(S_\alpha(\omega)^\dagger X S_\alpha(\omega) - \frac{1}{2}\{S_\alpha(\omega)^\dagger S_\alpha(\omega), X\}\right).$$
Here $\{X, Y\} = XY + YX$ is the anticommutator and $G_\alpha : \mathbb{R} \to \mathbb{R}$ are the transition rate functions. Their form depends on the choice of the bath model [15]. For our purposes it will be enough to assume that these are functions that satisfy the KMS condition [17], that is,
$$G_\alpha(-\omega) = G_\alpha(\omega) e^{-\beta\omega}.$$
Although this presentation of the Davies generators is admittedly very short, for our purposes it will be enough to note that under some assumptions on the operators $S_\alpha(\omega)$ [18,19] and on the transition rate functions, the semigroup generated by $L_\beta$ converges to the thermal state $\frac{e^{-\beta H}}{\mathrm{tr}(e^{-\beta H})}$ and is reversible [17].
In the examples considered here this will always be the case.
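The role of the KMS condition can be illustrated on a toy model. The sketch below is not a Davies generator. It is a classical two-level rate equation for the populations of a ground state (energy 0) and an excited state (energy ω > 0), with a decay rate `g_down` and an excitation rate obtained from a KMS-type relation `g_up = g_down * exp(-beta * omega)`. All names and parameter values are assumptions made for illustration. With these rates the Gibbs distribution is stationary and detailed balance holds.

```python
import math

def gibbs_state(energies, beta):
    """Populations proportional to exp(-beta * E)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def two_level_rate_matrix(g_down, omega, beta):
    """Rate matrix Q with d/dt (p0, p1) = Q (p0, p1).

    g_down is the rate excited -> ground; the reverse rate follows from
    the KMS-type relation g_up = g_down * exp(-beta * omega).
    """
    g_up = g_down * math.exp(-beta * omega)
    return [[-g_up, g_down],
            [g_up, -g_down]]

def apply(matrix, vector):
    """Matrix-vector product for small dense lists."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

beta, omega, g_down = 1.3, 2.0, 0.4
q = two_level_rate_matrix(g_down, omega, beta)
thermal = gibbs_state([0.0, omega], beta)

# Time derivative of the populations when started in the Gibbs state.
drift = apply(q, thermal)

# Detailed balance: flow ground -> excited equals flow excited -> ground.
flow_up = q[1][0] * thermal[0]
flow_down = q[0][1] * thermal[1]
```

Both entries of `drift` vanish up to rounding, which is the classical counterpart of the statement that the thermal state is the fixed point of a reversible generator.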

Logarithmic Sobolev inequalities and the spectral gap
To study hypercontractive properties and convergence times of primitive quantum dynamical semigroups the framework of logarithmic Sobolev inequalities has been developed in [1,2]. Here we will briefly introduce this theory. For more details and proofs see [1,2] and the references therein. We define the operator valued relative entropy (for p > 1) of X ∈ M + d as With this we can define the p-relative entropy: Definition 2.2 (p-relative entropy). For any full rank σ ∈ M + d and p > 1 we define the p-relative entropy of X ∈ M + d as where 1 q + 1 p = 1. For p = 1 we can consistently define by taking the limit p → 1.
The p-relative entropy is not a divergence in the information-theoretic sense (e.g. it is not contractive under quantum channels). It was originally introduced to study hypercontractive properties of semigroups in [1], where they also show it is positive for positive operators. There is however a connection to the quantum relative entropy as As a special case of the last equation we have We may also use it to obtain an expression for Ent 2,σ : We also need Dirichlet forms to define logarithmic Sobolev inequalities: denotes the generator of the time evolution of the relative density (cf. (11)). For p = 1 we may take the limit p → 1 and consistently define the 1-Dirichlet form by Formally, by making this choice we introduce the logarithmic Sobolev framework for L (i.e. the generator of the time-evolution of the relative density) instead of L * . While this is a slightly different definition compared to [2], where the Heisenberg picture is used, they are the same for reversible Liouvillians.
In [1] the Dirichlet forms were introduced to study hypercontractive properties of semigroups. As we will see in Theorem 3.1, they appear naturally when we compute the entropy production of the Sandwiched Rényi divergences. From Corollary 3.1 we will be able to infer that the Dirichlet form is positive for positive operators, a fact already proved in [1]. Both the Ent p,σ and the Dirichlet form are intimately related to hypercontractive properties of semigroups, as we have for a relative density X, some constant α > 0 and as shown in [1]. Notice that when working with E L 2 we may always suppose the Liouvillian is reversible without loss of generality. This follows from the fact that We can now introduce the logarithmic Sobolev constants: As Ent 2,σ does not depend on L and, as remarked before, E L 2 is invariant under an additive symmetrization, we may always assume without loss of generality that the Liouvillian is reversible when working with α 2 .
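For any $X \in M_d^+$ we can define its variance with respect to $\sigma \in D_d^+$ as
$$\mathrm{Var}_\sigma(X) = \|X\|_{2,\sigma}^2 - \|X\|_{1,\sigma}^2.$$
This defines a distance measure to study the convergence of the semigroup. Given a Liouvillian $L : M_d \to M_d$ with fixed point $\sigma \in D_d^+$ we define its spectral gap as
$$\lambda(L) = \inf_{X \in M_d^+} \frac{\mathcal{E}_2^L(X)}{\mathrm{Var}_\sigma(X)}, \tag{18}$$
where $\hat{L} : M_d \to M_d$ is given by (11). We can always assume the Liouvillian to be reversible when dealing with the spectral gap, as it again depends on $\mathcal{E}_2^L$. The spectral gap can be used to bound the convergence in the variance (see [20]), as for any $X \in M_d^+$ we have
$$\frac{d}{dt}\mathrm{Var}_\sigma(X_t) = -2\,\mathcal{E}_2^L(X_t) \le -2\lambda(L)\,\mathrm{Var}_\sigma(X_t)$$
and so
$$\mathrm{Var}_\sigma(X_t) \le e^{-2\lambda(L)t}\,\mathrm{Var}_\sigma(X).$$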
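The exponential decay of the variance can be checked on the simplest example, the semigroup depolarizing to σ, restricted to operators that commute with σ. The sketch assumes the variance Var_σ(X) = ||X||²_{2,σ} − ||X||²_{1,σ}, the 2-Dirichlet form E_2(X) = −⟨X, L̂(X)⟩_σ, and the generator L̂(X) = tr(σX)·1 − X of the relative density, for which the evolution is X_t = e^{-t} X + (1 − e^{-t}) tr(σX)·1. The function names are hypothetical.

```python
import math

def weighted_mean(x, sigma):
    """tr(sigma X) for diagonal X with entries x."""
    return sum(s * v for v, s in zip(x, sigma))

def variance(x, sigma):
    """Var_sigma(X) = ||X||_{2,sigma}^2 - ||X||_{1,sigma}^2 for X >= 0."""
    m = weighted_mean(x, sigma)
    return sum(s * v * v for v, s in zip(x, sigma)) - m * m

def dirichlet_form_2(x, sigma):
    """E_2(X) = -<X, L_hat(X)>_sigma with L_hat(X) = tr(sigma X) 1 - X."""
    m = weighted_mean(x, sigma)
    return -sum(s * v * (m - v) for v, s in zip(x, sigma))

def evolve_depolarizing(x, sigma, t):
    """X_t = e^{-t} X + (1 - e^{-t}) tr(sigma X) 1."""
    m = weighted_mean(x, sigma)
    w = math.exp(-t)
    return [w * v + (1 - w) * m for v in x]

sigma = [0.5, 0.3, 0.2]
x = [1.4, 2.0 / 3.0, 0.5]          # relative density of rho = (0.7, 0.2, 0.1)

var_0 = variance(x, sigma)
gap_quotient = dirichlet_form_2(x, sigma) / var_0   # equals the gap here

t = 0.8
var_t = variance(evolve_depolarizing(x, sigma, t), sigma)
predicted = math.exp(-2 * t) * var_0
```

For this generator the quotient E_2/Var equals one for every X, so the variance decays exactly like e^{-2t}; for a general Liouvillian only the inequality holds.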

Convergence rates for sandwiched Rényi divergences
In this section we consider the sandwiched Rényi divergences of a state evolving under a primitive quantum dynamical semigroup and the fixed point of this semigroup. It is clear that these quantities converge to zero as the time-evolved state approaches the fixed point. To study the speed of this convergence we introduce a differential inequality, which can be seen as an analogue of the logarithmic Sobolev inequalities for sandwiched Rényi divergences.

Rényi-entropy production
In [18] the entropy production for the quantum Kullback-Leibler divergence of a Liouvillian was computed. We will now derive a similar expression for the entropy production for the p-Rényi divergences for p > 1.
Using the relative density X = Γ −1 σ (ρ) and (11) this expression can be written as: Proof. Rewriting the p-Rényi divergence in terms of the relative density X = Γ −1 σ (ρ) and the corresponding generatorL By the chain rule Define the curve γ : As the differential of the function X → X p at A ∈ M + d is given by pA p−1 , another application of the chain rule yields It is easy to check that dγ . Inserting this in the above equations and writing it in terms of the power operator (8) we finally obtain Expanding this formula gives (21).
By recognizing the p-Dirichlet form in the previous theorem we get: where we used the relative density X = Γ −1 σ (ρ). As we remarked before, Corollary 3.1 implies that the Dirichlet form is always positive for relative densities. To see this, recall that the divergences contract under quantum channels [7] and therefore we have that d dt D p (e tL (ρ) σ) for λ > 0, this shows that it is positive for all positive operators by properly normalizing X.

Sandwiched Rényi convergence rates
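For any $p > 1$ we introduce the functional $\kappa_p : M_d^+ \to \mathbb{R}$ as
$$\kappa_p(X) = \frac{1}{p-1}\|X\|_{p,\sigma}^p \ln\left(\frac{\|X\|_{p,\sigma}^p}{\|X\|_{1,\sigma}^p}\right)$$
for $X \in M_d^+$. For $p = 1$ we may again take the limit $p \to 1$ and obtain $\kappa_1(X) := \lim_{p \to 1} \kappa_p(X) = \mathrm{Ent}_{1,\sigma}(X)$. Note that $\kappa_p$ is well-defined and non-negative as $\|X\|_{p,\sigma} \ge \|X\|_{1,\sigma}$ for $p \ge 1$. Strictly speaking the definition also depends on a reference state $\sigma \in D_d^+$, which we usually omit as it is always the fixed point of the primitive Liouvillian under consideration.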
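Definition 3.1. For a primitive Liouvillian $L : M_d \to M_d$ with full rank fixed point $\sigma \in D_d^+$ and $p \ge 1$ we define
$$\beta_p(L) = \sup\left\{\beta \in \mathbb{R}^+ : \beta\,\kappa_p(X) \le \mathcal{E}_p^L(X) \text{ for all } X \in M_d^+\right\}. \tag{27}$$
Note that as a special case we have $\alpha_1(L) = \beta_1(L)$. It should also be emphasized that the condition in the previous definition is imposed for every positive definite $X \in M_d^+$ and not only for relative densities. However, it is easy to see that we can equivalently write
$$\beta_p(L) = \inf\left\{\frac{\mathcal{E}_p^L(X)}{\kappa_p(X)} : X > 0,\ \|X\|_{1,\sigma} = 1\right\}, \tag{28}$$
as replacing $X \to X/\|X\|_{1,\sigma}$ does not change the value of the quotient $\mathcal{E}_p^L(X)/\kappa_p(X)$. Therefore, to compute $\beta_p$ it is enough to optimize over relative densities (i.e. $X > 0$ fulfilling $\|X\|_{1,\sigma} = 1$). By inserting $\beta_p$ into (26) we have
$$\frac{d}{dt} D_p(e^{tL}(\rho)\|\sigma) \le -2\beta_p(L)\, D_p(e^{tL}(\rho)\|\sigma)$$
for any $\rho \in D_d$ and Liouvillian $L : M_d \to M_d$ with full rank fixed point $\sigma \in D_d^+$. By integrating this differential inequality we get
$$D_p(e^{tL}(\rho)\|\sigma) \le e^{-2\beta_p(L)t}\, D_p(\rho\|\sigma)$$
for all $t \in \mathbb{R}^+$.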

Computing β p in simple cases
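In general it is not clear how to compute $\beta_p$ and it does not depend on spectral data of $L$ alone. This is not surprising, as the computation of the usual logarithmic Sobolev constants $\alpha_2$ or $\alpha_1$ is also challenging and the exact values are only known for few Liouvillians [21,2,22]. In the following we compute $\beta_2$ for the depolarizing semigroups.
Theorem 3.3 ($\beta_2$ for depolarizing semigroups). Let $L_\sigma(\rho) = \mathrm{tr}(\rho)\sigma - \rho$ be the depolarizing Liouvillian with fixed point $\sigma \in D_d^+$. Then
$$\beta_2(L_\sigma) = \frac{1 - \frac{1}{\|\sigma^{-1}\|_\infty}}{\ln(\|\sigma^{-1}\|_\infty)}.$$
Proof. Without loss of generality we can restrict to $X > 0$ with $\|X\|_{1,\sigma} = 1$ in the minimization (28). Observe that the generator of the time evolution of the relative density (see (11)) for the depolarizing Liouvillian is
$$\hat{L}_\sigma(X) = \mathrm{tr}(\sigma X)\mathbb{1}_d - X.$$
An easy computation yields $\mathcal{E}_2^{L_\sigma}(X) = \|X\|_{2,\sigma}^2 - 1$ and so
$$\beta_2(L_\sigma) = \inf_X \frac{\|X\|_{2,\sigma}^2 - 1}{\|X\|_{2,\sigma}^2 \ln(\|X\|_{2,\sigma}^2)} = \frac{1 - \frac{1}{\|\sigma^{-1}\|_\infty}}{\ln(\|\sigma^{-1}\|_\infty)},$$
since the function $y \mapsto \frac{y-1}{y\ln(y)}$ is decreasing for $y > 1$, and where we used
$$\sup_{X > 0,\ \|X\|_{1,\sigma} = 1} \|X\|_{2,\sigma}^2 = \|\sigma^{-1}\|_\infty,$$
which easily follows from Lemma 2.1 by exponentiating both sides of Equation (5) and using the correspondence between relative densities and states.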
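The resulting exponential decay can be tested numerically for states commuting with σ. The sketch below assumes the value β_2 = (1 − 1/m)/ln(m) with m = ||σ^{-1}||_∞, obtained from the quotient (||X||² − 1)/(||X||² ln ||X||²) evaluated at the largest possible 2-norm, and the explicit evolution ρ_t = e^{-t} ρ + (1 − e^{-t}) σ of the depolarizing semigroup. The function names and the test states are illustrative.

```python
import math

def d2(rho, sigma):
    """D_2 for commuting states: ln(sum_i r_i^2 / s_i)."""
    return math.log(sum(r * r / s for r, s in zip(rho, sigma)))

def depolarize(rho, sigma, t):
    """rho_t = e^{-t} rho + (1 - e^{-t}) sigma."""
    w = math.exp(-t)
    return [w * r + (1 - w) * s for r, s in zip(rho, sigma)]

def beta2_depolarizing(sigma):
    """Assumed closed form (1 - 1/m) / ln(m), m = ||sigma^{-1}||_inf."""
    m = 1 / min(sigma)
    return (1 - 1 / m) / math.log(m)

sigma = [0.5, 0.3, 0.2]
beta_2 = beta2_depolarizing(sigma)
times = [0.1 * j for j in range(51)]

def decay_holds(rho):
    """Check D_2(rho_t) <= exp(-2 beta_2 t) D_2(rho) on the time grid."""
    start = d2(rho, sigma)
    return all(
        d2(depolarize(rho, sigma, t), sigma)
        <= math.exp(-2 * beta_2 * t) * start + 1e-12
        for t in times
    )

mixed_ok = decay_holds([0.7, 0.2, 0.1])
# Pure state on the smallest eigenvalue of sigma: the bound is tight at t = 0.
pure_ok = decay_holds([0.0, 0.0, 1.0])
```

The pure state on the eigenvector of the smallest eigenvalue of σ is the initial state for which the decay rate at t = 0 matches the exponent, which is consistent with the supremum of the 2-norm being attained there.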
The exact value of $\alpha_2(L_\sigma)$ is open to the best of our knowledge, but in the case of $\sigma = \frac{\mathbb{1}_d}{d}$ it was computed in [2, Theorem 24] and is of the same order of magnitude as $\beta_2$ for these semigroups.
Computing $\beta_p$ for $p \ne 2$ seems not to be straightforward even for depolarizing channels, but for the semigroup depolarizing to the maximally mixed state we can at least provide upper and lower bounds. Theorem 3.4 ($\beta_p$ for the Liouvillian depolarizing to the maximally mixed state). Let Proof. The Dirichlet form of this Liouvillian for $X > 0$ with $\|X\|_{1,\frac{\mathbb{1}_d}{d}} = 1$ is given by Dividing this expression by $\kappa_p(X)$ we get By the monotonicity of the weighted norms, we have The expression on the right-hand side of (33) is monotonically decreasing in $\|X\|_{p,\frac{\mathbb{1}_d}{d}}$ and so the infimum is attained at sup which again easily follows from Lemma 2.1. The upper bound follows from (32) as From the relations between LS constants [2, Proposition 13], it follows that for the LS constants of the depolarizing channels we have The constants $\beta_p$ and $\alpha_p$ are therefore of the same order in this case for small $p \ge 2$.

Comparison with similar quantities 4.1 Comparison with spectral gap
Here we show how β p , see (27), compares to the spectral gap (18) of a Liouvillian. This is motivated by similar results for logarithmic Sobolev constants, where it was shown [2, Theorem 16] that α 1 (L) ≤ λ(L) for reversible semigroups, a result we recover and generalize here.
Proof. Let (s i ) d i=1 denote the spectrum of σ 1/p and choose a unitary U such that As L is reversible, there is a self-adjoint eigenvector X ∈ M d ofL corresponding to the spectral gap, i.e.L(X) = −λ(L)X. Let 0 > 0 be small enough such that Y = 1 d + X is positive for any | | ≤ 0 . For | | ≤ 0 we use Lemma A.1 of the appendix to show Observe that f p (s i , s j ) > 0 for s i , s j > 0. Moreover, as U † σ 1/2p Xσ 1/2p U is non-zero and self-adjoint we have b ij b ji ≥ 0 for all i, j and this inequality is strict for at least one choice of i, j. Therefore, the terms of second order in in the numerator and denominator of (35) are strictly positive, and we obtain λ(L) as the limit of the quotient as → 0.
A similar argument as the one given in the previous proof shows that all real, nonzero elements of the spectrum ofL are upper bounds to β p without invoking reversibility.
Note that in the case of p = 2 (see the discussion after (16)) we may assume that the Liouvillian is reversible without loss of generality and drop the requirement of reversibility in the previous theorem. Alternatively, we can obtain the same statement directly from a simple functional inequality. In this case we can also give a lower bound on β 2 in terms of the spectral gap.
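To prove Theorem 4.2 we need the following Lemma.
Lemma 4.1. For any $\sigma \in D_d^+$ and $X \in M_d^+$ we have $\mathrm{Var}_\sigma(X) \le \kappa_2(X)$.
Proof. For $X > 0$ dividing both sides of the inequality by $\|X\|_{1,\sigma}^2$ yields
$$\frac{\|X\|_{2,\sigma}^2}{\|X\|_{1,\sigma}^2} - 1 \le \frac{\|X\|_{2,\sigma}^2}{\|X\|_{1,\sigma}^2} \ln\left(\frac{\|X\|_{2,\sigma}^2}{\|X\|_{1,\sigma}^2}\right).$$
This follows from the elementary inequality $x - 1 \le x\ln(x)$ for $x \ge 1$, where we use the ordering $\|X\|_{2,\sigma} \ge \|X\|_{1,\sigma}$ for any $X \in M_d$.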
To prove the first inequality of (37) consider the depolarizing Liouvillian L σ (X) = tr(X)σ − X. (18)). Inserting this in the above inequality finishes the proof.

Comparison with logarithmic Sobolev constants
Here we show how β p , see (27), compares to the logarithmic Sobolev constant α p .
We will need the following Lemma.

Lemma 4.2.
For any full rank state σ ∈ D + d , any p > 1 and X ∈ M + d with X 1,σ = 1 we have Proof. The function p → D p (ρ σ) is monotonically increasing [3,7] and differentiable (as the noncommutative l p -norm is differentiable in p [1, Theorem 2.7]). Thus, with where we used the relative density X = Γ −1 σ (ρ). The remaining derivative in the above equation has been computed in [1, Theorem 2.7] and we have with the operator valued entropy S p defined in (14) and 1 p + 1 q = 1. Inserting this expression in the above equation we obtain for any X ∈ M + d with X 1,σ = 1, i.e. for any X = Γ −1 σ (ρ) for some state ρ ∈ D d . Now we get where we used (40).

Proof of Theorem 4.3.
There is nothing to show for p = 1 as α 1 (L) = β 1 (L) and we can assume p > 1. For X ∈ M + d with X 1,σ = 1 we can use Lemma 4.2 and the definition of By the variational definition (27) of β p the claim follows.
Theorem 4.3 will be applied in Section 5 to obtain bounds on the mixing time of a Liouvillian with a positive logarithmic Sobolev constant without invoking any form of $l_p$-regularity (see [2]). As a logarithmic Sobolev inequality is usually implied by a hypercontractive inequality [1], we remark that a statement similar to Theorem 4.3 can also be obtained from a hypercontractive inequality. One can easily show that
$$\|e^{tL}\|_{p(t) \to p,\sigma} \le 1 \tag{41}$$
for $p(t) = (p-1)e^{-\alpha_p t} + 1$ implies that $\beta_p(L) \ge \frac{\alpha_p}{p}$.

Mixing times
In this section we will introduce the quantities of interest and prove the building blocks to prove mixing times from the entropy production inequalities of the last sections, distinguishing between continuous and discrete time semigroups. We will mostly focus on β 2 , as this seems to be the most relevant constant for mixing time applications. This is justified by the fact that the underlying Dirichlet form is a quadratic form and the entropy related to it stems from a Hilbert space norm. Moreover, as the same Dirichlet form is also involved in computations of the spectral gap, it could be easier to adapt existing techniques, such as the ones developed in [23,19].
Similarly we define the l 2 mixing time for > 0 as In the continuous case I = R + we will often speak of the mixing times of the Liouvillian generator of a quantum dynamical semigroup which we identify with the mixing times of the semigroup according to the above definition.

Mixing in Continuous Time
It is now straightforward to get mixing times from the previous results.
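Proof. From (6) and Lemma 2.1, together with the exponential decay of $D_p$ with rate $2\beta_p(L)$, we have
$$\frac{1}{2}\|e^{tL}(\rho) - \sigma\|_1^2 \le D_p(e^{tL}(\rho)\|\sigma) \le e^{-2\beta_p(L)t}\, D_p(\rho\|\sigma) \le e^{-2\beta_p(L)t}\ln(\|\sigma^{-1}\|_\infty)$$
for any $\rho \in D_d$. The claim follows after rearranging the terms.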
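Rearranging the chain of inequalities in the proof gives an explicit time after which the trace-norm distance to σ is at most ε. The following sketch assumes the resulting expression t = ln(2 ln(||σ^{-1}||_∞)/ε²)/(2β_p), which is obtained by solving e^{-2β_p t} ln(||σ^{-1}||_∞) = ε²/2 for t. It is compared with the exact distance for the depolarizing semigroup on commuting states, using the assumed value β_2 = (1 − 1/m)/ln(m) with m = ||σ^{-1}||_∞. The function name is hypothetical.

```python
import math

def mixing_time_bound(beta_p, sigma_inv_norm, eps):
    """Solve exp(-2 beta_p t) * ln(sigma_inv_norm) = eps^2 / 2 for t."""
    return math.log(2 * math.log(sigma_inv_norm) / eps ** 2) / (2 * beta_p)

sigma = [0.5, 0.3, 0.2]
rho = [0.0, 0.0, 1.0]                 # worst case for D_p at time zero
m = 1 / min(sigma)                    # ||sigma^{-1}||_inf
beta_2 = (1 - 1 / m) / math.log(m)    # assumed beta_2 of the depolarizing semigroup

eps = 0.1
t_bound = mixing_time_bound(beta_2, m, eps)

# For the depolarizing semigroup rho_t - sigma = e^{-t} (rho - sigma).
initial_distance = sum(abs(r - s) for r, s in zip(rho, sigma))
distance_at_bound = math.exp(-t_bound) * initial_distance

# Value of the upper bound on (1/2)||rho_t - sigma||_1^2 at t_bound.
divergence_bound = math.exp(-2 * beta_2 * t_bound) * math.log(m)
```

In this example the true distance at `t_bound` is far below ε, which reflects that the bound is a worst-case estimate derived via the Pinsker inequality.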
Using Theorem 4.3 we can lower bound β p in terms of the usual logarithmic Sobolev constant α p . Combining this with Theorem 5.1 shows the following Corollary.
By Corollary 5.1 a nonzero logarithmic Sobolev constant always implies a nontrivial mixing time bound. The same bound was shown in [2] for $p = 2$, however under additional assumptions (specifically $l_p$-regularity [2]) on the Liouvillian in question. While these assumptions have been shown for certain classes of Liouvillians (including important examples like Davies generators and doubly stochastic Liouvillians [2]) they have not been shown in general. Moreover, the bound in Theorem 5.1 clearly does not depend on $p$ and one could in principle optimize over all $\beta_p$. However, as the computations in subsection 3.3 already indicate, it does not seem to be feasible to compute or bound $\beta_p$ for $p \ne 2$ even in simple cases and one will probably only work with $\beta_2$ in applications.
The bound from Corollary 5.1 also has the right scaling properties needed in recent applications of rapid mixing, such as the results in [24,25]. In particular, together with the results in [26], the last Corollary shows that the hypothesis of Theorem 4.2 in [24] is always satisfied for product evolutions and not only for the special classes considered in [2].
One may also use these techniques to get mixing times in the l 2 norms which are stronger than the ones obtained just by considering that β 2 is a lower bound to the spectral gap.
In the following let X t = e tL (X) denote the time evolution of the relative density X.
In the remaining part of the section we will discuss a converse to the previous mixing time bounds, i.e. a lower bound on the logarithmic Sobolev constant in terms of a mixing time. This excludes the possibility of a reversible semigroup with both small β 2 and short mixing time with respect to the l 2 distance. For this we generalize [21, Corollary 3.11] to the noncommutative setting.
Moreover, this inequality is tight.
Proof. We refer to Appendix B for a proof.
As remarked in [21], even the classical result does not hold anymore if we drop the reversibility assumption. Therefore, this assumption is also needed in the noncommutative setting. By considering a completely depolarizing channel it is also easy to see that no such bound can hold in discrete time.
Theorem 5.3 implies that for reversible Liouvillians β 2 and α 2 cannot differ by a large factor. More specifically we have the following corollary.
Proof. We showed the first inequality in Theorem 4.3. The second inequality follows by combining (45) and (44).

Mixing in Discrete Time
In this section we will obtain mixing time bounds and also entropic inequalities for discretetime quantum channels T : M d → M d . We will then use these techniques to derive mixing times for random local channels, which we will define next. These include channels that usually appear in quantum error correction scenarios, such as random Pauli errors on qubits [27,Chapter 10]. They will be based on the following quantity: (47) The definition of β D (T ) can be motivated by the following improved data-processing inequality for the 2-sandwiched Rényi divergence.
One should note that, unlike in Theorem 3.2, the constant $\beta_D$ is not optimal in (48). As an example take Also, $\beta_D(T) > 0$ is not a necessary condition for primitivity, as there are primitive quantum channels that are not strict contractions with respect to $D_2$. To see this, consider the map $T : M_2 \to M_2$ which acts as follows on Pauli operators: One can check that this is a primitive quantum channel with $T^2(\rho) = \frac{\mathbb{1}_2}{2}$ for any state $\rho \in D_2$. However, $T$ maps the pure state $\frac{1}{2}(\mathbb{1}_2 + \sigma_z)$ to the pure state $\frac{1}{2}(\mathbb{1}_2 + \sigma_x)$, which implies that $D_2$ does not strictly contract under $T$. We can now prove the following bound on the discrete mixing time.
Proof. By Theorem 5.4 we have for any ρ ∈ D d . The claim then follows from (6) and Lemma 2.1.
Convergence results for primitive continuous-time semigroups can often be lifted to their tensor powers. In discrete time a similar result holds for the following class of channels: The previous definition can be generalized to the case where not all local channels are identical, i.e. if we have T i : M d → M d acting on the ith system in the expression (49). As long as the local channels are all primitive our results also hold for this more general class of channels. However, for simplicity we will restrict here to the above definition.
for any ρ ∈ D d n and where p min = min p i .
Proof. By Theorem 5.4 it is enough to show that β D (T p ) ≥ qp 2 min . Observe that the Dirichlet form of (T (n) where the map T * iT j acts as T * on the i-th system,T on the j-th and as the identity elsewhere. As T * iT j ≤ id d with respect to ·, · σ ⊗n we have From the comparison inequality E L 2 ≥ p 2 min E L (n) 2 and the assumption β 2 L (n) ≥ q it then follows that β D (Φ) ≥ qp 2 min .
As an application we can bound the entropy production and the mixing time in a system of n qubits affected (uniformly) by random Pauli errors. The time evolution of this system is given by the channel T n : M ⊗n 2 → M ⊗n 2 given by with T (ρ) = tr(ρ) 1 2 .
Theorem 5.7. For T n defined as in equation (51) we have for any ρ ∈ D 2 n .
Proof. From [2] it is known that Proof. This follows directly from the previous theorem and Theorem 5.5.

Strong converse bounds for the classical capacity
When classical information is sent via a quantum channel, the classical capacity is the supremum of transmission rates such that the probability for a decoding error vanishes in the limit of infinite channel uses. In general it is not possible to retrieve the information perfectly when it is sent over a finite number of uses of the channel, and the probability for successful decoding will be smaller than 1. Here we want to derive bounds on this probability for quantum dynamical semigroups. More specifically we are interested in strong converse bounds on the classical capacity. An upper bound on the capacity is called a strong converse bound if whenever a transmission rate exceeds the bound the probability of successful decoding goes to zero in the limit of infinite channel uses. We refer to [27,Chapter 12] for the exact definition of the classical capacity and to [28,29,30,4,31] for more details on strong converses and strong converse bounds.
In [4] the following quantity was used to study strong converses.
We will often refer to a (m, n, p)-coding scheme for classical communication using a quantum channel T . By this we mean a coding-scheme for the transmission of m classical bits via n uses of the channel T for which the probability of successful decoding is p (see again [27,Chapter 12] for an exact definition). The following theorem shown in [4, Section 6] relates the information radius and the probability of successful decoding.
We will now apply the methods developed in the last sections to obtain strong converse bounds on the capacity of quantum dynamical semigroups. Proof. Using Theorem 3.2 and Lemma 2.1 we have Now Theorem 6.1 together with the assumption β p (L (n) ) ≥ c finishes the proof.
Together with Theorem 4.3 the previous theorem shows that a quantum memory can only reliably store classical information for small times when it is subject to noise described by a quantum dynamical semigroup with "large" logarithmic Sobolev constant, as we will see more explicitly later in Section 7. Moreover, we can use the results from [26] to give a universal lower bound to the decay of the capacity in terms of the spectral gap and the fixed point.

Examples of bounds for the classical capacity of Semigroups
We will now apply the estimate on the capacity given by Corollary 6.1 to some examples of semigroups. Here C(T ) will denote the classical capacity of a quantum channel T .

Depolarizing Channels
In [5] it is shown that for L 1 In [30] the strong converse property was established. The semigroup generated by L 1 d is therefore a natural candidate to evaluate the quality of our bounds, as determining its classical capacity can be considered a solved problem. As L 1 d is just the difference of a projection and the identity, it is easy to see that the spectral gap of L 1 d is 1, which gives us the upper bound for d = 2.

Stabilizer Hamiltonians
Estimates on the spectral gap of Davies generators of stabilizer Hamiltonians were obtained in [19]. In the following we will make the same assumptions as in [19] on the coupling of the system to the bath. That is, we assume that the operators $S_\alpha$ (see (13)) are given by single-qubit Pauli operators $\sigma_x$, $\sigma_y$ or $\sigma_z$. For the transition rates $G_\alpha(\omega)$ we only assume that they satisfy the KMS condition [17], that is, $G_\alpha(-\omega) = G_\alpha(\omega) e^{-\beta\omega}$. This condition implies that the semigroup is reversible. Recall that for Davies generators at inverse temperature $\beta > 0$, which we will denote by $L_\beta$, the stationary state is always given by the thermal state $e^{-\beta H}/\mathrm{tr}(e^{-\beta H})$.
We will not discuss stabilizer Hamiltonians and groups and their connection to error-correcting codes, but refer to [27, Section 10.5] for more details. Given some stabilizer group $S \subset P_n$, where $P_n$ is the group generated by the tensor product of n Pauli matrices, with commuting generators $S = \langle P_1, \dots, P_k \rangle$, we define the stabilizer Hamiltonian to be given by We then have: Proof. The eigenvalues of each $P_i$ are contained in $\{1, -1\}$, as they are just tensor products of Pauli matrices. From this we have $-k\,\mathbb{1} \le H_S \le k\,\mathbb{1}$ (58), as $H_S$ is just the sum of k terms such that $-\mathbb{1} \le P_i \le \mathbb{1}$. From (58) it follows that $\mathrm{tr}(e^{-\beta H_S}) \le 2^n e^{\beta k}$ (59), as we have $2^n$ eigenvalues, including multiplicities. Moreover, it also follows that $\|e^{\beta H_S}\|_\infty \le e^{\beta k}$ (60). As $\|\sigma_\beta^{-1}\|_\infty = \|e^{\beta H_S}\|_\infty\, \mathrm{tr}(e^{-\beta H_S})$, the claim follows by putting (59) and (60) together.
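The estimate on the inverse thermal state can be illustrated on a small example. The sketch below rests on two assumptions that are not spelled out in the surviving text: the sign convention $H_S = -\sum_i P_i$ for the stabilizer Hamiltonian (the bound does not depend on the sign), and the combined bound $\|\sigma_\beta^{-1}\|_\infty \le 2^n e^{2\beta k}$, which is what the two estimates of the proof give when multiplied. The four-qubit generators `XXXX` and `ZZZZ` and all function names are hypothetical illustration choices.

```python
import numpy as np
from functools import reduce

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of single-qubit Pauli matrices, e.g. 'XXXX'."""
    return reduce(np.kron, (PAULI[c] for c in label))

def inverse_thermal_state_norm(generators, beta):
    """Operator norm of the inverse of exp(-beta H) / tr(exp(-beta H))."""
    # Assumed convention: H_S = - sum_i P_i.
    hamiltonian = -sum(pauli_string(g) for g in generators)
    energies = np.linalg.eigvalsh(hamiltonian)
    partition_function = np.sum(np.exp(-beta * energies))
    # sigma^{-1} = Z * exp(beta H), so its norm is Z * exp(beta * E_max).
    return float(np.exp(beta * energies.max()) * partition_function)

generators = ["XXXX", "ZZZZ"]  # two commuting generators on n = 4 qubits
n, k, beta = 4, 2, 0.7
norm = inverse_thermal_state_norm(generators, beta)
bound = 2**n * np.exp(2 * beta * k)
assert norm <= bound
print(norm, bound)
```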
In [19, Theorem 15] they show $\lambda \ge \frac{h^*}{4\eta^*} e^{-2\beta\bar\epsilon}$ for the spectral gap $\lambda$ of the Davies generators of stabilizer Hamiltonians at inverse temperature $\beta > 0$. Here $\bar\epsilon$ is the generalized energy barrier, $h^*$ is the smallest transition rate and $\eta^*$ the longest path in Pauli space. We refer to [19] for the exact definition of these parameters. It is important to stress that in general $\eta^*$ will scale with the number of qubits, so our estimate on the capacity deteriorates as the number of qubits increases. However, in [19, Theorem 15] they also show the estimate $\lambda \ge \frac{h^*}{4} e^{-2\beta\bar\epsilon}$ for the special case in which the generalized energy barrier can be evaluated with canonical paths $\Gamma_1$. We again refer to [19] for the exact definition. For these cases the gap does not scale with the dimension and our estimate is much better. Summing up we obtain: $C(e^{tL_\beta}) \le (n + 2\beta k \log(e))\, e^{-r(\beta,n,k)t}$, with $r(\beta,n,k) = \frac{e^{-2\beta\bar\epsilon}\, h^*}{8\eta^* (2k\beta + 5n\ln(2) + 11)}$ and $r(\beta,n,k) = \frac{e^{-2\beta\bar\epsilon}\, h^*}{8 (2k\beta + 5n\ln(2) + 11)}$ in case the generalized energy barrier can be evaluated with canonical paths $\Gamma_1$. Moreover, this is a bound in the strong converse sense.
Proof. The claim follows immediately after inserting the bounds from Lemma 7.1 and [19,Theorem 15,16] into Corollary 6.1.
In [19] one can find more explicit bounds for the parameters $\bar\epsilon$, $\eta^*$ and $h^*$ for some stabilizer groups. To the best of our knowledge this is the first bound available for the classical capacity of this class of quantum channels. To make the bound in Theorem 7.1 more concrete, we show what we obtain for the 2D toric code.

2D Toric Code
Here we consider the 2D toric code as originally introduced in [33], which is a stabilizer code. We consider only square lattices: We take an $N \times N$ lattice with $N^2$ vertical and $(N+1)^2$ horizontal edges; associating a qubit to each edge gives a total of $n = 2N^2 + 2N + 1$ physical qubits. The stabilizer operators are $N(N+1)$ plaquette operators (including the "open" plaquettes along the rough boundary) and $N(N+1)$ vertex operators, all of which are independent. It goes beyond the scope of this article to explain the 2D toric code in detail and we refer to [34, Section 19.4] for a discussion. But from the previous observations we obtain that we have $k = 2N(N+1)$ generators for the stabilizer group of the 2D toric code on $n = 2N^2 + 2N + 1$ qubits. We will make the same assumptions on the Davies generators at inverse temperature $\beta > 0$ for the toric code as in [19]. These are discussed in the beginning of Subsection 7.2.
In [35] it was proved that the spectral gap for the Davies generators for the 2D toric code at inverse temperature $\beta$ satisfies $\lambda \ge \frac{1}{3} e^{-8\beta}$, a result which was reproved in [19] using different techniques. We therefore obtain:
Corollary 7.1. Let $H$ be the stabilizer Hamiltonian of the 2D toric code on an $N \times N$ lattice and $L_\beta$ be its Davies generator at inverse temperature $\beta > 0$. Then the classical capacity $C(e^{tL_\beta})$ is bounded by $C(e^{tL_\beta}) \le \left(2N^2 + 2N + 1 + 4\beta N(N+1)\log(e)\right) e^{-r(\beta,N)t}$, with $r(\beta,N) = \frac{e^{-8\beta}}{6\left((10N^2 + 10N + 5)\ln(2) + 4\beta N(N+1)\right) + 66}$.
Moreover, this is a bound in the strong converse sense.
Proof. The claim follows immediately from Lemma 7.1 and the spectral gap estimate of [35] for the toric code.
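To get a feeling for the numbers, the toric code bound can be evaluated directly. The sketch below is a hedged illustration: it assumes that the rate has the form $r = \lambda_0 / (2(2k\beta + 5n\ln 2 + 11))$ for a gap lower bound $\lambda_0$ (this reproduces the rate of Corollary 7.1 for $\lambda_0 = \frac{1}{3}e^{-8\beta}$), that the prefactor is $n + 2\beta k \log(e)$ as in Theorem 7.1, and that $\log$ denotes the base-2 logarithm. The function names are hypothetical.

```python
import math

def toric_code_parameters(N):
    """Number of qubits n and stabilizer generators k for the N x N lattice."""
    n = 2 * N**2 + 2 * N + 1
    k = 2 * N * (N + 1)
    return n, k

def decay_rate(gap_bound, n, k, beta):
    """Assumed form r = gap_bound / (2 (2 k beta + 5 n ln 2 + 11))."""
    return gap_bound / (2 * (2 * k * beta + 5 * n * math.log(2) + 11))

def toric_capacity_bound(N, beta, t):
    """Upper bound on the classical capacity at time t (log taken base 2)."""
    n, k = toric_code_parameters(N)
    gap_bound = math.exp(-8 * beta) / 3  # spectral gap estimate from [35]
    r = decay_rate(gap_bound, n, k, beta)
    prefactor = n + 2 * beta * k * math.log2(math.e)
    return prefactor * math.exp(-r * t)

for N in (2, 5, 10):
    print(N, toric_code_parameters(N), toric_capacity_bound(N, beta=0.5, t=1e4))
```

Since the rate shrinks like $1/N^2$ while the prefactor grows like $N^2$, the bound only drops below the trivial value $n$ after times that grow with the lattice size, in line with the discussion of the conjecture in [36].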
From Figure 2 it becomes evident that we cannot retain information in the 2D toric code for long times at small inverse temperatures and that we can get nontrivial estimates even for very high dimensions, as the size of the gap does not scale with the size of the lattice. It is conjectured that if the spectral gap of the Davies generators of a Hamiltonian with local, commuting terms satisfies a lower bound which is independent of the size of the lattice, then the logarithmic Sobolev constant $\alpha_2$ also satisfies such a bound [36]. As the Hamiltonian of the 2D toric code is of this form, proving this conjecture would lead to a bound similar to the one in Corollary 7.1, but with a rate $r(\beta, N)$ independent of the size of the lattice. This would of course lead to much better bounds for large lattice sizes.

Truncated harmonic oscillator
Suppose that the system couples to the bath via $S = (a + a^\dagger)$, with and the transition rate function $G(x) = (1 + e^{-x\beta})^{-1}$. Let $\sigma_\beta = e^{-\beta H}/\mathrm{tr}(e^{-\beta H})$. As the eigenvalues of $e^{-\beta H}$ are just a geometric sequence, we have In [23, Section V, Example 1] they show for the spectral gap $\lambda$ of the Davies generator $L_\beta$ of the truncated harmonic oscillator at inverse temperature $\beta > 0$. We will denote the value of the lower bound in Equation (64) by $\mu(d, \beta)$. As we can compute $\sigma_\beta^{-1}$ exactly and have a bound on the spectral gap, we can apply Corollary 6.1 to these semigroups.
Note that in this case the bound scales with the dimension. Putting these inequalities together with the bound given in Corollary 6.1 for the capacity, we have for the classical capacity of this semigroup:  In this example we see that, as the estimate available on the gap scales with the dimension, our estimates are not much better than the trivial log(d+1) for high dimensions unless we are looking at large times.
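Two ingredients of this example are easy to verify numerically. The following sketch assumes that the truncated oscillator has energy levels $0, 1, \dots, d$ (dimension $d+1$, matching the trivial bound $\log(d+1)$); under this assumption the geometric sum gives $\|\sigma_\beta^{-1}\|_\infty = e^{\beta d}\,\frac{1 - e^{-\beta(d+1)}}{1 - e^{-\beta}}$. It also checks that the transition rate $G(x) = (1 + e^{-x\beta})^{-1}$ satisfies the KMS condition. The function names are hypothetical.

```python
import math

def transition_rate(x, beta):
    """G(x) = (1 + exp(-x beta))^{-1}."""
    return 1.0 / (1.0 + math.exp(-x * beta))

def inverse_gibbs_norm_direct(d, beta):
    """Largest eigenvalue of the inverse Gibbs state for levels 0..d (assumed spectrum)."""
    weights = [math.exp(-beta * level) for level in range(d + 1)]
    return sum(weights) / min(weights)

def inverse_gibbs_norm_closed_form(d, beta):
    """Same quantity via the geometric series."""
    return math.exp(beta * d) * (1 - math.exp(-beta * (d + 1))) / (1 - math.exp(-beta))

d, beta, omega = 6, 0.4, 1.3
assert math.isclose(inverse_gibbs_norm_direct(d, beta),
                    inverse_gibbs_norm_closed_form(d, beta), rel_tol=1e-12)
# KMS condition: G(-omega) = G(omega) * exp(-beta * omega)
assert math.isclose(transition_rate(-omega, beta),
                    transition_rate(omega, beta) * math.exp(-beta * omega), rel_tol=1e-12)
```

The closed form makes the dimension dependence explicit: $\ln\|\sigma_\beta^{-1}\|_\infty$ grows linearly in $d$ for fixed $\beta$.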

Conclusion and open questions
We have introduced a framework similar to logarithmic Sobolev inequalities to study the convergence of a primitive quantum dynamical semigroup towards its fixed point in the distance measure of sandwiched Rényi divergences. These techniques can be used to obtain mixing time bounds and strong converse bounds on the classical capacity of a quantum dynamical semigroup. Moreover, these results show that a logarithmic Sobolev inequality or hypercontractive inequality always implies a mixing time bound without the assumption of l p -regularity (which is still not known to hold for general Liouvillians [2]). Although we have some structural results concerning the constants β p , some questions remain open. For logarithmic Sobolev inequalities it is known that α 2 ≤ α p for p ≥ 1 under the assumption of l p -regularity (see [2]). It would be interesting to investigate if a result of similar flavor also holds for the β p . In all examples discussed here, β 2 and α 2 are of the same order and it would be interesting to know if this is always the case. The framework of logarithmic Sobolev inequalities has recently been extended to the nonprimitive case [37]. It should be possible to develop a similar theory for the sandwiched Rényi divergences to get rid of regularity assumptions present in their main results, as we did here for the usual logarithmic Sobolev constants.
We restricted our analysis to the sandwiched Rényi divergences, as they can be expressed in terms of relative densities and noncommutative $l_p$-norms. This allowed us to connect the convergence under the sandwiched divergences to the theory of hypercontractivity and to use tools from interpolation theory which were vital to prove estimates on capacities. There are however other noncommutative generalizations of the Rényi divergences that are known to contract under quantum channels, such as the one discussed in [38, p. 113]. It would be interesting to explore the entropy production and convergence under semigroups for this and other families of divergences in future work.
In a similar vein, it would be interesting to investigate the entropy production or convergence rate for the range $\frac{1}{2} < p < 1$, as the sandwiched Rényi divergences are known to contract under quantum channels for all $p > \frac{1}{2}$ [39]. However, looking closely at the proof of Theorem 3.1, we see that for $p < 1$ the sandwiched Rényi divergence is only differentiable at $t = 0$ if the initial state has full rank. The study of the convergence of these divergences for $p < 1$ therefore requires a different technical approach than that of this work. Finally, it would of course be relevant to obtain bounds on the $\beta_p$ for more examples, such as Davies generators, without relying on the estimate based on the spectral gap.
where we used the identity (69) in the last step. The above shows that With the well-known expansion ln(1 + which is (71).

B Interpolation Theorems and Proof of Theorem 5.3
In order to prove Theorem 5.3 we will need the following special case of the Stein-Weiss interpolation theorem [ One important consequence of the Stein-Weiss interpolation theorem is the following interpolation result. We again refer to [42, Theorem 1.1.1] for a proof.
The family of operators $T_z - E$ clearly satisfies the assumptions of the Stein-Weiss interpolation theorem. We therefore have Observe that by reversibility of $L$ the map $T_{ia}$ is a unitary operator with respect to $\langle \cdot, \cdot \rangle_\sigma$. We also have $T_{ia} \circ E = E$, as $T_{ia}(\mathbb{1}_d) = \mathbb{1}_d$. This gives where the last equality follows from $\|X - \mathrm{tr}(\sigma X)\,\mathbb{1}_d\|_{2,\sigma} = \min_{c \in \mathbb{R}} \|X - c\,\mathbb{1}_d\|_{2,\sigma}$. We therefore have $\|T_{ia} - E\|^{1-s}_{2\to 2,\sigma} \le 1$.
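The minimization property used in the last equality follows from a short computation. The sketch below assumes the weighted inner product $\langle A, B \rangle_\sigma = \mathrm{tr}(\sigma^{1/2} A^\dagger \sigma^{1/2} B)$ with induced norm $\|\cdot\|_{2,\sigma}$, as is standard in this setting, a Hermitian $X$ and real $c$.

```latex
\begin{align*}
\|X - c\,\mathbb{1}_d\|_{2,\sigma}^2
  &= \|X\|_{2,\sigma}^2 - 2c\,\langle \mathbb{1}_d, X\rangle_\sigma
     + c^2\,\|\mathbb{1}_d\|_{2,\sigma}^2 \\
  &= \|X\|_{2,\sigma}^2 - 2c\,\mathrm{tr}(\sigma X) + c^2 ,
\end{align*}
```

Here we used $\langle \mathbb{1}_d, X\rangle_\sigma = \mathrm{tr}(\sigma X)$, which is real for Hermitian $X$, and $\|\mathbb{1}_d\|_{2,\sigma}^2 = \mathrm{tr}(\sigma) = 1$. The right-hand side is a quadratic polynomial in $c$ with minimum at $c = \mathrm{tr}(\sigma X)$.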
Furthermore, by the unitarity of $T_{ib}$ we can compute Using duality of the norms and that both $T_1$ and $E$ are self-adjoint we have using the definition of $\tau$ in the last equality. Inserting (76) and (77) into (75) we get as $\|(T_s - E)(X)\|_{\frac{2}{1-s},\sigma} = \|(T_s - E)(X - E(X))\|_{\frac{2}{1-s},\sigma}$. Taking the derivative of (78) with respect to $s$ on both sides at $s = 0$ we get (79) Rearranging the terms in (79) we obtain Ent 2,σ (|X − E (X) |) ≤ 2τ E (X) + 2Var (X) ln ( ) .
To prove that the inequality is tight, consider the depolarizing Liouvillian L σ (X) = tr (X) σ − X for some full rank σ ∈ D + d . It is easy to see that Var σ e tLσ X = e −t Var σ (X) and so t 2 e −1 = 1 + ln σ −1 ∞ − 1 . Restricting to operators commuting with σ, it follows from [21, Theorem A.1] that .
Thus, for a sequence $\sigma_n \in D_d^+$ converging to a state that is not full rank we have $\lim_{n\to\infty} t_2(e^{-1})\,\alpha_2(L_{\sigma_n}) = \frac{1}{2}$.