Page curves and typical entanglement in linear optics

Bosonic Gaussian states are a special class of quantum states in an infinite-dimensional Hilbert space that are relevant to universal continuous-variable quantum computation as well as to near-term quantum sampling tasks such as Gaussian Boson Sampling. In this work, we study entanglement within a set of squeezed modes that have been evolved by a random linear optical unitary. We first derive formulas that are asymptotically exact in the number of modes for the Rényi-2 Page curve (the average Rényi-2 entropy of a subsystem of a pure bosonic Gaussian state) and the corresponding Page correction (the average information of the subsystem) in certain squeezing regimes. We then prove various results on the typicality of entanglement as measured by the Rényi-2 entropy by studying its variance. Using the aforementioned results for the Rényi-2 entropy, we upper and lower bound the von Neumann entropy Page curve and prove certain regimes of entanglement typicality as measured by the von Neumann entropy. Our main proofs make use of a symmetry property obeyed by the average and the variance of the entropy that dramatically simplifies the averaging over unitaries. In this light, we propose future research directions where this symmetry might also be exploited. We conclude by discussing potential applications of our results and their generalizations to Gaussian Boson Sampling and to illuminating the relationship between entanglement and computational complexity.


Introduction
In his pioneering paper [1], Page considered a Haar-random pure state in an $N$-dimensional Hilbert space. He conjectured an exact formula for the average entanglement entropy of the reduced density matrix on an $N^{\alpha}$-dimensional subspace, where $\alpha \in [0, 1]$. His conjecture was proven soon after [2][3][4]. This average, which is itself a function of $N$ and $\alpha$, is now known as the Page curve. Page also defined the average information of the subsystem as the difference between the maximum of the entropy and the average entropy. This quantity has since come to be known as the Page correction. Since then, the Page curve and correction have found applications in a variety of areas, such as black holes [5][6][7], quantum information theory [8,9], and statistical mechanics [10][11][12][13][14][15][16][17], among others. As a next step, many works considered the typical deviation of the entanglement entropy from its average. In particular, a system is said to exhibit typical entanglement if there is a vanishingly small probability that a random state has entanglement bounded away from the average. This phenomenon was introduced and studied in a variety of systems [18][19][20][21][22][23][24][25][26][27].
Joseph T. Iosue: jtiosue@umd.edu
Entanglement is a key feature of quantum physics and can be used as a resource to complete various tasks, such as teleportation, key distribution, dense coding, and many others [28][29][30][31][32]. Furthermore, entanglement is a necessary ingredient for quantum advantage, since quantum computations with little entanglement can be simulated efficiently classically [33][34][35]. One can define quantum advantage using the language of complexity as using quantum resources to perform a task that is classically hard but quantumly easy. One such task is sampling from the output probability distribution of a Gaussian Boson Sampling experiment [36][37][38][39][40]. The general relationship between entanglement and complexity is largely unknown, but at least some entanglement is necessary for classical hardness. Similarly, in the setting of Boson Sampling, at least some amount of non-Gaussianity is necessary for sampling hardness [41]. On the other hand, too much entanglement can cause a state to be useless for computation. In particular, the typical (over the Haar measure) finite-dimensional quantum state is too entangled to be useful for computation [23,24]. Thus, studying average and typical entanglement is necessary for learning about the useful part of entanglement and what utility random states have.
Entanglement in infinite-dimensional quantum states generated by bosonic Gaussian inputs and Gaussian operations has found direct application in areas such as quantum sensing [42][43][44][45][46] and quantum communication [47,Ch. 12]. Furthermore, the original motivation for studying the finite-dimensional Page curve was to study the black hole information paradox [1,5]. Since the degrees of freedom around black holes are inherently infinite-dimensional bosonic (e.g. photonic) modes, an infinite-dimensional bosonic Page curve may help to understand black hole information dynamics [47,Ch. 14].
Our contributions.
In this work, we study Page curves and the typicality of entanglement in continuous-variable bosonic Gaussian states. Specifically, we compute average and typical entanglement quantities averaged over passive (energy-conserving) Gaussian unitaries, also called linear optical unitaries, with a fixed initial product state of squeezed vacuum states on $n$ modes. Indeed, this setup is exactly that of a Gaussian Boson Sampling experiment, which is of great recent interest due to experimental claims of quantum advantage via Gaussian Boson Sampling [48][49][50].
We describe this setup in more detail in Section 2. Then in Section 3, we begin by studying the regime when all the initial squeezing strengths are equal, and denote this squeezing strength by $r$. We derive an analytic expression for the average Rényi-2 entropy of a subsystem of $k$ modes as a function of $r$, $k$, and $n$ that is exact asymptotically in $n$ for arbitrary values of $r$ and $k$. Using this expression, we exactly compute the corresponding Page correction. These results are summarized in Fig. 1. In Section 4, we then study the presence of typical entanglement for various scalings of $k$ with $n$. When the distance between the entanglement of a random state and the average entanglement value vanishes additively (resp. multiplicatively), we say that entanglement is strongly (resp. weakly) typical. We prove that entanglement as measured by the Rényi-2 entropy is weakly typical for any $k$ and strongly typical whenever $k \in o(n)$. We further show that entanglement as measured by the von Neumann entropy is weakly typical whenever $k \in o(n)$. Finally, in Section 5, we generalize our discussion to the regime when the initial squeezing strengths are not necessarily equal. We show that, if a certain conjecture is true, then the Rényi-2 and von Neumann entanglement entropies are both weakly typical whenever $k \in o(n)$.
Prior work. Refs. [26,51] studied Page curves and entanglement in fermionic Gaussian states. Serafini et al. have considered typical entanglement in bosonic Gaussian states [52,53], where they defined two measures, namely the microcanonical and canonical measures, on the set of all $n$-mode bosonic Gaussian states and averaged with respect to these measures. Roughly, averaging over the microcanonical measure corresponds to integrating over all bosonic Gaussian states up to a fixed bounded total energy, and averaging over the canonical measure corresponds to integrating over all bosonic Gaussian states with a Boltzmann weight factor decaying with the energy of the state. Fukuda and Koenig generalized these results by studying entanglement averaged over passive Gaussian unitaries with a fixed initial product state of squeezed vacuum states on $n$ modes [54]. Indeed, their setup is exactly the one we consider in this work. As noted in Ref. [54], the measure defined by fixing squeezing strengths and then applying a random passive Gaussian unitary generalizes both the microcanonical and canonical measures, as the latter two measures can be expressed as convolutions of the former with certain distributions on the set of all squeezing configurations.
Serafini et al. and Fukuda and Koenig consider entanglement in a subsystem of $k$ bosonic modes when $k \in o(n)$; roughly, Serafini et al. allow $k \in O(1)$ and Fukuda and Koenig allow $k \in o(n^{1/3})$. To the best of our knowledge, there are currently no results on average entanglement or typical entanglement in the regime when $k \in \Omega(n^{1/3})$. This is what we address in our work. Concerning typical entanglement, we emphasize that our results do not supplant the results of Ref. [54] in general, but rather only in certain regimes. In particular, we primarily consider the situation of equal squeezing strengths, whereas their results pertain to the situation of arbitrary squeezing. Similarly, we primarily consider the Rényi-2 entropy, whereas their results apply to both the Rényi-2 and von Neumann entropies. We summarize our work and that of Ref. [54] on typical entanglement in bosonic Gaussian states in Table 1.
For general quantum states, the von Neumann entropy has certain properties, such as strong subadditivity, that the Rényi entropies do not. For other such properties, we refer to Ref. [32]. Because of this, the von Neumann entropy is generally considered a better measure of entanglement than the Rényi entropies. Notably, however, it has been shown that the Rényi-2 entropy is special when restricting to bosonic Gaussian states. For example, it was recently proven that for bosonic Gaussian states, the Rényi-2 entropy also obeys strong subadditivity [55,56]. For Gaussian states, the Rényi-2 entropy is also equal, up to a constant, to the phase-space Shannon sampling entropy $-\int W(x, p) \log W(x, p)\, \mathrm{d}^n x\, \mathrm{d}^n p$ of the Wigner distribution $W(x, p)$ of the state [55]. Here the position vector $x$ and momentum vector $p$ parameterize the phase space of $n$ oscillator modes. The phase-space Shannon sampling entropy has an operational meaning in terms of sampling via homodyne detections [55,57]. Furthermore, it has been shown that for pure tripartite Gaussian states, the Rényi-2 entropy obeys a strong subadditivity inequality that is stronger than that for the von Neumann entropy [58]. Finally, correlation measures for Gaussian states based on the Rényi-2 entropy have been found that have no counterpart when using the von Neumann entropy [59]. As noted in Ref. [55], the aforementioned results are "planting the seeds for a full Gaussian quantum information theory based on the Rényi-2 entropy." Our work is focused primarily on entanglement in bosonic Gaussian states as measured by the Rényi-2 entropy and can thus be viewed as planting a few more seeds.

Setup
In this work, we consider a linear optical system of $n$ modes, where each mode is an independent quantum harmonic oscillator. We will restrict our attention to bosonic Gaussian states. We refer to Refs. [60,54] for background on the theory of Gaussian states, and we provide the necessary details to understand this paper in Appendix A.1.
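Consider an $n$-mode mixed state $\rho$ in the Hilbert space of square-integrable wavefunctions $\mathcal{H} = L^2(\mathbb{R})^{\otimes n}$. For each $i \in \{1, \dots, n\}$, let $\hat{x}_i$ and $\hat{p}_i$ be the position and momentum quadrature operators on the $i$th mode, and collect these $2n$ operators into a vector $\hat{r}$ with components $\hat{r}_j$. The state $\rho$ is a Gaussian state if there is a $\beta > 0$ and a Hamiltonian $\hat{H}$ that is at most quadratic in the quadrature operators such that $\rho$ is the thermal state $\rho \propto e^{-\beta \hat{H}}$. Since $\hat{H}$ contains only linear and quadratic terms in $\hat{r}$, $\rho$ is fully characterized by its first and second moments, $\operatorname{Tr}(\rho\, \hat{r}_i)$ and $\sigma_{ij} = \frac{1}{2} \operatorname{Tr}[\rho\, (\hat{r}_i \hat{r}_j + \hat{r}_j \hat{r}_i)] - \operatorname{Tr}(\rho\, \hat{r}_i) \operatorname{Tr}(\rho\, \hat{r}_j)$. The matrix $\sigma$ is called the covariance matrix of the state $\rho$.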
A unitary $U$ is Gaussian if there exists a Hamiltonian $\hat{H}$ that is at most quadratic in the quadrature operators such that $U = e^{-i\hat{H}}$. One can show that a Gaussian unitary maps Gaussian states to Gaussian states. Any pure Gaussian state can be generated by acting on the vacuum state with a Gaussian unitary. The set of all Gaussian unitaries is isomorphic to the real symplectic group of $2n \times 2n$ matrices $\mathrm{Sp}(2n)$. By the Euler decomposition theorem of a symplectic matrix, it follows that any pure Gaussian state can be generated by acting on an initial product state of squeezed vacuum modes with a passive (energy-conserving) Gaussian unitary. The set of all passive Gaussian unitaries is isomorphic to $\mathrm{Sp}(2n) \cap \mathrm{O}(2n) \cong \mathrm{U}(n)$, where $\mathrm{O}(2n)$ is the orthogonal group of $2n \times 2n$ matrices and $\mathrm{U}(n)$ is the unitary group of $n \times n$ matrices. $\mathrm{Sp}(2n)$ is not compact and thus does not have a finite Haar measure to average over. Physically, this is due to the fact that the Gaussian operation of squeezing can take on unbounded values in $\mathbb{R}$. However, $\mathrm{U}(n)$ is compact, and it is hence well-defined to consider uniformly sampling from $\mathrm{U}(n)$ according to the finite Haar measure.
We are therefore motivated to consider the following notion of a random pure Gaussian state. Namely, we initialize the $i$th mode to be in a squeezed vacuum state with fixed squeezing parameter $r_i \in \mathbb{R}$ for each $i \in \{1, \dots, n\}$. We then randomly sample a passive Gaussian unitary $U$ from $\mathrm{U}(n)$ and apply it to the modes. This random state is thus characterized by a fixed choice of the squeezing parameters $r_i$ and a Haar-random choice of a passive Gaussian unitary $U$. Understanding the properties of such random states is of great interest, particularly because Gaussian Boson Sampling experiments rely on precisely those state preparations. For squeezing parameters $r_i$ for $i \in \{1, \dots, n\}$, the total expected number of bosons of the state on the $n$ modes is $\sum_{i=1}^{n} \sinh^2(r_i)$. For simplicity, we will begin by considering the case when all the squeezing parameters are equal; $r_i = r$ for each $i$ for some fixed $r$. In the case of equal squeezings, the average boson number per mode is $\sinh^2 r$. The general case of unequal squeezings will be discussed in Section 5.
The $n$ modes are then partitioned into two groups: one group of $k = \alpha n$ modes for some $0 \le \alpha \le 1$, and one group of $n - k = (1 - \alpha) n$ modes. We then compute the entropy of the reduced state of the $k$ modes, or equivalently, since we are considering pure states, the entropy of the reduced state of the $n - k$ modes [28]. Let the density matrix of the reduced state be $\rho$. For the entropy function, we will be primarily focused on the Rényi-2 entropy $S_2 = -\log \operatorname{Tr} \rho^2$, although we will also prove various statements on the von Neumann entropy $S_1 = -\operatorname{Tr} \rho \log \rho$. The Rényi-2 entropy takes the elegant form $S_2 = \frac{1}{2} \log \det \sigma = \frac{1}{2} \operatorname{Tr} \log \sigma$, where $\sigma$ is the covariance matrix of $\rho$ [60]. For the Gaussian state generated by the unitary $U \in \mathrm{U}(n)$ acting on the squeezed product state with squeezing strength $r$ on each mode, the Rényi-2 entropy of the subsystem of $k$ modes is denoted by $S_2(U)$, and the von Neumann entropy by $S_1(U)$, where the dependence of $S_1(U)$ and $S_2(U)$ on $n$, $k$, and $r$ is implicit. We will derive statistical properties of $S_2(U)$ and $S_1(U)$ in the asymptotic limit $n \to \infty$.
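The random-state construction described above can be sketched numerically. The following is a minimal sketch, not the authors' code: it assumes a convention in which the vacuum covariance matrix is the identity (so that the formula $S_2 = \frac{1}{2}\log\det\sigma$ gives zero on pure states) and a quadrature ordering $(x_1, \dots, x_n, p_1, \dots, p_n)$. The function names `haar_unitary`, `covariance_matrix`, and `renyi2_entropy` are hypothetical.

```python
import numpy as np


def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    # Multiply each column of q by the phase of the matching diagonal entry of r;
    # this removes the phase ambiguity of QR so the result is Haar distributed.
    d = np.diagonal(r)
    return q * (d / np.abs(d))


def covariance_matrix(u, r):
    """Covariance matrix after the passive unitary u acts on squeezed vacua.

    Assumed conventions: vacuum covariance = identity, quadrature ordering
    (x_1..x_n, p_1..p_n). `r` is a scalar or a length-n array of squeezings.
    """
    n = u.shape[0]
    r = np.broadcast_to(np.asarray(r, dtype=float), (n,))
    sigma0 = np.diag(np.concatenate([np.exp(2 * r), np.exp(-2 * r)]))
    x, y = u.real, u.imag
    # Orthogonal symplectic 2n x 2n matrix representing u = x + i y.
    o = np.block([[x, -y], [y, x]])
    return o @ sigma0 @ o.T


def renyi2_entropy(sigma, modes):
    """S_2 = (1/2) log det of the covariance matrix reduced to `modes`."""
    n = sigma.shape[0] // 2
    modes = np.asarray(modes, dtype=int)
    idx = np.concatenate([modes, modes + n])  # x and p rows of the kept modes
    _, logdet = np.linalg.slogdet(sigma[np.ix_(idx, idx)])
    return 0.5 * logdet


rng = np.random.default_rng(0)
n, k, r = 8, 3, 0.5
sigma = covariance_matrix(haar_unitary(n, rng), r)
print("S_2 of first k modes:   ", renyi2_entropy(sigma, np.arange(k)))
print("S_2 of the complement:  ", renyi2_entropy(sigma, np.arange(k, n)))
```

Because the global state is pure, the two printed values should agree up to floating-point error, which is a convenient sanity check on the chosen conventions.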
Our main results involve the Rényi-2 entropy, and the following proposition allows us to use many of the results on the Rényi-2 entropy to bound the von Neumann entropy. Furthermore, we will also make use of the maximum Rényi-2 entropy to prove our later results on the Page correction.
The full proof of Proposition 1 is given in Appendix B. A tighter version of Eq. (2) was originally derived in Ref. [58,Eq. 15], but we will only need this weaker version. For completeness, we provide a different proof of Eq. (2). Eq. (1) is perhaps implicit in various results in Refs. [52,53,61,60,62], but we have not found it directly stated anywhere. The lower bound $S_1(U) \ge S_2(U)$ is a general property of the Rényi entropies [28,32], whereas the upper bound holds only for Gaussian states. While trivial upper bounds also exist in the general case, Eq. (2) is tighter. Intuitively, one may view this as an extension of the fact that for Gaussian states, the Rényi-2 and von Neumann entropies share many useful properties, such as strong subadditivity and others mentioned in Section 1.

Expectation value
Our first results concern the Rényi-2 Page curve, which is the expectation value of the Rényi-2 entanglement entropy as a function of the partition size ratio $\alpha = k/n$ and the squeezing strength $r$. Recall that the dependence of $S_2(U)$ on $n$, $k$, and $r$ is implicit. We find an exact formula as an infinite series for the Page curve in the limit $n \to \infty$.
Theorem 2. Let $C_\ell$ be the $\ell$th Catalan number, and let ${}_2F_1$ be the hypergeometric function [63][64][65][66][67]. Then
$$\lim_{n\to\infty} \frac{1}{n}\, \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) = a(r, \alpha) := \sum_{\ell=1}^{\infty} \frac{\tanh^{2\ell}(2r)}{2\ell} \left[ \alpha - C_\ell\, \alpha^{\ell+1}\, {}_2F_1(1 - \ell, \ell; \ell + 2; \alpha) \right].$$
This function is symmetric under $\alpha \mapsto 1 - \alpha$, and hence the formula holds when $\alpha$ is replaced with $\min(\alpha, 1 - \alpha)$. Furthermore, asymptotically in $n$,
$$\mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) = a(r, \alpha)\, n - b(r, \alpha) + o(1).$$

We plot the analytic Page curve for $r = 3/4$ in Fig. 1(a), and we confirm our results numerically in Fig. 1(b). The proof of Theorem 2, given in Appendix C.1, primarily uses two ingredients. The first ingredient is the asymptotic form of the Weingarten calculus for integrating over the unitary group with respect to the Haar measure [69,70]. From this we get an equation for the Page curve that is initially daunting. The second main ingredient is the fact that the Page curve must be symmetric under $\alpha \mapsto 1 - \alpha$ since the global state on the $n$ modes is pure [28]. Quite miraculously, this fact is enough to simplify the equation for the Page curve and arrive at Theorem 2.
The Page curve derived in Theorem 2 can be written as
$$\lim_{n\to\infty} \frac{1}{n}\, \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) = \sum_{\ell=1}^{\infty} \frac{\tanh^{2\ell}(2r)}{2\ell}\, f_\ell(\alpha),$$
where $f_\ell(\alpha) := \alpha - g_\ell(\alpha)$ and $g_\ell(\alpha)$ is a polynomial of degrees $\ell + 1$ through $2\ell$ in $\alpha$ (given in Eq. (C20) in Appendix C.1). Polynomials $f_\ell(\alpha)$ of this form are uniquely determined by the requirement that $f_\ell(\alpha) = f_\ell(1 - \alpha)$, which ensures that the Rényi-2 entropy of a subsystem is equal to that of its complement since we are considering pure states. It is from this requirement that we ultimately derive the Page curve. We show that the resulting $f_\ell(\alpha)$ can be understood as a good approximation to $f(\alpha) := \min(\alpha, 1 - \alpha)$ from below, which we will call the $\ell$th approximation. Indeed, the approximation is especially good near the endpoints $\alpha = 0$ and $\alpha = 1$, where the first $\ell$ derivatives of $f_\ell(\alpha)$ match those of $f(\alpha)$. As $\ell \to \infty$, the approximation improves such that $\lim_{\ell\to\infty} f_\ell(\alpha) = f(\alpha)$. This provides an interpretation of the derived form of the Page curve. The strength of the squeezing $r$ determines the weight that the Page curve has on the $\ell$th approximation to $f(\alpha)$. For small squeezing, only low-order approximations contribute, with the most dominant contribution being the parabolic shape $f_1(\alpha) = \alpha(1 - \alpha)$. When the squeezing is increased, there is more contribution from higher-order approximations, giving the Page curve more of the triangle shape of $f(\alpha)$. We see a manifestation of this interpretation in the limiting behavior of the Page curve at small and large squeezing. Meanwhile, from Eq. (1), the maximal Rényi-2 entropy is $\max_{U} \frac{1}{n} S_2(U) = f(\alpha) \log\cosh(2r)$. As stated, near the endpoints $\alpha = 0$ and $\alpha = 1$, $f_\ell(\alpha)$ is a very good approximation to $f(\alpha)$. Thus, regardless of the squeezing strength, when the subsystem size $k = \alpha n$ is small (or when its complement is small), the average entanglement is very close to maximal.
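As a worked illustration of how the symmetry requirement alone fixes these polynomials, consider the two lowest orders. The symbols $f_\ell$, $g_\ell$, and $x$ below are names assumed for this sketch, and the general-$\ell$ pattern in the last line is our own observation rather than a formula quoted from the paper.

```latex
% Lowest order: f_1(\alpha) = \alpha - c\,\alpha^2.
% Symmetry gives f_1(1) = f_1(0) = 0, hence c = 1:
f_1(\alpha) = \alpha - \alpha^2 = \alpha(1-\alpha).

% Next order: f_2(\alpha) = \alpha - a\,\alpha^3 - b\,\alpha^4.
% Symmetry gives f_2(1) = 0 and f_2'(1) = -f_2'(0) = -1:
a + b = 1, \qquad 3a + 4b = 2
\;\;\Longrightarrow\;\; a = 2,\; b = -1,
\qquad
f_2(\alpha) = \alpha - 2\alpha^3 + \alpha^4 = x + x^2,
\quad x := \alpha(1-\alpha).

% Pattern (assumption of this sketch): truncating the Catalan series of
% \min(\alpha, 1-\alpha) = \tfrac{1}{2}\bigl(1 - \sqrt{1 - 4x}\bigr) at order \ell,
f_\ell(\alpha) = \sum_{j=1}^{\ell} C_{j-1}\, x^{j},
\qquad C_0 = 1,\; C_1 = 1,\; C_2 = 2,\; \dots
```

Every term in the truncated sum is nonnegative, which is one way to see why each $f_\ell$ approximates $\min(\alpha, 1 - \alpha)$ from below.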
Unfortunately, we are unable to further simplify the infinite sum in Theorem 2 for general $\alpha$. However, the Page curve can be fully simplified at $\alpha = 1/2$, which is where the maximum for a fixed $r$ occurs, to $\log\cosh r$. Indeed, we find that $\lim_{n\to\infty} \frac{1}{n}\, \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) = \log\cosh r$ at $\alpha = 1/2$. From this and the maximum Rényi-2 entropy from Eq. (1), we also find the exact expression for the Page correction at $\alpha = 1/2$ to be $\frac{1}{2} \log\left(1 + \tanh^2 r\right)$. Let us formally state these results as follows.
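Corollary 3. At $\alpha = 1/2$, the Rényi-2 Page curve is
$$\lim_{n\to\infty} \frac{1}{n}\, \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) = \log\cosh r,$$
and the Page correction is
$$\lim_{n\to\infty} \frac{1}{n} \left[ \max_{U} S_2(U) - \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) \right] = \frac{1}{2} \log\left(1 + \tanh^2 r\right).$$

The proof of Corollary 3, given in Appendix C.2, is a straightforward consequence of a simplification of the hypergeometric function ${}_2F_1(a, 1 - a; c; 1/2)$ in terms of gamma functions due to Bailey's theorem [63][64][65][66][67]. Altogether, we have derived the exact formula [Eqs. (3) and (4)] for the Rényi-2 Page curve in the regime of equal squeezers as a series in $\tanh^2(2r)$ and $\alpha$. In the special case when $\alpha = 1/2$, we simplified the series to obtain an exact value of $\log\cosh r$. From this, we derived the Page correction, or information of the subsystem, at $\alpha = 1/2$ to be exactly $\frac{1}{2}\log\left(1 + \tanh^2 r\right)$. Furthermore, since the Page curve is concave in $\alpha$ while the maximum entropy is piecewise linear, this correction is maximized at $\alpha = 1/2$. A summary of the Page curve results thus far is provided in Fig. 1.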
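The closed-form values at $\alpha = 1/2$ give a convenient numerical check on the series. The sketch below evaluates a series of the form $\sum_\ell \frac{\tanh^{2\ell}(2r)}{2\ell} f_\ell(\alpha)$ with $f_\ell(\alpha) = \sum_{j \le \ell} C_{j-1} [\alpha(1-\alpha)]^j$. This explicit form is an assumption of the sketch: it is our reading of the structure described in the text (symmetric polynomials approximating $\min(\alpha, 1-\alpha)$ from below), not a transcription of the theorem. The function names are hypothetical.

```python
import math


def page_curve(r, alpha, lmax=2000):
    """Truncated series for the per-mode Renyi-2 Page curve (assumed form)."""
    t2 = math.tanh(2 * r) ** 2
    x = alpha * (1 - alpha)
    total = 0.0
    f_l = 0.0      # running f_l(alpha) = sum_{j <= l} C_{j-1} x^j
    term = x       # C_{l-1} x^l, starting at l = 1 (C_0 = 1)
    weight = 1.0   # becomes tanh^{2l}(2r) inside the loop
    for l in range(1, lmax + 1):
        f_l += term
        weight *= t2
        total += weight / (2 * l) * f_l
        # Catalan recurrence C_l = C_{l-1} * 2(2l - 1) / (l + 1), times one more x.
        term *= x * 2 * (2 * l - 1) / (l + 1)
    return total


def page_correction_half(r):
    """Per-mode Page correction at alpha = 1/2: maximum entropy minus average."""
    return 0.5 * math.log(math.cosh(2 * r)) - page_curve(r, 0.5)


r = 0.7
print(page_curve(r, 0.5), math.log(math.cosh(r)))
print(page_correction_half(r), 0.5 * math.log(1 + math.tanh(r) ** 2))
```

Each printed pair compares the truncated series with the corresponding closed form stated in the text; the term-by-term recurrence avoids forming large Catalan numbers explicitly.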
We now shift our attention to the constant term in the Page curve. Theorem 2 states that asymptotically in $n$, $\mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) = a(r, \alpha)\, n - b(r, \alpha) + o(1)$, and it provides the exact expression for $a(r, \alpha)$. We further find the following result for $b(r, \alpha)$, which is proven in Appendix C.3.
Note that $b(r, \alpha) \ge 0$ for all $r$ and $\alpha$, and therefore $\mathbb{E}\, S_2(U) \le a(r, \alpha)\, n$ asymptotically. At $\alpha = 1/2$, this simplifies to $b(r, 1/2) = \frac{1}{4} \log\cosh(2r)$. The proof of Proposition 4 is very similar to the proof of Theorem 2, again crucially using the symmetry $\alpha \mapsto 1 - \alpha$ to simplify the expression coming from the Weingarten function. However, there is one substantial difficulty in the proof of Proposition 4 that does not occur in that of Theorem 2. In this case, the symmetry $\alpha \mapsto 1 - \alpha$ is almost enough to fully determine $b$; however, there are constants that cannot be determined by the symmetry alone. We derive expressions for these constants that are complicated sums involving permutations and Catalan numbers. Using objects that arise in bioinformatics when studying gene orders, called breakpoint graphs [71], Ref. [72] evaluates these sums, hence completing the proof of Proposition 4. These constants are discussed more in Appendix C.3.
In summary, we have derived an explicit form of the asymptotic Rényi-2 entropy Page curve for all $r$ and $\alpha$ up to corrections that vanish as $n \to \infty$.

Variance and typicality
Next, we shift our attention to the variance of the Rényi-2 entropy so as to make statements about typicality of entanglement; that is, how different a random state's entanglement is from the average. Using the results for the Rényi-2 entropy, we will also be able to prove some weaker results for the von Neumann entropy. The typicality results presented below are summarized in Table 1.
Typicality is of interest because it characterizes the applicability of statistical averages. Indeed, statistical mechanics often relies on quantities being typical so that thermodynamic average quantities, such as average energy and average pressure, can accurately represent their true values. In order to quantify the deviation from average, we consider two measures of deviation corresponding to multiplicative and additive distance. If the multiplicative distance between a quantity and its average vanishes in the thermodynamic limit, then that quantity is called weakly typical. If the additive distance vanishes in this limit, then that quantity is called strongly typical. With this intuition, we now formally define weak and strong typicality following Ref. [27].

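Definition 5 (Typicality). Let $X$ be a nonnegative random variable on the unitary group $\mathrm{U}(n)$, and denote its value at $U \in \mathrm{U}(n)$ by $X(U)$. $X$ is called weakly typical if for any constant $\epsilon > 0$,
$$\lim_{n\to\infty} \Pr_{U \in \mathrm{U}(n)}\left[ \left| \frac{X(U)}{\mathbb{E}_{U \in \mathrm{U}(n)} X(U)} - 1 \right| > \epsilon \right] = 0.$$
$X$ is called strongly typical if for any constant $\epsilon > 0$,
$$\lim_{n\to\infty} \Pr_{U \in \mathrm{U}(n)}\left[ \left| X(U) - \mathbb{E}_{U \in \mathrm{U}(n)} X(U) \right| > \epsilon \right] = 0.$$

If $\mathbb{E}_{U \in \mathrm{U}(n)} X(U)$ does not decay as $n \to \infty$, then strong typicality clearly implies weak typicality. We will be concerned with typicality of $S_1$ and $S_2$, the von Neumann and Rényi-2 entropy respectively. One can compute the variance of the entropy over the Haar measure and then apply Chebyshev's inequality to obtain typicality results. Therefore, we now focus on the variance. We first find the general form of the asymptotic Rényi-2 variance in the equal squeezing regime and show that it is independent of $n$. Recall that the dependence of $S_2(U)$ on $n$, $k$, and $r$ is implicit.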
The proof of this theorem, given in Appendix D, is very similar to the proof of Theorem 2. To prove this theorem, we again crucially use that $\operatorname{Var} S_2(U)$ must be symmetric under $\alpha \mapsto 1 - \alpha$ since the full state on the $n$ modes is pure. Interestingly, in contrast to Theorem 2, where we found that the Page curve grows linearly with $n$, the asymptotic variance is independent of $n$. Indeed, Theorem 6 is the bosonic analogue of the result that the variance for fermionic Gaussian states is asymptotically constant [26].
From Theorem 6, we find typicality in certain regimes as an immediate corollary. In particular, since the variance is independent of $n$ while the average grows with $n$, the entanglement is always weakly typical. Furthermore, the variance is asymptotically zero if $r \in o(1)$ and/or $\alpha \in o(1)$, and typicality is therefore strong in those regimes. Altogether, Theorem 6 and Chebyshev's inequality lead to the following corollary.
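The Chebyshev step behind this reasoning can be written out explicitly. In the sketch below, $\epsilon$ is an arbitrary fixed constant, and the scalings (variance bounded in $n$, mean growing linearly in $n$) are the ones described in the text for fixed squeezing and fixed partition ratio.

```latex
% Weak typicality: multiplicative deviation, with Var = O(1) and mean = \Theta(n).
\Pr\!\left[\,\left|\frac{S_2(U)}{\mathbb{E}\,S_2(U)} - 1\right| \ge \epsilon\right]
\;\le\; \frac{\operatorname{Var} S_2(U)}{\epsilon^2 \left(\mathbb{E}\,S_2(U)\right)^2}
\;=\; \frac{O(1)}{\epsilon^2\, \Theta(n^2)}
\;\xrightarrow[\;n\to\infty\;]{}\; 0 .

% Strong typicality: additive deviation, which needs the variance itself to vanish.
\Pr\!\left[\,\left|S_2(U) - \mathbb{E}\,S_2(U)\right| \ge \epsilon\right]
\;\le\; \frac{\operatorname{Var} S_2(U)}{\epsilon^2}
\;\xrightarrow[\;n\to\infty\;]{}\; 0
\quad\text{whenever}\quad \operatorname{Var} S_2(U) \to 0 .
```

The second bound is uninformative when the variance tends to a nonzero constant, which is why the constant-variance regime requires a separate argument.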

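Corollary 7. The Rényi-2 entropy $S_2$ is weakly typical for all $r$ and $\alpha$, and it is strongly typical if $r \in o(1)$ and/or $\alpha \in o(1)$. On the other hand, if $r$ and $\alpha$ are constant in $n$, then $\lim_{n\to\infty} \operatorname{Var}_{U \in \mathrm{U}(n)} S_2(U)$ is a constant independent of $n$ (but depends on $r$ and $\alpha$).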
In summary, we have shown weak typicality in the Rényi-2 entropy for all $r$ and $\alpha$ and strong typicality whenever $k \in o(n)$. On the other hand, if $r$ and $\alpha$ do not tend to zero with increasing $n$, the variance converges to a constant value independent of $n$. Hence, we cannot use Chebyshev's inequality to prove strong typicality in this case. Notably, an asymptotically constant variance does not necessarily imply an absence of strong typicality either; that is, the probability that the entropy deviates from its average can scale as $1/n^2$, but the entropy can deviate by an amount proportional to $n$, thus resulting in a constant variance. We are therefore unable to make a definitive statement about strong typicality in the $\alpha \in \Theta(1)$ and $r \in \Omega(1)$ case.
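A toy example makes the last point concrete. The random variable $X_n$, the baseline $\mu_n$, and the constant $c$ below are hypothetical and are not tied to the entropy itself; they only show that a constant variance is compatible with strong typicality.

```latex
% A rare but large deviation:
X_n =
\begin{cases}
\mu_n + c\,n & \text{with probability } 1/n^2,\\[2pt]
\mu_n        & \text{with probability } 1 - 1/n^2 .
\end{cases}

% Mean and variance:
\mathbb{E}\,X_n = \mu_n + \frac{c}{n},
\qquad
\operatorname{Var} X_n = \frac{(c\,n)^2}{n^2} - \frac{c^2}{n^2}
= c^2\left(1 - \frac{1}{n^2}\right) \;\longrightarrow\; c^2 .

% Yet for any fixed \epsilon > 0 and all n large enough that |c|/n < \epsilon,
\Pr\!\left[\,|X_n - \mathbb{E}\,X_n| > \epsilon\right] \le \frac{1}{n^2} \;\longrightarrow\; 0 .
```

So $X_n$ is strongly typical in the sense of the additive criterion even though its variance does not vanish, and Chebyshev's inequality alone cannot detect this.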
In the next corollary, we will use these results to address typicality as measured by the von Neumann entropy. Specifically, we will use Proposition 1 to show weak typicality of the von Neumann entropy as long as the subsystem size scales sublinearly with the system size.
Table 1: A summary of the current status of rigorous results on typical entanglement in Gaussian bosonic systems, where in this table we assume single-mode squeezing parameters that are independent of $n$. Strong and weak typicality are defined in Definition 5. Note that "weak*" indicates that the result is not fully proven, but depends on Conjecture 10. Where we say "weak", we have not ruled out the possibility that the typicality is also strong. The total number of modes is denoted by $n$, and $0 \le k \le n$ is the number of modes in the subsystem. "Equal squeezing" refers to the case when each mode is initially squeezed with the same strength, whereas "unequal squeezing" refers to the general case when each mode can be squeezed independently. The two leftmost columns come from Corollaries 7 and 8 and Remark 11. The rightmost column follows entirely from the results of Ref. [54]. Prior to Ref. [54], Refs. [52,53] proved strong typicality in the regime $k \in O(1)$.
We again emphasize that our typicality results thus far, which are summarized in Table 1, only apply to the case when the initial squeezing strength on each mode is the same. On the other hand, Ref. [54] proves strong typicality when $k \in o(n^{1/3})$ for both the von Neumann and Rényi-2 entropies in the general case when squeezing strengths can be different on different modes. It is for this reason that our results do not supplant those of Ref. [54] in general, but rather only in certain regimes. To the best of our knowledge, our results are the first to address the $k \in \Omega(n^{1/3})$ regime.
Ultimately, we would like to determine in exactly which regimes strong and weak typicality occur and do not occur. Our results thus far almost complete the story for the regime of equal squeezing when typicality is measured with the Rényi-2 entropy, since we have proven strong typicality when $k \in o(n)$ and weak typicality when $k \in \Theta(n)$; the missing piece is whether or not typicality is strong when $k \in \Theta(n)$. However, the story is even more incomplete for the regime of equal squeezing when typicality is measured with the von Neumann entropy, though we made some progress by proving that typicality is at least weak whenever $k \in o(n)$. In this regime, the best known result for strong typicality is when $k \in o(n^{1/3})$ as proven in Ref. [54]. Indeed, this is also currently the best known result in the regime of unequal squeezing. In Section 5, we will use our results on equal squeezing typicality to add to the story for unequal squeezing.

Generalizing to unequal squeezing
So far, we have considered the restricted setting where each mode is initially squeezed with strength $r_i = r$ for some $r \in \mathbb{R}$. We now generalize by allowing the squeezing strengths to be different on each mode. As such, for the remainder of this section, the squeezings will be $(r_1, \dots, r_n)$, where each $r_i \in \mathbb{R}$, and $r_{\max}$ and $r_{\min}$ are defined as $r_{\max} := \max_i |r_i|$ and $r_{\min} := \min_i |r_i|$.
To begin, we focus on the Rényi-2 Page curve in the regime of unequal squeezing. When squeezing is small, we can utilize the equal squeezing Page curve in Theorem 2 to compute the Page curve for unequal squeezing. Recall that the dependence of $S_2(U)$ on $n$, $k$, and $r$ is implicit.
Proof. The Rényi-2 entropy of a Gaussian state with covariance matrix $\sigma$ is proportional to $\log\det\sigma$. The logarithm is real analytic, and $\det\sigma$ is analytic in the parameters of the initial covariance matrix $\sigma_0$. Hence, $S_2$ can be written as a power series in the $r_i$. There exists a passive Gaussian unitary, specifically a product of one-mode phase shifters, that acts on $\sigma_0$ via the transformation $r_i \to -r_i$. Hence, by the translational invariance of the Haar measure (that is, the invariance of the Haar measure under the application of any fixed unitary), the power series must be an expansion in the $r_i^2$. Furthermore, there exists a passive Gaussian unitary, specifically a product of beamsplitters, that acts on $\sigma_0$ by permuting the modes, so the averaged power series must be symmetric in the $r_i$ and its lowest-order term takes the form $c(n, \alpha) \sum_i r_i^2$. When all the squeezing strengths are equal, the power series must reduce to Theorem 2, which fixes $c(n, \alpha)$ to be $2\alpha(1 - \alpha)$. The next term is $O(r_{\max}^4)$, but what is the $n$ dependence? We will show that the $n$ dependence is at most linear, proving that the remaining terms in the power series are $O(n\, r_{\max}^4)$.
Recall that $S_2(U) = \frac{1}{2} \operatorname{Tr} \log \sigma(U)$, where $\sigma(U)$ is the covariance matrix for the state generated by $U$ from the initial product squeezed state $\sigma_0$. Since the log function is concave, Jensen's inequality implies that $\mathbb{E}\, S_2(U) \le \frac{1}{2} \operatorname{Tr} \log \mathbb{E}\, \sigma(U)$.

One particularly interesting application of this corollary is when each $r_i \in O(1/\sqrt{n})$. In this case, the average total number of bosons in the $n$ modes is $N = \sum_i \sinh^2(r_i) \in O(1)$. Thus, when one considers a constant number of bosons in the system as the number of modes is taken to infinity, one finds the Page curve to be $2\alpha(1 - \alpha) N$.
We now shift our focus to entanglement typicality in the regime of unequal squeezing. The results described in the remainder of this section are summarized in Table 1. Ideally, we would like to make further statements about entanglement in the regime of unequal squeezing by using our previous results on equal squeezing. One way to potentially proceed is to use the equal squeezing results to bound the unequal squeezing quantities. Intuitively, we expect that $\mathbb{E}\, S_2(U)$ for arbitrary unequal squeezing is upper bounded by $\mathbb{E}\, S_2(U)|_{\text{all } r_i = r_{\max}}$ and lower bounded by $\mathbb{E}\, S_2(U)|_{\text{all } r_i = r_{\min}}$. In other words, by increasing all of the squeezing strengths until they all are equal, the average entanglement will increase. In this spirit, we make the following conjecture.
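Conjecture 10. Let $r_1, \dots, r_n \in \mathbb{R}$, and $k, n \in \mathbb{N}$ with $k \le n$. Then, for any $i$, $\frac{\partial}{\partial (r_i^2)}\, \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) \ge 0$.

Conjecture 10 seems intuitive: by increasing the magnitude of any individual squeezing strength, the number of bosons in the system increases, and therefore it would seem surprising for the average entanglement to decrease. We note that, somewhat counterintuitively, one can find explicit unitaries $U$ and squeezing configurations for which $\frac{\partial}{\partial (r_i^2)} S_2(U) < 0$ (we provide an example in our code repository [68]), and hence the presence of the $\mathbb{E}$ is necessary for the conjecture. Despite its intuitiveness, we have been unable to rigorously prove Conjecture 10. The derivative of $S_2(U) = \frac{1}{2} \operatorname{Tr} \log \sigma(U)$ is $\frac{\partial S_2(U)}{\partial (r_i^2)} = \frac{1}{2} \operatorname{Tr}\left[ \sigma(U)^{-1}\, \frac{\partial \sigma(U)}{\partial (r_i^2)} \right]$. The difficulty in computing the expectation value over $U \in \mathrm{U}(n)$ arises due to the presence of the inverse $\sigma(U)^{-1}$. Nonetheless, under the assumption that Conjecture 10 is true, we can immediately upper and lower bound the Rényi-2 Page curve for an arbitrary squeezing configuration $(r_1, \dots, r_n)$ by using Theorem 2 with $r = r_{\max}$ and $r = r_{\min}$ respectively, $a(r_{\min}, \alpha) \le \lim_{n\to\infty} \frac{1}{n}\, \mathbb{E}_{U \in \mathrm{U}(n)} S_2(U) \le a(r_{\max}, \alpha)$, where $a(r, \alpha)$ denotes the equal-squeezing Page curve of Theorem 2. Furthermore, the conjecture implies that From this, we can also make statements on weak typicality for unequal squeezers by bounding the variance.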
We can therefore upper bound the variance by upper bounding the first term with Eq. (21) and lower bounding the second term with Eq. (19).
Proof. We can bound the variance of the Rényi-2 entropy as From Theorem 6, the $\mathbb{E}$ can be brought inside the parentheses in the first term. Then, the right-hand side can be computed using Theorem 2, which gives $\lim_{n\to\infty}$ For the von Neumann entropy, we again make use of Eq. (15), which gives $\lim_{n\to\infty}$ , which goes to zero if $\alpha \in o(1)$.
In summary, we have used our results from Sections 3 and 4 on entanglement in the equal squeezing regime to prove various statements in the unequal squeezing regime. In particular, under the assumption that Conjecture 10 is true, we prove both the Rényi-2 and von Neumann entropies are weakly typical whenever the subsystem size scales as $k \in o(n)$. The best known result for the presence of strong typicality in the unequal squeezing regime is when $k \in o(n^{1/3})$ as proven in Ref. [54]. The current status of rigorous results on typical entanglement in Gaussian bosonic systems is summarized in Table 1.

Conclusion
In this work, we studied the average and variance of the Rényi-2 and von Neumann entropies in random bosonic Gaussian systems. We computed the Rényi-2 Page curve and Page correction when all the initial squeezing strengths are equal, and we proved various results on the typicality of the Rényi-2 and von Neumann measures of entanglement. Given that the Rényi-2 entropy is a function of only the purity, it is often tractable to measure experimentally. It would be interesting to compare the analytic formula in Theorem 2 against measurements from an experimental Gaussian Boson Sampling device to determine how well the device generates and maintains bipartite entanglement.
We have identified several open problems that would generalize and expand our results. One such open problem is to prove Conjecture 10, which would allow our results on the Page curve to apply more generally. Perhaps the most important remaining task is to complete Table 1 by proving typicality of entanglement in the remaining regimes, such as the regime of unequal squeezing and the von Neumann entropy.
For the latter, we note two potentially fruitful avenues. The first comes from the formula for the von Neumann entropy of a Gaussian state with covariance matrix given in Ref. [73]. Let := √ det = e 2 , where 2 is the Rényi-2 entropy of . Ref. [73] derives expressions for all the Rényi-entropies, including the von Neumann entropy, as functions of . In our work, we found an expression for E 2 ( ) in the regime of equal squeezers by expanding 2 ( ) in a power series and exactly computing asymptotic expectation values over the unitary Haar measure. To find the Rényi-entropy E ( ), one could similarly attempt to expand in powers of , and therefore compute E ( ) by computing E e 2 ( ) for various values of . A second potential way of computing E 1 ( ) is similar, where one could use the formula where Ω is the 2 × 2 symplectic form given in Eq. (A1) [74,60,75]. The equations resulting from using these methods with the Weingarten calculus may at first glance appear too difficult to simplify, as was the case in this work. However, it would be interesting to see whether the symmetry of the entropy under exchanging the subsystem with its complement is enough, as it was in this work, to reduce the complicated Weingarten expressions to something much more tractable.
In this paper, we have only considered truly Haar-random unitaries. One important question concerns how these results translate to the case where one uses random local passive Gaussian gates to generate random unitary circuits of finite depth. Indeed, an interesting open problem is to determine the depth dependence of entanglement, sampling complexity, and gate complexity in linear optical circuits. Sampling complexity refers to the classical complexity of generating samples from the output probability distribution defined by a fixed depth linear optical circuit, and gate complexity refers to the minimum number of nearest-neighbor beamsplitters required to generate the probability distribution. Currently, the precise relationship between entanglement and complexity is largely unknown. Numerical analyses of entanglement dynamics in linear optical circuits have been reported in Refs. [76,77]. Partial analytical work was done in Ref. [78], but only in the regime where one or a small number of modes are not initially vacuum. By contrast, a typical Gaussian Boson Sampling experiment initially squeezes many or all of the modes. Information on such entanglement growth could yield insights on implementations of Gaussian Boson Sampling experiments as well as the complexity of computing output probabilities from such experiments. On the complexity side, many recent works have studied classical simulation and classical sampling complexity of linear optical circuits in certain regimes of low depth and the phase transition at which the complexity passes from easy to hard [79][80][81][82][83]. Indeed, both entanglement and complexity are expected to grow with depth, and further study may reveal that the relationship is even more intimate.
In this work, we have characterized the entanglement properties of Gaussian states such as those that arise in Gaussian Boson Sampling. In this setting, we also know that sampling from Fock basis measurements of the Gaussian state is computationally intractable. It remains an exciting question to better understand the role that entanglement plays in this context. An important aspect of this direction is to understand how entanglement and measurement bases interact. After all, some form of non-Gaussianity is crucial to generate complexity in bosonic computations [41].

A Preliminaries
In this preliminary appendix, we will establish some notation and equations that will be used throughout the rest of the appendices. In particular, in Appendix A.1, we review bosonic Gaussian states and describe our setup. In Appendix A.2, we describe integration over the unitary group with the Weingarten calculus. Finally, in Appendix A.3, we restrict our attention to the case when all initial squeezing strengths are equal and derive a series formula for the Rényi-2 entropy that is used in many of our proofs.

A.1 Bosonic Gaussian states
Here, we describe the setup and fix the notation required for the proofs of our main results. We consider a setup very similar to the one described in Ref. [54] and use much of the same notation. For a review of bosonic Gaussian states, we recommend Ref. [60]. Since we are only interested in entanglement properties, the first moments (the displacements) of the Gaussian states will be irrelevant, and we will ignore them. We consider a system of bosonic modes. Each mode 1 ≤ ≤ is initially in a squeezed state with squeezing strength ∈ R. Define the diagonal matrix = diag( e 2 1 , . . . , e 2 ). The initial state can be represented by the covariance matrix 0 = ⊕ −1 . Define = 1 2 ( − −1 ) and = 1 2 ( + −1 ). The set of all passive Gaussian unitaries (that is, energy-conserving unitaries) acting on modes is Sp(2 ) ∩ O(2 ), which acts on the covariance matrix by conjugation. Here O(2 ) is the orthogonal group of 2 × 2 matrices, and Sp(2 ) is the real symplectic group of 2 × 2 matrices defined with respect to the symplectic form We evolve the initial state with covariance matrix 0 by a passive Gaussian unitary, which corresponds to a ∈ U( ). The resulting state is ˜( ) := ( ) 0 ( † ) = ( ) 0 ( ) . Define the × matrix and the × projector Π as Then let ^ := ⊕ and Π = Π ⊕ Π. The covariance matrix corresponding to the reduced state on the first ≤ modes is ( ) := ^ ˜( ) ^ . Denote the element-wise complex conjugate of the unitary by ¯ , and the conjugate transpose by † . By simply doing the matrix multiplication, one finds that Note that ( ) is a covariance matrix on modes and is correspondingly a positive 2 × 2 matrix. Throughout this paper, we define := / . One can derive from Eq. (A6) that the average over the Haar measure is E † = E ¯ ¯ † = Tr I × , whereas all the other terms have expectation value 0 since they do not contain an equal number of 's and ¯ 's. Therefore, The symplectic eigenvalues of ( ) are the positive eigenvalues of iΩ ( ).
There are symplectic eigenvalues labeled as for 1 ≤ ≤ . Let the von Neumann entropy of the reduced state be 1 ( ), and the Rényi-2 entropy of the reduced state be 2 ( ).
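The setup above can be sketched numerically as follows. This is an illustrative sketch under stated assumptions rather than the implementation in our code repository: the function names, the ordering of quadratures as (x_1, ..., x_n, p_1, ..., p_n), the convention that the vacuum covariance matrix is the identity, and the QR-based Haar sampler are all choices made here for illustration. The Rényi-2 entropy is evaluated as half the log-determinant of the reduced covariance matrix.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # rescale column phases so the law is Haar

def passive_symplectic(u):
    """Real 2n x 2n orthogonal symplectic matrix representing U = X + iY."""
    x, y = u.real, u.imag
    return np.block([[x, -y], [y, x]])

def reduced_covariance(u, r, modes):
    """Covariance matrix of the listed modes after applying U to squeezed vacua."""
    r = np.asarray(r, dtype=float)
    modes = np.asarray(modes, dtype=int)
    n = len(r)
    sigma0 = np.diag(np.concatenate([np.exp(2 * r), np.exp(-2 * r)]))
    o = passive_symplectic(u)
    sigma = o @ sigma0 @ o.T
    idx = np.concatenate([modes, modes + n])  # keep x and p rows of the subsystem
    return sigma[np.ix_(idx, idx)]

def renyi2(u, r, modes):
    """Renyi-2 entropy of the reduced state, (1/2) log det of its covariance."""
    _, logdet = np.linalg.slogdet(reduced_covariance(u, r, modes))
    return 0.5 * logdet

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 6
    u = haar_unitary(n, rng)
    r = rng.uniform(-1.0, 1.0, n)
    # For a pure global state, a subsystem and its complement have equal entropy.
    print(renyi2(u, r, [0, 1]), renyi2(u, r, [2, 3, 4, 5]))
```

The two printed values should agree up to floating-point error, reflecting the purity of the global state.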

A.2 Weingarten calculus
Since we are interested in average entanglement, we will be averaging over the unitary group U( ) with respect to the unique unit normalized Haar measure. To do so, we will use the Weingarten calculus [69,70]. For a matrix , let denote the entry in row and column . Then, where denotes the permutation group on elements. Wg is called the Weingarten function. In our proofs, we will need the asymptotic form of the Weingarten function, which is given by where | | denotes the minimum number of transpositions needed to generate the permutation , = (2 )!/ !( + 1)! is the th Catalan number, and is a product of cycles of length | |.
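As a sanity check on the asymptotic form, the following sketch evaluates the standard leading-order expression for the Weingarten function, namely n^(-q-|σ|) times the product over cycles of (-1)^(|C_i|-1) c_(|C_i|-1), and compares it with the well-known exact values on the permutation group of two elements. Representing a permutation by its list of cycle lengths, as well as the function names, are assumptions made for this illustration.

```python
from math import comb

def catalan(q):
    """q-th Catalan number, (2q)! / (q! (q+1)!)."""
    return comb(2 * q, q) // (q + 1)

def weingarten_asymptotic(cycle_lengths, n):
    """Leading-order Weingarten function for a permutation with this cycle type."""
    q = sum(cycle_lengths)
    # Minimum number of transpositions needed to generate the permutation.
    transpositions = sum(c - 1 for c in cycle_lengths)
    coeff = 1
    for c in cycle_lengths:
        coeff *= (-1) ** (c - 1) * catalan(c - 1)
    return coeff * float(n) ** (-(q + transpositions))

def weingarten_exact_q2(cycle_lengths, n):
    """Exact Weingarten function on two elements (identity or transposition)."""
    if sorted(cycle_lengths) == [1, 1]:
        return 1.0 / (n**2 - 1)
    if list(cycle_lengths) == [2]:
        return -1.0 / (n * (n**2 - 1))
    raise ValueError("only cycle types [1, 1] and [2] are supported")

if __name__ == "__main__":
    for n in (10, 100, 1000):
        for cycle_type in ([1, 1], [2]):
            ratio = weingarten_exact_q2(cycle_type, n) / weingarten_asymptotic(cycle_type, n)
            print(n, cycle_type, ratio)  # tends to 1 as n grows
```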

A.3 Series formula for the Rényi-2 entropy
In this subappendix, we Taylor expand $S_2(U) = \frac{1}{2}\operatorname{Tr}\log\sigma(U)$ and derive a series formula for the Rényi-2 entropy when all the initial squeezing values are equal. Hence, for each $1 \le i \le n$, we set $r_i = r$. Crucially, the resulting formula is a series in the squeezing strength $r$, and the effect of the unitary $U$ is separated from that of the squeezing strength $r$.
We would like to apply the Taylor series for the matrix logarithm, and hence must first consider its convergence. We find that and therefore the Taylor series for log, converges for all | | < := 1 2 log 2. Since ( ) is a positive, real symmetric matrix, this expression is indeed real and nonnegative. To make this work for all ∈ R, we can let ≥ 1 and consider which follows from the fact that the determinant of products is equal to the product of determinants. For any given ∈ R, we can choose large enough such that ⃦ ⃦ 1 ( ) − I ⃦ ⃦ < 1 and therefore the Taylor series for log can be used. We therefore find that for large enough , When all = , simplifies to = sinh(2 )I × and to = cosh(2 )I × , and therefore ( ) simplifies to ( ) = cosh(2 )I 2 ×2 + sinh(2 ) , where With Eq. (A19), Of course, 2 ( ) is independent of the choice of , and hence the dependence has dropped out. The only thing left to compute is Tr ℓ . Since we are now only dealing with traces, we can replace with Π in . This nicely simplifies some formulas, since Π is a square matrix, and indeed a projector, whereas is a rectangular matrix. Henceforth, we will therefore let where = Re(¯ †) and = Im(¯ †). Then, doing the matrix multiplication, We then notice that 2 = It is then also easy to verify that this recurrence relation is solved by = Re( ) and = Im( ). Finally, Tr 2 = 2 Tr .
Using this recurrence, we can do the matrix multiplication 2 +1 = 2 to find that Tr 2 +1 = 0. Hence, we only need to worry about even powers of , giving We then use that Tr 2 = 2 Tr to find that Furthermore, since is Hermitian, Tr = Tr¯. Therefore, we arrive at where in the last step we used the Taylor expansion of log cosh(arctanh ) in the variable = tanh (2 ). Eqs. (A32) and (A33) hint at why equal initial squeezings simplify the problem of studying averaged entanglement properties. Specifically, the contribution from the squeezing strength and the contribution from the unitary are separated. Thus, to determine averaged entanglement properties, we only need to deal with the matrix = Π Π¯¯Π.
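The series formula can be checked numerically against the direct log-determinant. The sketch below is an illustration under our reading of the derivation above: it takes the Rényi-2 entropy of the first k modes to be the sum over ℓ ≥ 1 of tanh(2r)^(2ℓ) / (2ℓ) times (k − Tr M^ℓ), where M = B B† and B is the upper-left k × k block of U Uᵀ. The function names, the quadrature ordering, and the truncation order are assumptions made for illustration.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def renyi2_direct(u, r, k):
    """(1/2) log det of the reduced covariance matrix, equal squeezing r."""
    n = u.shape[0]
    x, y = u.real, u.imag
    o = np.block([[x, -y], [y, x]])
    sigma0 = np.diag(np.concatenate([np.full(n, np.exp(2 * r)),
                                     np.full(n, np.exp(-2 * r))]))
    sigma = o @ sigma0 @ o.T
    idx = np.concatenate([np.arange(k), n + np.arange(k)])
    return 0.5 * np.linalg.slogdet(sigma[np.ix_(idx, idx)])[1]

def renyi2_series(u, r, k, order=200):
    """Truncated series in tanh(2r), with the unitary entering only via Tr M^l."""
    b = (u @ u.T)[:k, :k]            # upper-left block of U U^T
    m = b @ b.conj().T               # Hermitian, eigenvalues in [0, 1]
    eig = np.linalg.eigvalsh(m)
    t = np.tanh(2 * r)
    total = 0.0
    for ell in range(1, order + 1):
        total += t ** (2 * ell) / (2 * ell) * (k - np.sum(eig ** ell))
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    u = haar_unitary(7, rng)
    print(renyi2_direct(u, 0.3, 3), renyi2_series(u, 0.3, 3))
```

For moderate squeezing the two values agree to high precision. For large squeezing, tanh(2r) approaches 1 and many more terms are needed before the truncated series converges.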

B Rényi-2 and von Neumann entropies -Proof of Proposition 1
In this appendix, we prove Proposition 1. We derive the maximum of the Rényi-2 and von Neumann entropies, and the former will be of use later when we derive the Rényi-2 Page correction. Furthermore, we prove that the von Neumann entropy can be bounded by the Rényi-2 entropy. In this way, our results on the Rényi-2 Page curve can be used to bound the von Neumann Page curve. We begin by proving that the maximum of the Rényi-2 entropy is Proof. Recall that = † , where = Π¯ †Π. Therefore, is a nonnegative operator, and Tr ℓ ≥ 0. The proposition then immediately follows from Eq. (A32) if we can show that for every ≤ 1/2, there exists a unitary such that = 0. The case when > 1/2 is taken care of by the fact that the Rényi-2 entropy is symmetric under ↦ → 1 − since the global state on the modes is pure. Therefore, we now assume that ≤ 1/2, and we show that there are unitaries ∈ U( ) such that = 0. Since † is nonnegative, we need to prove that there exists a unitary such that = Π¯ †Π = 0. Hence we must prove that there exists a such that (¯ †) = 0 for all 1 ≤ , ≤ = . Equivalently, we can conjugate the expression, giving (¯ †) = ∑︀ =1 = 0. Therefore, we just need to find a set of = orthonormal vectors = {| 1 ⟩ , . . . | ⟩} in C such that ⟨¯| ⟩ = 0. Let | ⟩ = | ⟩ + i | ⟩ for real vectors | ⟩ and | ⟩. Since is orthonormal, we find that The condition that ⟨¯| ⟩ = 0 implies that Hence, we just need to find vectors | ⟩ ∈ R and vectors | ⟩ ∈ R satisfying This immediately means that condition 3 and 4 are satisfied, because ⟨ | ⟩ = 0 for all and . Furthermore, condition 1 is satisfied because ⟨ | ⟩ + ⟨ | ⟩ = /2 + /2 = . Finally, condition 2 is satisfied because ⟨ | ⟩ − ⟨ | ⟩ = /2 − /2 = 0.
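One explicit family of unitaries saturating the maximum can be written down directly. The sketch below is one possible construction consistent with the conditions in the proof, not necessarily the one intended in the original argument: for each of the first k rows it uses the vector (e_i + i e_{k+i})/√2, whose real and imaginary parts are orthogonal and each have squared norm 1/2. The function names and the quadrature ordering are assumptions made for illustration.

```python
import numpy as np

def max_entangling_unitary(n, k):
    """Unitary whose block (U U^T)[:k, :k] vanishes; requires 2k <= n."""
    if 2 * k > n:
        raise ValueError("construction requires k <= n/2")
    u = np.eye(n, dtype=complex)
    s = 1 / np.sqrt(2)
    for i in range(k):
        # Rows i and k+i span the same plane and are mutually orthogonal.
        u[i, i], u[i, k + i] = s, 1j * s
        u[k + i, i], u[k + i, k + i] = s, -1j * s
    return u

def renyi2(u, r, k):
    """(1/2) log det of the reduced covariance matrix, equal squeezing r."""
    n = u.shape[0]
    x, y = u.real, u.imag
    o = np.block([[x, -y], [y, x]])
    sigma0 = np.diag(np.concatenate([np.full(n, np.exp(2 * r)),
                                     np.full(n, np.exp(-2 * r))]))
    sigma = o @ sigma0 @ o.T
    idx = np.concatenate([np.arange(k), n + np.arange(k)])
    return 0.5 * np.linalg.slogdet(sigma[np.ix_(idx, idx)])[1]

if __name__ == "__main__":
    n, k, r = 5, 2, 0.7
    u = max_entangling_unitary(n, k)
    print(np.abs((u @ u.T)[:k, :k]).max())          # vanishing block
    print(renyi2(u, r, k), k * np.log(np.cosh(2 * r)))  # entropy vs. maximum
```

With this unitary, the Rényi-2 entropy of the first k modes equals k log cosh(2r), which is the maximal value for k ≤ n/2 under the conventions assumed here.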
Finally, we prove the bound which was first derived in Ref. [58,Eq. 15].

C.1 Linear term -Proof of Theorem 2
Theorem 2 concerns the expectation value of 2 ( ) over U( ) when all the initial squeezing values are equal. From Eq. (A32), we see that it only remains to compute E Tr ℓ . Writing the matrix multiplication of ℓ = (Π Π¯¯Π) ℓ in terms of the matrix entries of and Π and simplifying, we find that We note that this is a simple result of matrix multiplication. The restriction on the and ′ indices to {1, . . . , } is a result of the fact that Π , = 0 for all > . Applying Eq. (A6), we immediately find that E ∈U( ) Simplifying Eq. (C2) at first seems impossible, but it will turn out that we do not need to. All we need to learn from it is the following lemma.
Lemma C.1. Fix a positive integer ℓ. There exist coefficients (ℓ) for ∈ {ℓ + 1, ℓ + 2, . . . , 2ℓ} such that Proof. The proof will proceed as follows. First, we will prove that Tr ℓ contains a term proportional to and no terms proportional to for any > 1. Therefore, ℓ ( ) is indeed independent of . Next, we will prove that ℓ ( ) has no terms for > 2ℓ and ≤ ℓ. Throughout this proof, we interpret the delta functions in Eq. (C2) as constraints on the summations. Different permutations on the indices result in a different number of constraints and hence terms with different powers of and .
Recall that = † where = Π¯ †Π. Therefore, Tr is equal to the Frobenius norm ‖ ‖ 2 , which is the sum of the squared absolute values of the entries of . Thus, by removing the projector Π from (i.e. setting it to I), the trace cannot decrease. It follows that the presence of Π cannot increase the trace Tr ℓ . Getting rid of the Π from and using the cyclic nature of the trace, we see that Tr ℓ ≤ . Furthermore, consider Eq. (C2) with = defined by (1) = 2ℓ, (2) = 1, (3) = 2, . . . , (2ℓ) = 2ℓ − 1. Then, −1 is the identity, and so Wg( −1 , ) contributes a factor of −2ℓ . With this , the sum over the and ′ yields a factor of 2ℓ . Finally, with this chosen , the sum over will yield ′ Then summing over ′ , we get a single factor of . Hence, the term with the specific permutation described above yields a term of the form 2ℓ −2ℓ = 2ℓ .
We have shown that there is a term proportional to and that there are no terms proportional to 2 , 3 , etc. Since we are working asymptotically in , we can therefore ignore all terms proportional to 1 for every nonnegative . This proves that lim →∞ E ∈U( ) 1 Tr ℓ is independent of and only depends on , which justifies the definition of the function ℓ ( ). The only thing left to show is that ℓ ( ) has only terms ℓ+1 through 2ℓ . So we only need to show that there are no terms for > 2ℓ and ≤ ℓ. We begin with the former. To look at powers of , it is sufficient to look at the powers of . We therefore restrict our attention to the sum over and ′ in Eq. (C2). The sum over 1 , ∑︀ 1=1 , will give either a factor of 1 or a factor of depending on how the index 1 is constrained by the Kronecker delta functions, and similarly for 2 through 2ℓ . In order to get the highest power of , we require the fewest constraints on and ′ (i.e. the fewest distinct Kronecker deltas). Hence, we require to be the permutation satisfying This permutation is (1) = 2ℓ, (2) = 1, (3) = 2, . . . , (2ℓ) = 2ℓ − 1. With this , we see that a sum over and ′ will give a factor of 2ℓ . Hence, 2ℓ is the highest power of that can be achieved.
Next we need to show that ℓ + 1 is the lowest power of that can be achieved. The sum over and ′ can give at most a factor of ℓ . This is because the first line of delta functions, 1, 2 ′ 2ℓ , reduces the sum over 2ℓ indices and 2ℓ indices ′ down to just a sum over ℓ indices and ℓ indices ′ . The second line of delta functions, 1 , ′ (1) . . . 2ℓ , ′ (2ℓ) , cannot be made equivalent to the first line by any choice of ; in fact, the second line imposes all new constraints. Therefore, this line further reduces the sum over ℓ indices and ℓ indices ′ down to just a sum over ℓ indices (or ℓ indices ′ , but not both). Hence, the highest power of that we get from the summations over and ′ is ℓ . Putting this together with the fact that asymptotically Wg( , ) is at most −2ℓ , we find that any term coming from Eq. (C2) is at most −ℓ × (dependence on ).
Therefore, any powers of that are less than ℓ + 1 can be ignored; if the sum over and ′ yields a term that is for some ≤ ℓ, then that term will be constant or decreasing with . But from above, we already have terms that are proportional to resulting from Eq. (C2), and so terms that are constant or decreasing can be ignored.
As alluded to in the proof of the lemma, the asymptotic form of E Tr ℓ will be a term linear in times a function of , plus a term constant in times a function of , plus terms that decay to zero asymptotically with . Hence, E 2 ( ) = ( , ) − ( , ) + (1). In this section, we are interested only in the linear term ( , ) because we are computing E 2 ( )/ . However, in the next section, we prove Proposition 4 which provides the form of ( , ).
Actually computing (ℓ) from Eq. (C2) seems challenging. However, there is a nice workaround that uses what we know about the Rényi-2 entropy being symmetric under ↦ → 1 − , as the total state on the modes is pure.
Lemma C.2. Let (ℓ) be as in Lemma C.1. Then Note that this corresponds to the sequence A062991 on OEIS [84].
We therefore find that for all ℓ, ℓ ( ) must satisfy − ℓ ( ) = 1 − − ℓ (1 − ), or We then plug in the form of ℓ ( ) = ∑︀ 2ℓ =ℓ+1 (ℓ) . Eq. (C7) must hold for every , and therefore we can equate the coefficients in front of each term to get a system of equations that can be solved for the values of (ℓ) . We then find that Equating the degrees in , we see the following conditions; Note that one can derive equivalent conditions by requiring that the polynomial ( + 1/2) − ℓ ( + 1/2) is even in . Condition 2 and condition 3 together constitute a linear system of ℓ linearly independent equations. To verify this, one must show that det ̸ = 0 where is the ℓ × ℓ matrix with entries = (︀ ℓ+ )︀ . As noted by Benoit Cloitre in OEIS sequence A000984 [84], det = (︀ 2ℓ ℓ )︀ . For a proof, see Ref. [85]. Since we have ℓ linearly independent equations for ℓ variables , if there is a solution to the four conditions then there is a unique solution. One can then verify that a solution to the four conditions, and therefore the solution, is given by Eq. (C5).
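The determinant identity used to establish linear independence can be verified exactly for small ℓ. The sketch below assumes that the matrix in question has entries given by the binomial coefficient of ℓ + i over j for 1 ≤ i, j ≤ ℓ (the transposed indexing gives the same determinant). The function names are hypothetical, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb

def exact_det(rows):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((i for i in range(col, size) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for i in range(col + 1, size):
            factor = a[i][col] / a[col][col]
            for j in range(col, size):
                a[i][j] -= factor * a[col][j]
    return det

def binomial_matrix(ell):
    """ell x ell matrix with entries C(ell + i, j), 1 <= i, j <= ell (assumed indexing)."""
    return [[comb(ell + i, j) for j in range(1, ell + 1)]
            for i in range(1, ell + 1)]

if __name__ == "__main__":
    for ell in range(1, 7):
        print(ell, exact_det(binomial_matrix(ell)), comb(2 * ell, ell))
```

For each ℓ checked, the determinant equals the central binomial coefficient and is therefore nonzero, as required for the uniqueness argument.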
Plugging in this result, we find that where 2 1 is the hypergeometric function and ℓ := 1 ℓ+1 (︀ 2ℓ ℓ )︀ is the ℓ th Catalan number. The simplification from the first to second line follows from the definition of the hypergeometric function given in Eq. (C13). Plugging this exact formula for ℓ ( ) = lim →∞ E 1 Tr ℓ into Eq. (A32), and swapping sums, we arrive precisely at Theorem 2. We have therefore completed the proof.
To get a sense for ℓ , we list a few here. We will call ℓ ( ) the ℓ th order approximation to ( ). From Eq. (A33), lim →∞ E 1 2 ( ) = ∑︀ ∞ ℓ=1 2ℓ 2ℓ ℓ ( ). Thus, = tanh(2 ) is weighting how relevant each approximation is. For small squeezing, most of the weight is concentrated on low-order approximations. The lowest order approximation is 1 ( ) = (1 − ) resulting in a parabolic shaped Page curve. When the squeezing is very large, more and more weight is placed on high-order approximations so that the Page curve begins to resemble the triangle ∞ ( ) = ( ). We see a manifestation of this interpretation as where the latter comes from the full expression in Theorem 2. Meanwhile, the maximal Rényi-2 entropy is max 1 2 ( ) = ( ) log cosh(2 ) from Eq. (B1). As stated, near the endpoints = 0 and = 1, ℓ ( ) is a very good approximation to ( ). Thus, regardless of the squeezing strength, when the subsystem size = is small (or when its complement is small), the average entanglement is very close to maximal.
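The lowest-order approximation can be illustrated with a short Monte Carlo estimate. In the sketch below, the reference value k(k+1)/(n+1) for the Haar average of Tr M at finite n is an elementary computation from the second and fourth moments of a Haar-random column, added here for illustration and not quoted from the text. Dividing k minus this value by n gives k(n−k)/(n(n+1)), which tends to the parabola obtained from the lowest-order approximation. The function names and sample sizes are assumptions.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def mean_trace_m(n, k, samples=3000, seed=0):
    """Monte Carlo estimate of E Tr M with M = B B^dagger, B = (U U^T)[:k, :k]."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(samples):
        u = haar_unitary(n, rng)
        b = (u @ u.T)[:k, :k]
        total += np.sum(np.abs(b) ** 2)  # Tr(B B^dagger) is the squared Frobenius norm
    return total / samples

def first_order_page_curve(n, k):
    """Finite-n value of (k - E Tr M) / n, which tends to alpha (1 - alpha)."""
    return k * (n - k) / (n * (n + 1))

if __name__ == "__main__":
    n, k = 8, 3
    print(mean_trace_m(n, k), k * (k + 1) / (n + 1))
    print(first_order_page_curve(n, k), (k / n) * (1 - k / n))
```

The finite-n expression is already symmetric under exchanging k with n − k, in line with the symmetry used throughout the proofs.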

C.2 Maximum value -Proof of Corollary 3
In Appendix C.1, we derived the exact formula for the Rényi-2 Page curve as an infinite series. Here we will show that the series can be completely simplified when = 1/2. Bailey's theorem says that [63][64][65][66][67]  Plugging this into the Page curve in Theorem 2 at = 1/2 and simplifying with the duplication formula We find that the second term is , which simplifies to 1 2 log cosh(2 ) − log cosh . Subtracting the second term from the first yields log cosh as desired.
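The closed form at half system size can be compared with a direct numerical average. The sketch below assumes, following the computation above, that the asymptotic Rényi-2 entropy per mode at k = n/2 is log cosh(r). At finite n the estimate is expected to sit slightly below this value because of the constant-order Page correction, so only rough agreement should be expected. The function names, the quadrature ordering, and the parameter choices are assumptions made for illustration.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def renyi2(u, r, k):
    """(1/2) log det of the reduced covariance matrix, equal squeezing r."""
    n = u.shape[0]
    x, y = u.real, u.imag
    o = np.block([[x, -y], [y, x]])
    sigma0 = np.diag(np.concatenate([np.full(n, np.exp(2 * r)),
                                     np.full(n, np.exp(-2 * r))]))
    sigma = o @ sigma0 @ o.T
    idx = np.concatenate([np.arange(k), n + np.arange(k)])
    return 0.5 * np.linalg.slogdet(sigma[np.ix_(idx, idx)])[1]

def mean_renyi2_per_mode(n, r, samples=100, seed=0):
    """Monte Carlo estimate of E S_2 / n for the first n // 2 modes."""
    rng = np.random.default_rng(seed)
    k = n // 2
    values = [renyi2(haar_unitary(n, rng), r, k) for _ in range(samples)]
    return float(np.mean(values)) / n

if __name__ == "__main__":
    r = 0.5
    print(mean_renyi2_per_mode(24, r), np.log(np.cosh(r)))
```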

C.3 Constant term -Proof of Proposition 4
In Appendix C.1, we found that when all the initial squeezers are equal to , E 2 ( ) = ( , )− ( , )+ (1), and we explicitly computed ( , ). In this section, we determine ( , ) up to a set of constants and conjecture an explicit value of the constants.
In Appendix C.1, we found that, asymptotically in , E Tr ℓ = ℓ ( ) + ℓ ( ) + (1) The Rényi-2 entropy must be symmetric under ↦ → 1 − at every order in and , since the full state on the modes is pure. Therefore, ( , ) must be symmetric under ↦ → 1 − at every order in tanh(2 ), meaning that ℓ ( ) = ℓ (1 − ). The following lemma will therefore be useful here, and we will also find use of it in Appendix D.
Proof. Simplifying ( ) − (1 − ) = 0 with the binomial theorem, we find Equating all degrees of , we find Condition 1 is a system of ℓ linearly independent equations. To verify this, one must show that det ̸ = 0, where is the (ℓ + 1) × (ℓ + 1) matrix with entries = (︀ +ℓ−1 −1 )︀ for 1 ≤ ≤ ℓ + 1 and 1 ≤ ≤ ℓ, and entries ,ℓ+1 = ,ℓ+1 . In other words, the rightmost column is all zeros except for the entry on the diagonal. Inserting this rightmost column is equivalent to adding one new equation to the linear system, where this new equation simply fixes the value of one of the variables. Since the rightmost column is all zeros except for the (ℓ + 1, ℓ + 1) entry, we use Laplace's expansion to find that det = det ′ , where ′ is the ℓ × ℓ upper left submatrix of . One can prove that det ′ ̸ = 0 in a similar way as shown in Ref. [85]. Alternatively, one can use Corollary 11 from Ref. [87], which shows that the matrix with entries (︀ −1 )︀ has zero determinant if and only if there are indices ̸ = such that = . For the matrix ′ , = ℓ + − 1, and therefore the determinant is nonzero.
In summary, we have a system of ℓ linearly independent equations for ℓ + 1 variables. We can therefore uniquely express the solution by fixing one of the variables. Suppose we know the value of (ℓ) 2ℓ . Then one can verify that conditions 1 and 2 are satisfied by (ℓ) 2ℓ , for 0 ≤ ≤ ℓ. With the binomial theorem, this simply becomes ℓ ( ) = When verifying that the two conditions are satisfied by (ℓ) 2ℓ , one finds that the right hand side of both conditions can be simplified in terms of the hypergeometric function via Eq. (C13). For example, condition 2, written as (−1) = ∑︀ 2ℓ = (︀ )︀ ( (ℓ) / (ℓ) ), reduces to (−1) = 2 1 ( − 2ℓ, + 1; − ℓ + 1; 1). As in the proof of Lemma C.2, we find that which proves condition 2. The proof of condition 1 is similar.

= 2ℓ ∑︁
Finally, since the sum of the lengths of the cycles of a permutation ∈ 2ℓ is always 2ℓ, we find From this equation, we can exactly compute (ℓ) on a computer for small values of ℓ. Table C.1 shows the first five of these, which all match Lemma C.4 saying that (ℓ) = (−1) ℓ 4 ℓ−1 . Note that if we change the condition of ( ) = | | in Eq. (C45) to ( ) = | | + 1, then this gives us the term that is linear in and hence is the equation for To complete the proof of Lemma C.4, we need to evaluate Eq. (C45) for all ℓ ∈ N. This is done in Ref. [72] using objects called breakpoint graphs that arise in the study of gene orders in bioinformatics [71]. Roughly, has an interpretation in terms of cycles of breakpoint graphs.

D Variance of the Rényi-2 entropy -Proof of Theorem 6
In this section, we shift our attention away from E 2 ( ) and instead to Var 2 ( ) = E 2 ( ) 2 − (E 2 ( )) 2 , and we will prove Theorem 6. We are again interested in the case where all the initial squeezers are equal to . Using Eq. (A32), this becomes As a direct consequence of Lemma C.1, we find that asymptotically where ℓ,ℓ ′ is a polynomial of degrees ℓ + ℓ ′ + 2 through 2ℓ + 2ℓ ′ in , ℓ,ℓ ′ is a polynomial of degrees ℓ + ℓ ′ + 1 through 2ℓ + 2ℓ ′ in , and ℓ,ℓ ′ is a polynomial of degrees ℓ + ℓ ′ through 2ℓ + 2ℓ ′ in . Furthermore, we find an analogous result for E (Tr ℓ )(Tr ℓ ′ ). Let := 2ℓ + 2ℓ ′ . Using Eq. (A6), we find that and the asymptotic form of the Wg function is given in Eq. (A7). By a proof similar to that of Lemma C.1 but with Eq. (D4) instead of Eq. (C2), we analogously find that asymptotically where ℓ,ℓ ′ is a polynomial of degrees ℓ + ℓ ′ + 2 through 2ℓ + 2ℓ ′ in , ℓ,ℓ ′ is a polynomial of degrees ℓ + ℓ ′ + 1 through 2ℓ + 2ℓ ′ in , and ℓ,ℓ ′ is a polynomial of degrees ℓ + ℓ ′ through 2ℓ + 2ℓ ′ in . For completeness, we prove this result for the ℓ,ℓ ′ term in the following lemma. The proofs for the ℓ,ℓ ′ and ℓ,ℓ ′ terms follow from trivially tweaking the final part of the proof.
Proof. Many of the details of this proof are the same as in the proof of Lemma C.1. We will use the asymptotic form of the Wg function, which is written in Eq. (A7). The proof will proceed as follows. First, we will prove that (Tr ℓ )(Tr ℓ ′ ) contains a term proportional to 2 and no terms proportional to for any > 2. Therefore, ℓ,ℓ ′ ( ) is indeed independent of . Next, we will prove that ℓ,ℓ ′ ( ) has no terms for > and ≤ ℓ + ℓ ′ + 1. Throughout this proof, we interpret the delta functions in Eq. (D4) as constraints on the summations. Different permutations on the indices result in a different number of constraints and hence terms with different powers of and .
Recall that Π can only decrease the trace. Getting rid of the Π in and using the cyclic nature of the trace, we see that (Tr ℓ )(Tr ℓ ′ ) ≤ 2 . Furthermore, consider Eq. (D4) with = defined by (1) = 2ℓ, (2) = 1, . . . , (2ℓ) = 2ℓ − 1 and (2ℓ + 1) = , (2ℓ + 2) = 2ℓ + 1, . . . , ( ) = − 1. Then, −1 is the identity, and so Wg( −1 , ) contributes a factor of − . With this , the sum over the and ′ yields a factor of . Finally, the sum over yields Then summing over ′ , we get two factors of . Hence, the term with the specific permutation described above yields a term of the form 2 − = 2 . We have shown that there is a term proportional to 2 and that there are no terms proportional to for > 2. Since we are working asymptotically in , we can therefore ignore all terms proportional to for every < 2. This proves that lim →∞ 1 2 (Tr ℓ )(Tr ℓ ′ ) is independent of and only depends on , which justifies the definition of the function ℓ,ℓ ′ ( ). The only thing left to show is that ℓ,ℓ ′ ( ) has only terms ℓ+ℓ ′ +2 through 2ℓ+2ℓ ′ = . So we only need to show that there are no terms for > and ≤ ℓ + ℓ ′ + 1. We begin with the former.
To look at powers of , it is sufficient to look at powers of . We therefore restrict our attention to the sum over and ′ in Eq. (D4). In order to get the highest power of , we require the fewest constraints on and ′ (i.e. the fewest distinct Kronecker deltas). We therefore require to be the permutation so that This permutation is exactly the described above that gave the term proportional to 2 . Hence, is the highest power of that can be achieved.
Next we need to show that ℓ + ℓ ′ + 2 is the lowest power of that can be achieved. The sum over and ′ can give at most a factor of ℓ+ℓ ′ . This is because the first line of delta functions, 1, 2 ′ 1 , ′ 2 . . . −1 , ′ −1 , ′ , reduces the sum over the indices and the indices ′ down to just a sum over ℓ + ℓ ′ indices and ℓ + ℓ ′ indices ′ . The second line of delta functions, 1, ′ (1) . . . , ′ ( ) , cannot be made equivalent to the first line by any choice of ; in fact, the second line imposes all new constraints. Therefore, this line further reduces the sum over the ℓ + ℓ ′ indices and the ℓ + ℓ ′ indices ′ to just a sum over the ℓ + ℓ ′ indices (or the ℓ + ℓ ′ indices ′ , but not both). Hence the highest power of that we can get from the summations over the and ′ is ℓ+ℓ ′ . Putting this together with the fact that asymptotically Wg( , ) is at most − = −2ℓ−2ℓ ′ , we find that any term coming from Eq. (D4) is at most −ℓ−ℓ ′ × (dependence on ). Therefore, any powers of that are less than ℓ + ℓ ′ + 2 can be ignored; if the sum over and ′ yields a term that is for some ≤ ℓ + ℓ ′ + 1, then that term will scale linearly or less with . But from above, we already have terms that are proportional to 2 , and so terms that are linear or less in can be ignored.
Therefore, from Eq. (D2), we find that asymptotically where is a polynomial of degrees + 2 through 2 in , is a polynomial of degrees + 1 through 2 in , and is a polynomial of degrees through 2 in . The variance must be symmetric under ↦ → 1 − at every order in and . Therefore each , , and must themselves be symmetric under ↦ → 1 − . It then immediately follows from Lemma C.3 that ( ) = ( ) ( (1 − )) for some constants ( ) . In the following lemma, we show that and must be the zero polynomial. Choosing any of the equations from the first condition gives a linear system of linearly independent equations.
To verify this, one must show that det ̸ = 0, where is the × matrix with entries = (︀ + )︀ . This was shown in the proof of Lemma C.2.
Therefore, we have linearly independent equations for variables. Hence, if there is a solution, then there is one unique solution. We easily see that = 0 is a solution, and therefore it is the solution. This gives ( ) = 0, completing the proof.

Since and are zero, we have therefore found that where Cov is the covariance, and we know that ( ) is independent of and . From our expressions for Tr ℓ in terms of the Weingarten calculus, it follows that ( ) ∈ Q. Recall that has two factors of and two factors of¯. Hence, we can exactly compute (2) by integrating over fourth moments of the Haar measure on the unitary group. To do this, we use the Mathematica package RTNI that symbolically computes expressions over the Haar measure [88]. This Mathematica package precomputes the symbolic expressions for Wg( , ) for ∈ 4 . From this, (2) is a sum over powers of Tr Π = with coefficients depending on . One can then simplify this expression and take the limit → ∞ to find that (2) = 1/2. Our Mathematica code for this calculation is provided on GitHub [68].