On the distribution of the mean energy in the unitary orbit of quantum states

Given a closed quantum system, the states that can be reached with a cyclic process are those with the same spectrum as the initial state. Here we prove that, under a very general assumption on the Hamiltonian, the distribution of the mean extractable work is very close to a gaussian with respect to the Haar measure. We derive bounds for both the moments of the distribution of the mean energy of the state and for its characteristic function, showing that the discrepancy with the normal distribution is increasingly suppressed for large dimensions of the system Hilbert space.


Introduction
Haar-uniform random unitary matrices are a resource required for various quantum algorithm [1][2][3]. As an example, the randomised benchmark protocol is a method to test the error rate of a quantum circuit, requiring it to perform a sequence of random operations [4]. Versions of the randomised benchmark are used by the companies IBM [5] and Microsoft [6] to test the functionality of their experimental quantum computing hardware. Other applications of random unitaries include quantum cryptography [7] and the simulation of many body physics [8,9].
In this paper we charachterise the the distribution of the energetic cost of implementing a random unitary transformation. To be more specific, we study the distribution of the expected value of the energy gained from the cycle which sends the quantum state ρ to U ρU † , where U is a matrix extracted randomly with respect to the Haar measure of the unitrary group. Our main finding is that, when the quantum system has many degrees of freedom (that is when the Hilbert space has dimension d 1), then -provided a weak condition on the system Hamiltonian-the distribution of the energy cost of a random unitary matrix is approximately Gaussian.
Implementing a random unitary is not an easy task. Actually, only a small subset of quantum operations (called the Clifford gates [10]) are easy to implement in an actual circuit -the complexity of a quantum circuit is often measured with the number of non-Clifford gates it requires [11]. To overcome these difficulties, it has been theorised the possibility of circuits that simulate a random unitary up to a certain moment of the distribution [12]. A circuit which is able to emulate a uniform distribution up to the t-th moment is called an unitary t-design. If we want to realise a t-design with t > 3, it is still necessary to use non-Clifford gates [13,14], but only a small amount of them [15].

Introduction to the problem
Let A be a d-dimensional quantum system initialized in the stateρ and forced to evolve in time by external modulations of its HamiltonianĤ. Following [24,25], the average amount of work one can extract from the process can be computed as is the mean energy ofρ, andÛ is the element of the unitary group U(d) associated to the applied driving. The allowed values of W U (ρ;Ĥ) are limited by the inequalities where E(ρ;Ĥ) and A(ρ;Ĥ) are respectively the ergotropy and anti-ergotropy functionals of the model. The first one was introduced in Ref. [26] and corresponds to maximum work one can extract from the system by optimizing W U (ρ;Ĥ) with respect to the choices of the control unitaryÛ for fixedρ andĤ. A closed formula for the ergotropy is provided by the expression obtained by settingÛ equal to the optimal unitaryÛ (↓) for which the final state of the cycleÛρÛ † corresponds to the passive counterpartρ (↓) ofρ [24,25], i.e.
where nowρ (↑) is the anti-passive state ofρ, obtained by the unitaryÛ (↑) that reverses the order in which the populations ofρ (↓) are listed, i.e.
with λ (↑) Saturating the upper bound (3) is an important optimization task which can be practically difficult to implement, as it implicitly requires an exact knowledge of the full spectral decomposition of the system Hamiltonian. In this perspective, it is interesting to understand how close one can get from the boundary values (3) by randomly selectingÛ for a model in which bothρ andĤ are assigned. Clearly, as the dimensionality of the system increases, we do not expect such a naive approach to be particularly effective. Still, providing an exact characterization of the associated efficiency is a well-posed statistical question which can be of some help in identifying which physical systems are best suited as successful candidates for implementing quantum battery models [27][28][29][30][31]. In order to tackle this issue, in the present paper we study the probability distribution P (E|ρ;Ĥ) of the mean output energy which originates by random samplingÛ on the unitary group U(d) via its natural measure (the Haar measure dµ(Û )) [32]. A numerical example of this distribution, with d = 7 is showed in Fig. 1. From Eq. (1) it is clear that knowing P (E|ρ;Ĥ) we can then reconstruct the probability distribution P work (W |ρ;Ĥ) of the average extracted work W := W U (ρ;Ĥ) via a simple shift of the argument, i.e.
Our main finding is that, under mild assumptions on the system Hamiltonian, when the dimension of the Hilbert space d is sufficiently large, the central moments of the probability P (E|ρ;Ĥ) can be approximated by those of a gaussian distribution P (µ,Σ (2) ) G (E) having mean value µ and variance Σ (2) equal to those of P (E|ρ;Ĥ), i.e.
where f (E) denotes the mean value of the function f (E) with respect to P (E|ρ;Ĥ) [32] and with G p the scaling factor In order to settle the approximation (10) into firm quantitative ground we prove that for d sufficiently larger than p the discrepancies between the l.h.s and the r.h.s of such equation can be bounded as where fĤ (d, p) is a positive function which in the large d limit scales as O(1/d + η ∆Ĥ ) for p even, and O( √ η ∆Ĥ ) for p odd, with η ∆Ĥ being a functional (see Eq. (74) below) that for typical choices of the system Hamiltonian is very much depressed. Using the theory of generating functions, we also show a stronger result which, aside from bounding the error on each individual moment of the distribution P (E|ρ;Ĥ), directly links its characteristic function with the one of the gaussian function P (µ,Σ (2) ) G (E). The rest of the manuscript is organised as follows. Section 3 introduces the concepts and some known results we will use in the proof. In particular, Sec. 3.2 sets up the problem of calculating the moments of the distribution, which can be expressed as the Haar integral (20). The evaluation of this integral can be set up using the theory of Weingarten calculus, that we introduce in section 3.4. The practical computation of the integrals involved in the proof will require other combinatorical concepts, which will be introduced at the points where they are needed. In Sec. 4 we derive some useful inequalities and discuss the assumptions on the system that are needed to enforce the Gaussian approximation. Section 5 presents a proof of Eq. (13) that applies for the special case in which the input state of the systemρ is pure, and in Sec. 6 we address instead the case of arbitrary input states. Finally, in Sec. 7 we bound instead the distance between the characteristic function e −itE of the distribution P (E|ρ;Ĥ), and the characteristic function of the Gaussian distribution P (µ,Σ (2) ) G (E). Conclusion and outlook are finally given in Sec. 8.

Preliminary considerations
This section is dedicated to clarify some useful mathematical properties of the model.

Basic properties of P (E|ρ;Ĥ)
From Eqs. (2) and (3) it follows that for fixedρ andĤ the range of the random variable E is limited by the inequalities In the special cases where either the input state is completely mixed (i.e.ρ =1/d), or the Hamiltonian is proportional to the identity (i.e.Ĥ = E 01 ), the upper and lower bounds of (14) coincide forcing E to assume the constant value Tr[Ĥ]/d, i.e. imposing the distribution P (E|ρ;Ĥ) to become a Dirac delta From Eq. (8) it is also clear that for arbitrary choices ofρ any rigid shift of the Hamiltonian spectrum results in a translation of the distribution P (E|ρ;Ĥ), i.e.
More generally noticing that for all choices of ∆ 0 , E 0 ∈ R one has we can draw the following formal identity where, generalizing the definition of P (E|ρ;Ĥ), givenÂ andB generic operators, we use the symbol P (E |Â;B) to represent the distribution of the variable E := Tr[ÛÂÛ †B ] induced by the Haar measure dµ(Û ). We observe next that any two input statesρ andρ which have the same spectrum will have the same energy probability distribution, i.e. P (E|ρ ;Ĥ) = P (E|ρ;Ĥ). Indeed under this condition we can always expressρ asVρV † withV ∈ U(d), so that for all functions f (E) one has where we used the fact thatÛ =ÛV ∈ U(d), and the invariance property dµ(Û ) = dµ(ÛV ) of the Haar measure. Similarly due to the cyclicity of the trace appearing in Eq. (2), we can conclude that P (E|ρ;Ĥ) is also invariant under unitary rotations of the system Hamiltonian, leading to the identity which ultimately implies that P (E|ρ;Ĥ) can only depend upon the spectra {λ

Central moments
In evaluating the moments of the distribution P (E;ρ,Ĥ) we have to consider the quantities = j 1 ,···jp k 1 ,··· ,kp where ρ jk , H i , U ij are matrix elements ofρ,Ĥ,Û with respect to fixed basis of the Hilbert space of the system [33]. The integral appearing on the last term of (20) is widely known and admit solution [34][35][36] in terms of the Weingarten functions C [σ] for the unitary group [34,37], i.e.
with S p representing the permutation group of p elements. A precise definition of the C [σ] s and their properties will be given in Sec. 3.4. Here we simply notice that replacing (21) into (20) leads to the expression where forΘ generic operator and σ ∈ S p , we introduced the functional To get a more intuitive understanding of what is this product, we need to introduce an important property of a permutation σ ∈ S p : the structure of its cycles, which is sufficient to specify its conjugacy class [σ]. If the permutation σ has a fixed point, i.e. if there exists an i ∈ {1, 2, · · · , p} such that σ(i) = i, then we say that the permutation σ has a cycle of length 1. A cycle of length 2 (i.e. a transposition) means that there exist two indices i, j ∈ {1, 2, · · · , p} such that σ(i) = j and σ(j) = i. In general given σ ∈ S p its conjugacy class [σ] is uniquely identified via the correspondence [σ] , . . . , α with c[σ] being the number of independent cycles admitted by σ, and with the positive integers α (j) [σ] 's representing instead the lengths of such cycles organized in decreasing order, i.e. α (j) . Because each element in {1, 2, · · · , p} belongs to one and only one of the cycles of σ, we must have that implying that the α (j) [σ] s provide a proper partition of the integer p (represented graphically with a Young diagram [38]). The set of indices {1, 2, · · · , p} can thus be partitioned in c [σ] subsets c (i) , which each c (i) having α (j) [σ] elements. We finally observe that two permutations σ and σ will belong to the same conjugacy class if and only if we can identify a third permutation that allows us to relate them via conjugation, i.e.
The fact that each element i ∈ {1, 2, · · · , p} belongs to exactly one cycle of σ allows us to rewrite Eq. (23) as , where in the last identity we used the fact that the the r-th summation runs over a set of α (r) [σ] cyclical indices. Equation (27) makes it clear that, as explicitly indicated by the notation, the terms Θ[σ] (as well as the coefficients C [σ] , see Eq. (44) below) depend upon σ only via its conjugacy class [σ]. Replaced into Eq. (22), the identity (27) also implies that E p can be expressed as linear combination of products of traces of powers ofρ and H, i.e. explicitly Equation (28) is the starting point of our analysis: we notice incidentally that it confirms by virtue of Specht's theorem [39], that P (E|ρ;Ĥ) (and hence its moments) depends upon ρ andĤ only through their spectra. In particular for the cases p = 1, 2 we get which using the explicit values of the functions C [σ] reported in Tab. 1 leads to

Shifting the spectrum ofĤ
Equation (29) makes it clear that, irrespectively from the specific form of the input state of the systemρ, we can enforce the distribution P (E;ρ,Ĥ) to have zero mean value by setting Tr[Ĥ] = 0 via a rigid shift of the associated spectrum. In particular setting E 0 = µ in Eq. (16) we can write while, setting E 0 = µ and ∆ 0 = 1 in Eq. (18) we get where are zero-trace operators. Following the derivation of the previous section we can then use Eqs. (33) and (34) to write the central moments of P (E|ρ;Ĥ) in the following equivalent forms where ∆ρ[σ] and ∆H[τ ] are now the functional (27) associated with the operators (35), and where in Eq. (37) we changed the summation variable via the introduction of the inverse τ −1 of the permutation τ . Both Eqs. (36) and (37) offer us a huge simplification in the analysis of the problem, with the first having an important application in the special case of pure input states. In particular we notice that since ∆Ĥ is a traceless operators, in both these expressions we can restrict the summation on τ by only including those permutations which have no fixed points (i.e. α (j) [τ ] ≥ 2 for all j). Such elements define the derangement subset S D p of S p and by construction can have at most p/2 cycles, or equivalently where is the minimal number of transpositions (i.e. cycles of length 2) τ is a product of (indeed, if there are more than p/2 cycles in the permutation τ , there must be at least one element i of the set {1, 2, · · · , p} for which τ (i) = i). In the case of Eq. (37) a similar simplification can also be enforced for the summation over σ, as again here one deals with a traceless operator ∆ρ. To summarise the following selection rules hold: so that we can rewrite Eqs. (36) and (37) as respectively. p = 1 :

Combinatorial coefficients
The Weingarten functions C [σ] introduced in Eq. (21) play a fundamental role in the analysis of the moments (42). These terms are in general difficult to compute but a formal expression for them is provided by the formula [41,42]  is reported in Tab. 1. As the number of irreducible representations of S p increases with p, the exact computation of these factor becomes rapidly a computationally infeasible task. For the purposes of the present work, however we do not need to compute the coefficients, indeed we just need to use the following known properties. First of all, since C [σ] is a functional of the conjugacy class, exploiting Eq. (26) we can write for all τ, σ ∈ S p . Second, the overall sign of C [σ] is the sign of the permutation σ, namely where |σ| is the minimal number of transpositions σ is a product of, see Eq. (39). Finally we shall use the fact that for large enough d the following asymptotic behaviour holds true where the integer number C [σ] -sometimes called the Möbius function of the permutation σ -is equal to the product with the α (j) [σ] 's defined by the correspondence (24), and with being the n-th Catalan number. In particular for the identical permutation this implies The scaling (46) can be derived e.g. from Ref. [44] where Collins and Matsumoto proved that when which via some simple algebraic manipulation can be casted in a weaker, but sometimes more useful form:

Bounding the trace terms
Here we establish some useful relations that allow us to bound the terms ∆H[τ ] and ∆ρ[σ] entering in Eqs. (42) and (42) and which will allow us to identify the necessary conditions onĤ that are needed to prove the Gaussian approximation (10). For this purpose we shall relay on the properties of the derangement set S D p and on the inequality [45] |Tr where for q > 0 is the q-th Shatten norm of the operatorΘ.

A useful inequality
To begin with let consider the special subset S D * p of S D p formed by those derangements τ which can be decomposed into p/2 cycles of length 2. If p is even there are (p − 1)!! of such elements, i.e.
On the contrary if p is odd there are no such permutations (at least one cycle must be of odd length): in this case we identify S D * p with the empty set By definition the elements of S D * p verify the condition α (j) [τ ] = 2 for all j and thus, by explicit computation, they fulfil the identity for all operatorsΘ. Now let τ ∈ S D p /S D * p a derangement which cannot be decomposed into p/2 cycles of length 2: for such permutations we can prove that Θ p 2 provides an upper bound for the associated value |Θ[τ ]|, i.e. where is a functional ofΘ which due to (53) fulfils the inequality [46] Equation (58) is a direct consequence of the following more general observation: Given τ an element of the derangement set S D p we have with ηΘ the parameter defined in Eq. (59).
However since τ / ∈ S D * p then some those terms (say the first K ≥ 1) must be strictly larger than 2, i.e.
Invoking (53) we can hence claim and Observe also that since K can be expressed as the number of cycles c[τ ] minus the number c 2 [τ ] of cycles of length 2, and since the latter is always smaller than or equal to the minimal number |τ | of transpositions τ can be decomposed of, we can bound this quantity as follows where in the last identity we used (39). From the above expressions we can hence establish that for all τ ∈ S D p /S D * p we have where we employ the definition (59) and use the normalization condition Eq. (25) to write From Eqs. (63) and (66) we have also Therefore we can claim which replaced into Eq. (67) finally yields Eq. (61).
The derivation of (58) from (61) finally proceed by observing that for τ ∈ S D p /S D * p the exponent of ηΘ appearing in (61) is always greater or equal to 1 when p even, and greater or equal to 1/2 for p odd, i.e.

Condition upon the Hamiltonian
SettingΘ = ∆Ĥ in Eqs. (57) and (61) As we shall see the fundamental ingredient to prove Eq. (10) is to strengthen the inequality (73) to make sure that the contributions to Σ (p) associated with the derangements τ that do not belong to S D * p are very much depressed with respect to ∆Ĥ p 2 , i.e. to impose the constraint or equivalently to have It should be clear that due to the fact that since η ∆Ĥ is always greater than or equal to 1/d 6 , Eq. (76) can only be fulfilled when operating with large Hilbert space, i.e.
Notice however that once the requirement d 1 is met, the regime (76) is easy to achieve, as it can only fail in the special case of Hamiltonians that have few eigenvalues much greater than all the others.

Central moments asymptotic expression for pure input states
In this section we present a proof of our main result that applies in the special scenario where the input state of the system is pure, i.e. ρ = |ψ ψ| .
This constraint allows for some simplifications that help in clarifying the derivation. Indeed thanks to the assumption (78) we have Tr[ρ n ] = 1 for every n, which leads to the useful identity Therefore invoking Eq. (42) we can express the central moment Σ (p) of a pure input state, as the product of a numerical factor that only depends on the geometrical properties of derangements, times a contribution that fully captures the dependence upon the system Hamiltonian, i.e.
This quantity should be compared with the corresponding Gaussian expression (11) with the scaling factor obtained from (12) using the fact that for pure states Eq. (32) gives us Under the above premise in the next section we show that Eq. (13) applies with with η ∆Ĥ the coefficient defined in Eq. (74). It is worth stressing that in this case, the validity of Eq. (13) is not subjected to any constraint on p and d, but of course fĤ (d, p) can become small only if d is much smaller than p.

The geometric coefficient Eq (80)
A closed expression for the first factor on the r.h.s. of Eq. (80) is provided by the formula with being a numerical coefficient that for p ≥ 2 fulfils the inequalities An explicit derivation of Eq. (84) can be obtained by focusing on the trivial case where the system Hamiltonian is proportional to the identity operator, i.e.
In this case in agreement with Eq. (15) the terms (80) vanish leading to the following identities where we used the fact that now µ = E 0 . Invoking hence Eqs. (22) and (79) we can then write Our next step is to compute τ ∈Sp H[τ ]. To begin with we write the summation over τ by grouping together those permutations which are characterized by the same value of |τ | = k, decomposing S p into the subsets which replaced into (91) gives where we invoked the useful property [47] Equation (84) finally follows by substituting (93) into (89) and performing some trivial simplifications.
We conclude by noticing that the bounds (85) on D(d, p) can be established by observing that it can be expressed as which immediately reveals that it is always smaller than or equal to 1 for all p ≥ 2. The lower bound instead follows by using the inequalities (1 + x) α > 1 + αx and then and noticing that in case of p even we have (95) For p odd we have instead where the first inequality is obtained by invoking (95).

The Hamiltonian coefficient Eq (80)
In studying the Hamiltonian contribution to Eq. (80) for p even we split the summation in two parts writing where we used the identities (72) and (55) to compute the first contribution. The second contribution instead can be bounded by invoking Eq. (73) to write is the number of permutation in S D p /S D * p which we bounded by exploiting the fact that the number of total elements of the derangement set S D p is [48] For p odd instead S D * p is the empty set and we get

Derivation of Eq. (83)
Exploiting the results of the previous sections we can conclude that for p even the following inequalities apply which again corresponds to set fĤ (d, p) of Eq. (13) as in Eq. (83).

Asymptotic expression for the central moments for arbitrary (non necessarily pure) input states
Here we present the general proof that, for arbitrary (non necessarily pure) states, under the assumption (106) the centred moments of the distribution P (E|ρ;Ĥ) are well approximated by the Gaussian relations (11) whose scaling factors (12) can be conveniently expressed as upon rewriting Eq. (32) in the compact form In particular we shall show that under the condition the inequalities (13) hold true with the function fĤ (d, p) defined as for p even and η ∆Ĥ defined as in Eq. (74), and , (108) for p odd. Notice that weaker, but possibly more friendly, expressions can also be obtained by using the following (generous) upper bound for the Catalan numbers [49] and observing that under the condition p 2 < d, provided that 1 − 6d −1/4 > 0 (i.e., d > 6 4 = 1296) we can also write which does not depend on p.

Case p even
In Sec. 4.2 we noticed that if p is even the special set S D * p is not empty and it is expected to provide the most relevant contributions in the summation over τ that defines Σ (p) . Accordingly we start splitting the summation (42) in two parts, i.e.
where the first collects all the terms τ associated with the elements τ ∈ S D * p , i.e.
which we simplified invoking Eq. (57), and where the second contribution includes instead all the remaining derangements τ , i.e.
Observe next that since all the permutations τ entering in the sum (112) belong to the same conjugacy class S D * p , we have where in the first identity we invoked (26) to write τ = σ 1 τ σ −1 1 where τ is a fixed element of S D * p and σ 1 ∈ S p , where in the forth line we introduced the permutation σ = σ 1 σσ −1 1 which inherits from σ the property of being a derangement, and where finally in the third and fifth identity we use the fact C [σ] and ∆ρ[σ] are functional of the conjugacy classes. From the above identity we can conclude that σ∈S D p C [στ −1 ] ρ[σ] is independent from the specific choice of τ ∈ S D * p , i.e.
Equation (112) can then be simplified as follows Now we single out from the above summation the term σ = τ from the rest identifying the contributions where [1 p ] is the conjugacy class of the identical permutation. Accordingly Eq. (116) becomes which replaced into Eq. (111) results into the following decomposition of the p-th centred moment of the distribution P (E;ρ,Ĥ),

Asymptotic behaviour of the leading term
Here we analyze the asymptotic behaviour of Σ (p) * of Eq. (117) which, as will shall see is the leading term of the centred moment Σ (p) . Indeed given that τ is by definition an element of S D * p from Eq. (57) we get while from Eqs. (49) and (45) that in the large d limit scales exactly as G p of Eq. (104) and hence as Σ and hence

Asymptotic behaviour of Σ (p) (discarded σ's)
Here we evaluate the asymptotic behaviour of Σ (p) * of Eq. (118). Let's start observing that where in the second inequality we used the results of Sec. 4.1 to claim that while in the third inequality we extended the summation over σ outside the derangement set, while still keeping out the terms which makes στ −1 the identity permutation Id. Next step is to provide an estimation of σ∈Sp/{Id} C [σ] . For this purpose let us decompose S p in terms of the subsets S p (k) introduced in Eq. (90) and notice that From this it hence follows that where in the first inequality we adopted Eq. (51) which holds under the condition (50); in the second we used [50] max σ∈Sp(k) and compute the total number of elements in S p (k) via the String number of the fist kind Observe that, for p constant, in the large d limit the function multiplying G p on the r.h.s. tends to zero as O(1/d). Indeed for noticing that and using the inequality where, using Eq. (135) and assuming (132), the last coefficient can be further bounded from above as that shows that in the large d limit this term converges to 1. To summarize we can hence conclude that with which again is valid under the condition (106). with exactly matching the expression given in Eq. (13).

Case p odd
The proof of Eq. (13) in the case of p odd is easier to treat as now the set S D * p is empty, i.e. a condition which we can summarize by saying that Σ (p) * = 0. Accordingly we can treat the whole Σ (p) as we treated the component ∆Σ (p) of the even p case. In particular the entire derivation of Sec. 6.1.3 can be repeated with the minor modification that according to (58) the coefficient ηĤ entering into (142) gets replaced by √ ηĤ . Therefore in this case we have with ∆f (d, p) as in Eq. (143), which corresponds to the expression given in Eq. (108).

Moment generating function
In this section we shall estimate the distance between the generating function of the moments of the distribution P (E;ρ,Ĥ), i.e.
and the one associated with the Gaussian distribution P (µ,Σ (2) ) G where Σ

(p)
G are the Gaussian moments defined in Eq. (10). In particular we shall show that, for values of |t| such that |t| ≤ min where and N := min then the following inequality holds: The error (152) is also valid for the characteristic function of the distribution, namely e −itE [51].

Initial considerations
We shall estimate the maximum distance between the moment generating functions with the series All the estimations we made in Sec. 6 are valid only under the assumption (106). At some point, the momenta Σ (p) will become much more different from Σ (p) G , because the distribution P (E|ρ;Ĥ) has a compact support, while the gaussian distribution P (µ,Σ (2) ) G (E) has infinite tails which become more and more important for the momenta of higher order. For this reason, it is convenient to consider separately the terms of the sum on the r.h.s. of Eq. (153) with p not fulfilling (106) from the rest writing where, given the cutoff (151), we set and, invoking the decomposition (111), with Σ (p) * and ∆Σ (p) ) defined as in Eqs. (112) and (113) -notice that Eq. (156) was simplified by using the fact that for p odd both Σ (p) * and Σ

Bound on Γ >
When p is large enough the condition (106) brakes and we can not use anymore the bounds of Sec. 6. However since the spectrum of the random variable E is limited as in Eq. (14), for all p we can write with ∆E p max as defined in (150). Therefore we can write for p odd, and for p even. Exploiting these inequalities, when (149) holds we can estimate the quantity (155) as where and with Σ (p) * and Σ (p) * as defined in (117) and (118).
First we provide a bound for the term Γ * < . Retrieving from (124) the bound on G , and using the expression (105) for the variance Σ (2) , we have in the second inequality we used the fact that the condition Regarding Γ * < , recalling equation (136), and expliciting the definitions (137) and (104) of the quantities f * (d, p) and G p we have that from which, using the bound (109) on Catalan numbers we can write where in the second inequality we let n = p/2 and used the fact that, since p ≤ N ≤ and in the third inequality we used √ 2n ≤ n.

Bound on ∆Γ <
From the definitions (157) and (113), we can write the term ∆Γ < as We do not need the explicit values of d p,k . They are implicitly given by the coefficients in the expansion of the exponential generating function [52] In wiev of (175), we want to keep in the sum only the derangements with less than p/2 cycles (τ ∈ [2 p/2 ]), whose generating functions is Using (177) into (175), with the identifications x = 4t ηĤ Σ (2) and c = η −1 H , we can conclude that

Conclusions
We derived a bound for the distance between the energy distribution in the unitary orbit of a quantum state and the normal distribution with the same variance, showing that for large dimensions of the Hilbert space the difference in the charachteristic function is suppressed as d −3 . We have also characterised the individual moments of the distribution: these result apply also to unitary t-design, which mimic the uniform distribution of unitary transformations up to the t-th moments. Our findings, therefore, are suitable to be be employed for certifying that an unitary t-design behaves as expected. Since the results do not depend on the specific form of the Hamiltonian (but only on a very general hypotesis on its eigenvalues), they can be applied to every observable that can be measured on the system. The validity of our results is limited to the specific (but relevant in the field of quantum computing) case in which the evolution of the system is described by a random unitary matrices drawn from the Haar-uniform ensemble (also known as Circular Unitary Ensemble [53,54]). In our opinion, it would be interesting to perform a similar analysis on the work distribution in the case in which the time evolution of the system is given by an Hamiltonian drawn from the Wigner-Dyson Gaussian Unitary Ensemble [55,56], which describes quantum chaotic systems lacking time-reversal symmetry [57,58], or from the Gaussian Orthogonal or the Gaussian Symplectic Ensembles, which model the evolution quantum chaotic systems in presence time-reversal symmetry [58]. The combinatorical tools necessary for this analogous task have been in part already developed [59], but significant more work may be required.
Another possible future developement of this work is generalizing it to Hilbert space of infinite dimension. An immediate difficulty is that the Haar measure on the unitary group of infinite dimension is not well-defined [60]. However, a class of states of particular interest are the gaussian states on systems charachterised by a N -modes quadratic bosonic Hamiltonian [61]: in this setting, it could be of some interest to study the distribution of their energy over the group of symplectic transformations [62].