Ergodicity probes: using time-fluctuations to measure the Hilbert space dimension

Quantum devices, such as quantum simulators, quantum annealers, and quantum computers, may be exploited to solve problems beyond what is tractable with classical computers. This may be achieved as the Hilbert space available to perform such `calculations' is far larger than that which may be classically simulated. In practice, however, quantum devices have imperfections, which may limit the accessibility to the whole Hilbert space. We thus determine that the dimension of the space of quantum states that are available to a quantum device is a meaningful measure of its functionality, though unfortunately this quantity cannot be directly experimentally determined. Here we outline an experimentally realisable approach to obtaining the required Hilbert space dimension of such a device to compute its time evolution, by exploiting the thermalization dynamics of a probe qubit. This is achieved by obtaining a fluctuation-dissipation theorem for high-temperature chaotic quantum systems, which facilitates the extraction of information on the Hilbert space dimension via measurements of the decay rate, and time-fluctuations.


Introduction
The ability to control and manipulate microscopic systems at the single particle level is an essential requirement for many quantum technologies. Experimental setups where atoms or qubits can be arranged in ordered structures and studied in quantum non-equilibrium states include neutral atoms in optical lattices [1][2][3], trapped ions Charlie Nation: C.Nation@sussex.ac.uk Diego Porras: D.Porras@iff.csic.es [4][5][6][7], Rydberg atoms [8,9], and superconducting circuits [10,11]. These systems can be used for the quantum simulation of many-body models, or different forms of digital or adiabatic quantum computing. Most of these physical setups have limitations in the accessibility to certain observables. Thus, having extra tools to characterize quantum systems in a simple and efficient way can be useful in the diagnosis and certification of quantum devices.
One of the most prominent properties of a quantum device is its size in terms of the dimension of the associated Hilbert space. The size of a quantum computer or simulator is often given in terms of number of qubits, such that the Hilbert space dimension is 2 N for N qubits. This is a measure that ignores the effect of disorder or the possible lack of connectivity between different zones in the device.
A more useful quantity would be the number of eigenstates of the Hamiltonian that take part in the quantum dynamics, which is bounded in this case by 2 N , but would exclude the degrees of freedom that do not contribute to the evolution of the initial state. Thereby establishing an effective Hilbert space dimension that more accurately describes the complexity of the system, in terms of some effective fully connected Hamiltonian. This Hilbert space dimension is, however, an elusive measure in realistic experimental situations.
In this work we show that the equilibration dynamics [12][13][14][15][16][17][18] of a quantum system can be used to extract such information on the dimension of the Hilbert space of a quantum device, in terms of the effective number of states that contribute to the dynamics of a local observable. Indeed, advancements in quantum technologies described above have inspired a bounty of theoretical work in the field of quantum thermalization [19][20][21][22][23][24][25][26][27][28][29][30]. In the following, we aim to help 'bridge the gap' between theoretical and experimental work in this field [31].
We assume a quantum quench scenario [8,21,28,32] in which a quantum system is initialized in a fully-decohered, infinite temperature state, except for a subsystem that acts as a sensor and is prepared in a pure state. For simplicity, we assume that this subsystem is a single qubit, which we refer to as the 'probe' qubit. The relaxation dynamics of the probe qubit depends on the details of the underlying structure of the Hamiltonian, however, in the most generic case of nonintegrable systems, an appropriate description can be given in terms of random matrix theory (RMT) [25,30,[33][34][35][36][37][38]. We show that the timefluctuations of the probe in the long-time limit contain information about the Hilbert space dimension of the device.
Our article is structured as follows. Firstly, in Section 2 we present the basic scheme and summarize our main result, which relies on an infinite-temperature fluctuation dissipation theorem (FDT) [39] for the dynamics of the probe qubit in order to extract information on the scaling of the effective Hilbert space of a quantum device. We continue, presenting numerical calculations that validate our predictions via exact diagonalization of a spin chain Hamiltonian in Section 3. We then outline in more detail our RMT model in Section 4. In Section 5 we derive an expression for the time dependence of generic observables in chaotic quantum systems, and discuss how this can be exploited for our measurements. We then derive the FDT, and extensions from the RMT model, and to finite temperatures, in Section 6. Finally, we summarise our findings in Section 7. Further details and proofs are included in Appendices. We note that this is arranged such that our key findings can be understood from sections 2 and 3, with the detailed calculations presented later in the text.

Proposed setup
We assume that we have a quantum system (from here on, 'quantum device') made up of a single 'probe' qubit, initialized in a pure state, which at t = 0 is coupled to the rest of the quantum device (referred to as the 'bath'). We initialize the bath in an infinite temperature state. This is sketched in Figure 1. This can be routinely achieved, for Figure 1: Illustration of our proposed setup (quantum device). A single qubit, labelled 'Probe', is coupled locally to a part of a larger non-integrable bath, initialized in an infinite temperature state ρ B ∼Î (this restriction is removed below). Experimentally, for our protocol, one needs access only to an observable of the Probe qubit. example, in a quantum computer device that has not been properly initialized. Since quantum devices always suffer from some kind of decoherence, creating an infinite temperature (or fully decohered) state is typically a simple task. We then let this qubit evolve in time and reach an equilibrium state. The initial state is thus, ρ B is the density matrix of the bath in the fully decohered state, where |φ B,α B is a set of orthonormal wave functions in the bath, which we take as the eigenstates of the bath Hamiltonian H B . N B is the Hilbert space dimension of the bath, and is the parameter on which we wish to infer information via measurements of the probe qubit. Later on the fully decohered state condition will be relaxed, allowing for finite temperatures. The system evolves under the interacting Hamiltonian, where H 0 =1 ⊗ H B is the Hamiltonian of the uncoupled probe + bath device, and we assume that the probe qubit does not evolve at all in the absence of coupling to the bath, which is given by the operator V . A prominent role in the following discussions will be played by the eigensystems of the interacting Hamiltonian, and the eigensystem of the uncoupled probe-bath system H 0 |φ α = E (0) α |φ α . (5) All throughout this work we will use the convention that indices µ, ν refer to summations over eigenstates of H, whereas α, β refer to eigenstates of H 0 . We focus on the quantum dynamics of the expectation value of a probe observable, which we take for concreteness to be σ z (t) = Tr(ρ(t)σ z ). Note that, however, the results should not depend on the particular choice for the probe observable, since H 0 has no probe energy term. The probe-bath coupling V , may introduce a dependence on the chosen observable, but the results below should hold in the general case of quantum chaotic or non-integrable systems.
The quantity under study, the time-averaged fluctuations of the probe observable, is defined by where µ σz (T ) = 1 Assuming only that the many-body eigenenergies are nondegenerate, and that also their energy gaps are non-degenerate, we may express the time fluctuations in terms of matrix elements between eigenstates of the coupled probe-bath system [25], where ρ µν = ψ µ |ρ|ψ ν , and (σ z ) µν = ψ µ |σ z |ψ ν .

Summary of main results of this work
In many practical situations the quantum device is a non-integrable, quantum chaotic system. We expect that in such cases a qualitative description can be obtained by assuming that V is a random matrix. This assumption leads to a statistical theory for the many body wave functions in Eqs. (4,5) [40], where summations are understood to be taken from 1 to 2N B , the dimension of the Hilbert space of the device. We will refer to c µ (α) as quantum chaotic or random wave functions. We assume that V belongs to the Gaussian Orthogonal Ensemble (GOE), appropriately scaled by a coupling strength g. In Ref. [40] we showed that this model can be solved and it allows us to calculate matrix elements in the interacting basis. Special care must be taken to translate the probe initial state defined in Eq. (1) in the random matrix formalism. Without loss of generality we can assume that non-interacting eigenfunctions are ordered such that odd (even) values of α correspond to the probe qubit in state | ↑ (| ↓ ). This leads to the initial condition In the following sections we prove that random matrix theory leads to a relation between the time-fluctuations and the typical decay rate of the probe qubit. This relation is the main result of this work: where dEΓ(E) −1 is the average inverse decay rate of the qubit. This average is an unbiased average over all initially populated initial state energies, which spans the entire energy range ∆E of the device due to the initial infinite temperature bath state. We will see that this unbiased average originates from the decay to equilibrium of each state |φ α contributing to the initial state (9). In Sections 5 and 6 we will see that this can be related to the decay rate obtained from a fit to the decay to equilibrium of a local probe observable in our current set up. This quantity is thus simply an average of the decay rates experienced by the probe qubit.
The quantity χ(N ), with N the total number of qubits, depends on the size of the system in the following way, D(E) is the average density of states (DOS) of the system, defined analogously to the average decay rate above (see also Eq. (61)). N B is the bath Hilbert space dimension, and finally, C is a constant of order one that does not depend on the size of the system or coupling strength.
Eq. (10) can be understood as a fluctuationdissipation relation, which relates the timefluctuations in the steady-state with the decay rate after a quantum quench. The ratio between fluctuations and average decay time allows us to quantify the dimension of the Hilbert space over which the ergodic quantum dynamics takes place. In a physical system if V couples the probe qubit to the whole spectrum of the quantum device, we expect that the function χ ∝ e cN B , where N B is the number of qubits in the part of the device that acts as a bath.
Our main result, Eq. (11), can be obtained by following these steps: (i) We consider the random matrix model with an homogeneous coupling matrix V , and calculate δ 2 O (∞) by extending the formalism developed in [40,41] to mixed states (see Section 6.1). To carry out this calculation we assume a constant DOS, D(E) = 1/ω 0 , with ω 0 the mean energy separation.
This calculation allows us to predict an exponential decay for a probe observable of the form: (12) where O DE is the diagonal ensemble of the observable O, which we find to be equal to the long-time average of O as required, and O(t) 0 refers to the free evolution of the observable O under the Hamiltonian H 0 . This result is derived in Section 5. This is analogous to the result in Reference [41] for pure-states. We note that the same Equation has similarly been obtained in Ref. [35], which allows also for the perturbation matrix V to be inhomogeneous. In the specific case of interest here, Eq. (12) becomes, as σ z (t) 0 = 1 and σ z DE = 0.
Using this random matrix model we also obtain a preliminary version of the infinite temperature fluctuation-dissipation theorem, Eq. (11), with (ii) In a realistic system both the DOS and the coupling strength will depend on the energy. This leads to energy-dependent qubit decay rates, Γ(E), and DOS, D(E). Due to the infinitetemperature initial condition, it is not possible a priori to approximate those quantities to any single value, since the quantum dynamics of the probe qubit result from contributions from all possible initial states. It is thus necessary to extend the RMT formalism to allow for variations in both the DOS and the decay rate in energy over the width of the initial mixed state.
It is also important to note that when applied to a realistic system N B should be thought of as the number of eigenstates that may contribute to the dynamics of a local observable -it is thus a measure of the effective Hilbert space dimension in the sense of the number of degrees of freedom that may be explored from a given initial state. This measure accounts for effects such as locality of interactions, and disorder, which will reduce the Hilbert space available for evolution of the system. As such, N B will in general be bounded by the total Hilbert space dimension, but is a more realistic measure of the size of the explored Hilbert space in the thermalization dynamics of the device.
In Section 5 we show that the decay rate observed from such an initial state, defined with an energy dependent coupling strength, is the thermal average decay rate over the energy width defined by the initial state.
We formulate the generalization of the FDT in Section 6.2, however a brief summary of the approach is as follows: To account for the energy variation of the DOS and decay rate we instead make the assumption that both change slowly in energy with respect to the energy width of a single eigenstate. In this case, one can reformulate the theory such that the random wave functions have a smoothly varying width. In effect this is the statement that a given random wave function contributes as if it was the eigenstate of a random matrix model with a constant DOS and decay rate, as over the width of the wave function itself these parameters do not change appreciably.
In this case, we see that the average DOS and decay rate over the initially populated initial states contributes to the FDT, as in Eq. (10). This is extended to finite temperature systems, where instead a thermal average can be seen to contribute, in Section 6.3.
Finally, we may relate the two decay rate averages, as the thermal average observed from the decay differs slightly from that appearing in Eq. (10). These can be seen to be related by a constant factor that does not depend on the coupling strength, or bath size, and thus scaling informa-tion of the Hilbert space dimension may be recovered. This is shown in Section 6.3, and Appendix E, where the FDT is recast in terms of thermal averages.
In Section 5 we show that when an exponential decay of the form Eq. (12) is observed, this indicates that Γ(E) is approximately constant over the bulk of the initially populated states. In this case we are able to extract the numerical value of the Hilbert space dimension directly, as the averages ove Γ(E) occurring in the FDT, and observed from the decay, are equal.
In summary, we observe the emergence of a classical fluctuation-dissipation theorem, relating the time-fluctuations and decay rate of our probe observable σ z . The susceptibility χ(N ) in Eq. (11) can be seen to be related to the Hilbert space dimension of the bath, N B , and thus measurements of the decay rate, Γ, and fluctuations δ 2 σz (∞), which are both obtainable from the time evolution, can be exploited to obtain information on the device Hilbert space dimension, N B .

Numerical Experiments
Before going into the technical details of our derivation, we present numerical evidence that confirms the validity of the random matrix approach and our main results. In this section, we show the application to a spin-chain system using exact diagonalization [42,43]. From this, we observe the FDT numerically using a realistic experimentally observable model.
In Fig. 2, we show the manifestation of Eq. (10) in a spin-chain system described by the Hamiltonian H = H S +H B +H SB , where H S = 0 is the system Hamiltonian (acting as our probe), H B is our bath Hamiltonian, given by which acts on sites with index > 1, which is the probe index. The probe and bath are coupled by the interaction Hamiltonian, We see in this case, then, that one does not require the ability to change the device length in order to observe our predictions experimentally. Parameters: B Here N m is the device site where the probe is coupled, which we set as 2 throughout. This spinchain model may be related to the random matrix toy model, H = H 0 + V , via the prescription H 0 ⇔ H S + H B , and V ⇔ H SB . In particular, we see that, as B , we expect that if all of the available Hilbert space is being utilized in the unitary dynamics we will observe the following scaling: This is the relation that we test in Fig. 3.
It is important to note that this exponential scaling of χ(N ), Eq. (16), is expected from not only the contribution of N B , but also from the average DOS, D(E). This average is often trivially obtained, as for example, for an ensemble of N two-level systems where ∆E is the range of energies available E max − E min (which may itself change with N ), regardless of the microscopic properties of the DOS. We thus also study the quantity χ(N )D(E), as this quantity has no dependence on the DOS, and an observation of the exponential scaling in system size is confirmation that, indeed, N B ∝ e cN . This is shown in Fig. 4, where we observe an exponential scaling of the Hilbert space dimension, with c ≈ 0.62, compared to ln(2) ∼ 0.69 if the entire Hilbert space were explored in the dynamics.
We further observe in Figs. 3 and 4, that the FDT similarly applies at finite temperatures β = 1 k B T > 0. The extension of our theoretical approach to this case is discussed below, with additional details given in Appendix E. Indeed, we can show that for high temperatures, such that β −1 Γ, we obtain an FDT of the same form as Eq. (10), by employing a high energy cut-off (ρ B ) αα ∼ e −βEα to the bath state occupation.
For finite temperatures we show below that the FDT depends on the partition function Z β itself, rather than the Hilbert space dimension. Indeed, one can see that in the infinite temperature limit Finally, we see that when the observable is found to decay exponentially to its equilibrium value, this indicates that the decay rate is approximately constant over the bulk of the initially occupied states. This is shown in Section 5. Exploiting this observation, we are able to directly obtain the Hilbert space dimension, as for an infinite temperature initial bath state the average Γ −1 is equal to the measured decay rate. The bath Hilbert space dimension N B , as calculated from Eq. (10), is plotted for varying de- for the infinite temperature case, β = 0, and thus confirms the exponential scaling of In this case, we observe an exponential scaling for only high temperatures, and confirm that for low temperatures χ(N )D(E) is independent of N . Note that here we have Γ ∼ 0.2 so the high temperature limit is defined by approximately β 5. Parameters: as Fig. 2 with J vice sizes in Fig. 5. Here we observe that N B indeed increases exponentially with systems size, yet is somewhat smaller than its maximum possible value 2 N −1 , which is expected to the locality of interactions within the chain.
We again note that in Fig. 5 the measurement of N B is a measurement of the explored Hilbert space, or the total number of eigenstates that contribute to the evolution of the initial state. Thus, for a maximally connected device this would be 2 N −1 , whereas locality of interactions in this case restricts some areas of the Hilbert space.
We note that for models where Γ(E) is not approximately constant in energy, which would be marked by a deviation from the exponential decay in Eq. (12), one would still have access to Figs. 2-4, and thus scaling information of the Hilbert space dimension, yet the numerical value N B would be obscured. This is explained in more detail in Section 5, where we see how information on Γ(E) is extracted from the decay, and in Section 6 and Appendix E, where we see how this relates to the observed FDT. The key detail is that when the FDT is expressed in terms of the same value measured from the decay it differs by a constant, thereby leaving the scaling of N B with 8π , which can be easily obtained from the observable (see Section 6). For a fully connected device the maximum possible N B is N B = 2 N −1 , the region below this limit is shaded. We observe that the measured Hilbert space dimension indeed scales exponentially with system size (dashed line is exponential fit), though is somewhat smaller than the maximum dimension of a fully connected device. Parameters as in Fig system size or decay rate accessible, yet obscuring the numerical value.
We note that in Ref. [41], the current authors obtained a FDT for pure states, which can be seen to be recovered in the low temperature limit, β −1 Γ, for which χ(N ) does not depend explicitly on the Hilbert space dimension N B . This can also be analytically seen to be the same as the low temperature limit of our treatment below, which indicates that there is a smooth transition between these two cases. This is indeed observed in the numerics of Figs 3 and 4.

RMT Approach
Our approach relies on the calculation of correlation functions from a statistical theory of random wave functions c µ (α). Here we summarize the essential ingredients to our model, and give details on the calculations in the sections below.
Our theory, developed in Ref. [40] by extending Deutch's RMT model [44][45][46], can be used to obtain arbitrary correlation functions c µ (α)c ν (α) · · · V , where · · · V denotes the ensemble average over an ensemble of random matrix perturbations, V , for a N × N Hamiltonian of the form In practice, those correlations allow us to calculate any dynamical quantity of interest within the RMT formalism.
To illustrate the use of such correlation functions we briefly consider the simple example of the diagonal observable matrix elements O µµ . These can be written as Now, using the self-averaging property of random matrices, which we prove for this model in Appendix C, we see that Note that only the random wave functions c µ (α) remain inside the ensemble average, as all other factors are in the non-interacting basis and thus do not depend on V . We thus observe that the diagonal observable matrix elements depend on the correlation function c µ (α)c µ (β) V (note that we will see how to deal with the summation and noninteracting observable elements in Section 4.3 below). More generally, when calculating more complicated quantities, we have that there is also a nontrivial contribution of a four-point correlation function of the form where Λ(µ, α) is defined as is found for a homogeneous perturbation V [40,45], selected from the GOE. We obtain that the four-point correlation function, Eq. (19), can be described in terms of product of twopoint correlators Λ(µ, α), as if the random wave function distribution was purely Gaussian, plus a correction term originating from the effective interaction due to the mutual orthogonality of random wave functions. We will see below that an approach in terms of Gaussian and non-Gaussian contractions can be formulated to describe more general correlation functions.
We note that our theory can be extended to account for different forms of the quantum chaotic wave function Λ(µ, α). This may appear, for example, for non-homogeneous V . In this case, the form of the function Λ would change, however the algebraic structure of our theory would remain.
In order to evaluate Eq. (7), we thus use Eq. (8) to write δ 2 O (∞) in terms of the random wave functions c µ (α), and non-interacting matrix elements ρ αβ and (σ 2 z ) αβ . We then use the self-averaging property of random matrices (which we discuss in A, and prove in Appendix C), and obtain the relevant correlation functions

Computing Correlation Functions
As we have seen, it is important to have a systematic approach to obtaining correlation functions for this model. This is a non-trivial task as the random wave functions of our theory are not Gaussian independent variables, but include an effective interaction due to the orthogonality condition ψ µ |ψ ν = δ µν [40]. Our approach to the statistical theory of random wave functions is summarized in Appendix A.
Below, we present such a systematic approach to obtaining arbitrary correlation functions in terms of contractions representing the Gaussian and non-Gaussian terms in the four-point correlator (which is the largest non-factorizable correlation function of our theory).
The four-point correlation function of Eq. (19) may be understood in terms of the contractions of non-interacting indices, indeed it can be seen to be the sum of a Gaussian contraction β)δ αα δ ββ and non-Gaussian contractions, given by where We reserve the double line contraction notation of Eq. (21) for the non-Gaussian case. Note that these must occur in pairs of contractions between different interacting indices µ = ν. Now, we can see from Eq. (21) that each contracted pair of indices contributes a Kroneckerδ symbol, and thus, when the correlation function is summed over its non-interacting indices, the number of summations is reduced. We see that as each Λ contributes a factor on the order O( ω 0 Γ ), and a summation on the order O( Γ ω 0 ), a reduced summation will act to render a term negligible in comparison to a term with no such reduction. Further, we see that the contribution of the non-Gaussian term Eq.
, whereas that of the Gaussian term is ∼ Λ 2 , and thus O( , and as such, one can see that for the non-Gaussian contractions to contribute, they must be acted on my an extra summation. Indeed, one can see that this occurs for one of the two non-Gaussian terms when one has repeated summations, i.e. α , β → α, β in Eq. (21).
For further details we refer the reader to Ref. [41], and the calculations in Appendices B and C. Here we have seen the key intuition, however: that repeated indices in correlation functions leads to the dominant contribution of contractions that would otherwise have contracted the pair of equal indices.

Assumptions on Observables
After obtaining the relevant correlation function, one needs to perform the summations over remaining indices. See, for example, the simple case of Eq. (18) above. To perform the summations, certain assumptions on the form of observable matrix elements in the non-interacting basis must be made. We note that in this basis, local system observables are usually of a known form.
Our theory relies on assumptions that we expect to be satisfied for such local observables.
The key assumption is related to the behaviour of matrix elements, which must have a well defined average, that does not vary pathologically with energy. A more formal definition of our assumption can be written in terms of the function Λ, as, (23) with E µ := Eµ+Eν 2 , and We will see that this assumption will be necessary in order to compute summations over the noninteracting indices. In this section we explain in more detail the requirements on the form of O αα for Eq. (23) to be valid, as well as the physical interpretation of the assumption. The essential assumption here, which we label smoothness of O αα , as in Ref. [41], is that the microcanonical average [O αα ] µ changes slowly over the width Γ of the function Λ(µ, α)Λ(ν, α). We showed in Ref. [41] that this is the case under the assumptions, which thus leads us to two reasonable conditions, 1. There are many states in the energy width Γ 2. The microcanonical average changes slowly over the width Γ.
We note that the latter condition, combined with the fact that the microcanonical average and time average are equal (which is shown below), is equivalent to the statement that the time-average of the observable is not sensitive to the particular initial state (microstate), rather, it's macroscopic energy. In fact, one can see that the conditions (25) are precisely those required in order to define a microcanonical average that does not vary pathologically with small changes in the energy window. In this sense, this assumption is the minimal assumption one would expect to require for thermalization to a microcanonical average that changes smoothly with initial state energy to occur.
We further note that in the consideration of time evolution below, we will consider more general observables that are not necessarily diagonal in the non-interacting basis, but fulfil a sparsity condition. This can be written as where for a given observable there is a non-extensive number N O of groups of non-zero matrix elements at given energy widths, such that after the course graining procedure the observable matrix elements are non-zero for energy gaps E α −E β that are the possible energy gaps of H S . This form can be seen [41] to be reasonable for local observables. We note that it is of course possible to find observables that do not fulfil this assumption, although it is easily seen to be true for e.g. local Pauli operator observables. We will see below that our treatment of time evolution could potentially also capture a wider range of observables as well, if the form in the non-interacting basis is known. In the following we refer to observables fulfilling the above assumptions as 'generic' observables.

Equilibration Dynamics
In this section we present a description of the time dependence of 'generic' observables as defined above, from an arbitrary initial condition This calculation may be performed by exploiting the methods outlined in Section 4. The general approach may be summarized in three steps: i) Writing the observable time dependence in terms of parameters in the non-interacting basis, ii) computing the relevant correlation functions (see Section 4.2), and iii) performing summations using the assumptions on observables (see Section 4.3).
Proceeding as such, we write the time dependent density operator in the form, Noting the so-called diagonal ensemble contribution is defined by, which can be seen to be equal to the long-time average value of the observable assuming no degenerate energy levels, we thus define We see that, using the self-averaging property, this depends on the four-point correlation func-tion given by Eq. (19), such that The third term can be shown to be negligible, proof of which is given in Appendix D. We note that it is this bound that requires the sparsity assumption above. We now use the smoothness assumption, exploiting Eqs. (23) and (24), we obtain, where to obtain the first term one may note that Λ(µ, α) = Λ(µ − α) = Λ(α − µ), and make the change of variables µ(ν) → µ(ν) − α(β) to perform the integrals over the new variables. Here O(t) 0 := αβ w αβ O αβ e −i(Eα−E β )t is the free evolution of the observable under the Hamiltonian H 0 . The second term may be re-expressed by definingμ = µ − α, to obtain, where E m := Noting that at t = 0 we by definition have ∆O(0) : , we see that the equality of the time and microcanonical averages is derived from our RMT approach. Thus, using the definition in Eq. (31), we obtain This is the same as that obtained in Reference [41] for pure-states. The approach outlined above is valid assuming that the decay rate Γ is constant in energy. In fact, for the system under consideration, we have to allow Γ to change with the initial state energy, such that Γ → Γ α (note that here the change in DOS does not affect the calculation). Accounting for this, rather than Eq. (34), we obtain Now, for our system we have that, as H S = 0, the microcanonical average [O αα ] µ = 0 for all µ in the bulk of the spectrum. Also, using that for our proposed experimental protocol, both the initial state and observable are diagonal in the non-interacting basis, we have We have that, as the initial state O(0) = α w α O αα = 1 for all initial device states, O αα = 1 for all non-zero w α , and thus We thus wish to obtain the value Γ that will be obtained when measuring the decay of an observable. To find this, one may simply consider the time integration of the evolution obtained above from the initial state, describing our probe-bath model, with an initial finite temperature bath state at inverse tempera-ture β. The time integration is then, where we have used in the second line that O DE = 0, and defined the thermal average · · · β at inverse temperature β, and we have defined Γ α as the decay rate of the initial state |φ α . We thus see that it is the thermal average of the inverse decay rate that is measured by a fit to the time dependence of an observable. The integral form of the thermal average of a function A(E), is given by, . We will see below that the FDT will be initially expressed in terms of a different average over the values Γ(E). This difference is resolved in Appendix E, where we re-express our FDT in terms of the thermal average above. We show that the form differs only by a constant that is independent of N and the coupling strength, and thus the scaling with Hilbert space dimension remains the same, and this difference is not important for our application.
Finally, we note that when Γ α ≈ Γ is approximately constant across the bulk of the initially populated initial states, the thermal average above is approximately equal to the unbiased average appearing in (10). We also see that in this case, from Eq. (40), one expects to observe an exponential decay at the rate Γ, as in Eq. (13). Indeed, this is what we observe in our numerical example in Section 3 above, and thus we are able to recover the Hilbert space dimension directly in Fig. 5. If a non-exponential decay is observed, then the average Γ(E) −1 β is obtainable via integration as above, and the scaling of the Hilbert space dimension is still obtainable as in Fig. 4. 6 Fluctuation-Dissipation Theorem

Derivation from RMT
Here we perform the full derivation of the FDT for the random matrix model described above. We initially focus on the case of a diagonal initial bath state ρ αβ = w α δ αβ . We then restrict the treatment to the specific protocol outlined in Section 2, where the initial state is the product of a single probe qubit in a pure state, and a bath in an infinite temperature state, see Eq. (9). We will follow a very similar steps as those outlined in the previous section, however we will see here that the correlation function calculation is somewhat more complicated.
The RMT model here is limited to the case of constant decay rate and DOS, we will thus extend the treatment to more realistic cases in the next section.
We are interested in the calculation of the longtime fluctuations, defined by the diagonal ensemble result, We begin be writing the initial density operator matrix elements as, then using Eqs. (44), and that |ψ µ = α c µ (α)|φ α , we may write the time fluctuations as, where coefficients of the initial state are labelled as unprimed indices α, β, and coefficients of the observable are labelled by primed indices. Using the self-averaging property of random matrices (see Appendices A and C), we may replace the product of coefficients c µ (α)c ν (α) · · · by their ensemble average c µ (α)c ν (α) · · · V ; the above expression may then be written in terms of a sum over 8-point correlation functions, weighted by the initial state and observable coefficients w α and O αα : Now, using the method of contractions outlined in Section 4.2, we see that this 8-point correlation function may be split up into to four-point correlation functions, each consisting of both Gaussian and non-Gaussian contractions. These are computed explicitly in Appendix B, in which we see that there are three dominating contributions to the fluctuations, given by, and, These three terms can be seen as the contributions to the 8-point correlation function arising due to products of Gaussian, non-Gaussian, and mixed Gaussian and non-Gaussian 4-point correlation functions respectively. In the above, in order to perform the summations over noninteracting indices in Eq. (47) we define course grained averages of observable elements O αα as in Eq. (23), as well as the mixed averages, We thus define W µ by with, We now take our bath to be in an initial infinite temperature state, such that . As such, W µ = W is in fact energy independent, as the probe Hamiltonian H S = 0, so microcanonical averages of probe observables are also energy independent. Now, as [w 2 α ] = 2[w α ] 2 , all terms in W are ∝ [w 2 α ], we define, where O ↑ = ↑ |O| ↑ , and we have used that We see that W O is a constant of the order of unity that depends only on the observable (e.g. W O = 3/2 for O = σ z ). Finally, taking the thermodynamic limit, such that (not that the diagonal terms in the summation can be seen to be negligible, as they contribute to a higher order in ω 0 Γ ), we have which may be evaluated using, where in the last line we have used that ∆E Γ, such that arccot 2Γ ∆E ≈ π 2 . We then obtain, where we have used that ∆E := 2N B ω 0 . We can see, then, that Eq. (57) is of the form of our main result, Eq. (3) of the main text, where C = W O 4π . What follows is to generalize this relation, allowing the DOS and Γ to vary in energy, and for finite temperatures.

Extension to Realistic Systems
The key issue with directly applying the RMT results to realistic models is that in general the DOS, and decay rate, are energy dependent, and thus change over the width of the initial state distribution (this is especially important for the high/infinite temperatures considered here). In order to account for this, we must then go back to the evaluation of the integrals over energy, in Eq. (55), and substitute Γ → Γ(E), and ω −1 0 → D(E). This is justified under the assumption that neither Γ(E), nor D(E), vary appreciably over the width Γ. i.e. Γ(E)−Γ(E+Γ(E)) Γ(E)

1.
We see then, that the integral in Eq. (55) is now where we have used that, . This can be seen to be valid as long as the above conditions on D(E) and Γ(E) hold, that is, as long as they change sufficiently slowly over the energy width Γ. The contribution of each eigenstate Λ(µ, α) to the fluctuations at a given energy is then that of a local (in energy) random matrix model, with a constant DOS and decay rate. D(E) and Γ(E) can then be allowed to change over an energy much wider than a the width of Λ(µ, α), as over such energy widths the contributions of relevant eigenstates are independent. Now, we further define ω = E µ − E ν , and make the change of variables E µ , E ν → ω, E, and thus obtain where in the second line we have assumed that D(E) and Γ(E) is approximately constant over the width Γ. Now, we define the unbiased average of a function A(E) as, (not to be confused with the average [· · · ] µ above) and see that, noting ∆E = 2N B

D(E)
, where C = W O 4π depends only on the choice of observable. We note that for the random matrix model, as the DOS and Γ(E) are both constant in energy, the average Γ(E) −1 is equal to the thermal average Γ(E) −1 β=0 obtained from a fit to the decay of an observable (see Section 5 above). In the case above, however, where the DOS and Γ(E) change in energy, the unbiased average decay rate is not necessarily the same as that obtained from a fit to the decay. We may fix this problem directly, as we do in the last section, where we see that the unbiased thermal averages may be replaced by regular thermal averages weighted by the DOS at the expense of a constant that depends on the functional form of D(E) and Γ(E) (but importantly, not on N , or the coupling strength). We can also see, that if Γ(E) is approximately constant over the width of the DOS, which is often the case in such systems (in fact from Eq. (40) this can be seen to be the case if an exponential decay of the observable is observed), then the biased and unbiased thermal averages of Γ(E) −1 are approximately equal for β → 0, and Eq. (62) may be directly experimentally confirmed as in Fig. 5.

Finite Temperature FDT
In this section we extend the above approach to finite temperature initial bath states, where the initial state is described by where the joint probe-bath Hamiltonian eigenbasis is built by ordering product states such that (α even). In this case, we have where Z β = α e −βEα δ α,odd , when the bath is initially a finite temperature state at inverse temperature β = 1 k B T , and the probe qubit is initially in state | ↑ . We thus obtain for the microcanonical averages of w α , assuming that β −1 Γ, and such that [w 2 α ] µ = 2[w 2 α ] µ . Now, our most general form for the long-time fluctuations (which assumes only the ability to define the required microcanonical averages that vary smoothly over a width Γ) is where E µ := Eµ+Eν 2 , and W µ is written in Eq. (53). Indeed, noting that the mixed average and that as each term in W µ is ∝ [w α ] 2 µ (see Eq. (53)), we may define This may be evaluated, including a variable DOS D(E), as in the main text for the infinite temperature case, via where in the second line we have made the change and ω = E µ − E ν , and used that ∆E Γ as in the main text. We now define the unbiased thermal average of the function A(E) as, Noting, then, that lim β→0 Z β = N B , and , we recover the infinite temperature case as required, We note that, unlike in the RMT case above, the average Γ(E) −1 β is not equal to the thermal average Γ(E) −1 β , which is that obtained from a fit to the decay. In Appendix E we show how the FDT may be defined in terms of this thermal average. Importantly for our proposed application, we obtain that the FDT in this form is related simply by a constant C β , defined in Appendix E, that does not depend on the the size of the device or coupling strength (within the weak coupling regime). For infinite temperatures we thus have Therefore, we can directly relate the measured inverse decay rate Γ(E) −1 0 to the time-averaged fluctuations, and from measurement of each for changing device size or coupling strength, as shown in the numerical experiments of Section 3, yields information on the scaling of the Hilbert space dimension. We finally note that it is simply the lack of direct knowledge of the constant C β which prevents measurements where there is a non-negligible change in the decay rate with energy (a non-exponential decay to equilibrium) from constituting a direct measurement of the value of the Hilbert space dimension. This constant depends on the functional form D(E) and Γ(E), and thus, if these are known, inference of the Hilbert space dimension itself is thus obtainable.
Finally, we note that the finite temperature approach above can be extended to the low temperature regime, as shown in Appendix E, from which we can recover the pure state FDT found in Ref. [41] in the low temperature limit.

Discussion
The results shown above demonstrate how the chaotic dynamics of thermalization may be exploited in order to gain information on the complexity of the unitary quantum dynamics of a system. We have proposed an experimentally viable protocol, by which measurements of a local observable of a probe qubit may be exploited to measure the Hilbert space dimension of an ergodic quantum device, initialized in an infinite temperature state. We note that this measures the dimension of the states directly involved in dynamics only, and thus provides a more accurate measure of the complexity of the dynamics than a simple estimate of the Hilbert space dimension from the number of qubits. In this sense, such a measurement of a large enough quantum device, if shown to be ergodic in the sense outlined above, would be a convincing indicator of the so called 'quantum supremacy' of the quantum device.
On a practical level, for a generic nonintegrable Hamiltonian, our results may be observed in two ways: measurement of a probe observable for (i) changing the number of qubits/ions/... in the quantum device (as in Figs.  3 and 4) , or (ii) changing the probe-bath coupling (as in Fig. 2). The latter is perhaps the simplest experimental methodology, which we show can confirm the ergodic behaviour of a system, that is, that the unitary dynamics requires an extensive proportion of the Hilbert space, by showing a linear relationship between the long-time fluctuations and decay rate. For a model where the device size may be altered, our FDT provides even deeper insight, allowing also for the experimental observation of the scaling of the Hilbert space dimension with system size.
For cases where an exponential decay to equilibrium is observed, which we show implies that the decay rate is constant over a large range of energies, our method allows the experimenter to access the numerical value of the Hilbert space dimension itself, not simply its scaling with size or coupling strength. This can be obtained from a single time trace of the decay to equilibrium of the observable, from the measurement of the decay rate, and fluctuations around equilibrium.

A.1 Random Matrix Model
The random matrix model under study may be expressed by a non-interacting part where E α = αω 0 , and ω 0 = 1/N is the spacing between energy levels, and perturbation term, modelled by a random matrix, where h αβ are independent random numbers selected from the Gaussian Orthogonal Ensemble (GOE), and rescaled by a coupling strength g, such that the matrix h has the probability distribution, giving h αβ = 0, and h 2 αβ = g 2 /N for α = β, and otherwise h 2 αα = 2g 2 /N . In Ref. [40] the current authors developed a consistent theoretical model of random wave functions |ψ µ = α c µ (α)|φ α , for the random matrix model above. We make the ansatz on the probability distribution on the c µ (α)s, where the δ-function term explicitly accounts for the orthogonality of the many-body eigenstates (which we showed to be necessary in order to obtain a consistent form for the off-diagonal matrix elements). This distribution Λ(µ, α) can then be shown to be a Lorentzian of width Γ = πg 2 N ω 0 [40,45]. From Eq. (78), one can calculate arbitrary correlation functions of the c µ (α) coefficient by first defining the generating function, where in the second line we have re-expressed the δ-functions in their Fourier form. The superscript (od) indicates that this is the 'off-diagonal' generating function, requiring µ = ν. The diagonal case is discussed below. The correlation functions may then be calculated by performing successive derivatives with respect to the force terms ξ via In particular, the correlation function c µ (α 0 )c ν (β 0 )c µ (α)c ν (β) V was found in [40] for µ = ν to be equal to with where the superscript (n) is left out for n = 1. The latter two terms in Eq. (81) arise as an explicit result of the orthogonality factor in Eq. (78). We stress here that the generating function Eq. (79) explicitly requires µ = ν, as it models the interactions due to mutual orthogonality of two random wave functions. For the diagonal part, we have the much simpler generating function, Thus, we have, from successive derivatives with respect to the force term ξ µ,α with different noninteracting indices, and taking the limit as ξ µ,α → 0, as in Eq. (80), for the diagonal case. Here g = 0.04, so Γ ∼ 0.007, and thus the high temperature limit β Γ −1 is approximately β 125. We thus observe that the finite temperature result Eqs. (62) and (115) (which we note are equivalent, the latter is used here), is fulfilled for all temperatures. We note that the low temperature limit above uses ρ B ∼ e −β(E−E0) , with E 0 = Emax 2 , to ensure that the initial state is not simply the ground state. For this limit we also use W O = [∆O 2 ], as discussed in the final section. Simulations are performed with a single realization of the random matrix V , and thus we observe directly the self-averaging property.

A.2 Self-Averaging
A final property we exploit is that of the self-averaging of random matrices, specifically, in summations we make the assumption that the observable time dependence and fluctuations are equal to the ensemble average over perturbations V , such that O(t) ≈ O(t) V . This has been previously checked for this model numerically in Refs. [40,41], and can also be seen to be valid numerically in this case from Figs. 6 and 7, which compare our analytical calculations to single realizations of the random matrix Hamiltonian. Self-averaging is a common tool in RMT [47], and whilst examples of behaviour that violates a self-averaging assumption indeed exist [48,49], these are generally due to additional constraints preventing such statistical behaviour, for global observables such as the survival probability, or in more exotic regimes e.g. close to critical points [50,51].
A recent and timely study [48] has looked in detail at the self-averaging behaviour of multiple quantities for both a full GOE, and a realistic spin chain, finding that in general one should be wary of simply applying self-averaging. It is shown, for example, that the survival probability is not selfaveraging at any timescale. Indeed, the current authors have recently shown that this quantity cannot be expected to behave ergodically, and thus RMT should not apply to the survival probability [49]. In the light of such studies, as well as such counterexamples to self-averaging outlined above, it is important to further justify the use of this property in more detail.
The definition for self-averaging used in [48] is that the vanishing of the quantity as the system size increases. Indeed, a very similar quantity was bounded for this RMT model in Ref. [35]. In Appendix C, we will show that self-averaging in the sense of the vanishing of Eq. (85) indeed occurs for our model using the assumptions on observables outlined above. We further note that as the realistic Hamiltonians of interest are not necessarily disordered, in the sense that they themselves may have no random component, we do not require that they be self-averaging themselves (indeed this has no real meaning in this case). Rather, the assumption in our case is that when expressed in that non-interacting basis, a chaotic interaction Hamiltonian is suitably 'random', such that it resembles a single realization of the ensemble we describe above, which itself is self-averaging. In fact, it is not important that the entire interaction Hamiltonian resembles an element of this ensemble, we require instead that locally, for any energy E within the bulk, the interaction Hamiltonian resembles an element of the random matrix ensemble within a width ∼ Γ from E.

A.3 RMT Numerics
Here we confirm our analytical results with numerical calculations with the random matrix Hamiltonian. In particular, we show in Fig. 6a, that the infinite temperature fluctuation-dissipation theorem (FDT), is satisfied in this model. This is shown for two 'observables' of the RMT model, O odd and O sym , which are chosen to be diagonal in the non-interacting basis, with diagonal elements given by, for O odd , and for O sym . These observables are chosen, as in Refs. [40,41], as they resemble realistic observables, such as local Pauli operators, in the sense that they are well defined, sparse, and highly degenerate [29] in the non-interacting basis. For our RMT numerical calculations, we define the initial state as such that O(0) = 1. The energy shift E 0 is simply to avoid edge effects at lower temperatures, where a large fraction of the initial state population would otherwise be in the ground state. In Fig. 7 we plot the time dependence of the above observables for an infinite temperature initial state, and compare these to the observable time dependence of Eq. (37), derived below.
The high temperature limit, in which our FDT is derived, is defined by β −1 Γ. In the numerics, we use g = 0.05, and thus Γ ∼ 0.007, so the high temperature limit requires β 125. We show plots for β = 100 and β = 500 in Figs. 6b and 6c, respectively. For the parameters used these correspond to a high temperature, near the edge of the expected limit, and a low temperature initial state. We observe that the finite temperature form in fact works well for all β values. This is further discussed analytically in the final section below, where we see that the low temperature limit of our current approach is equal to the pure state result previously obtained in Ref. [41].

B Gaussian, Non-Gaussian, and Mixed Contractions
Here we calculate the contributions to the long-time fluctuations of Eq. (47). We see that the relevant correlation function is an 8-point correlation function, which we will show below may be split into three groups: products of Gaussian contractions, products of non-Gaussian contractions, and mixed products of two Gaussian and one non-Gaussian contraction.
An example of the first form, Gaussian contractions only, is Indeed, there are three such contractions, occurring each time between pairs of indices, that only contribute two kronecker-δ factors -∼ δ αβ δ α β , δ αα δ ββ , δ αβ δ βα . Other Gaussian contractions may be defined, but may be ignored due to a reduction in the number of summations. In a similar manner, we may define non-Gaussian contractions that do not reduce the number of summations at all, such that they contribute on the same order as the Gaussian contractions above. An example is, (91) It can be easily seen that there are three non-Gaussian contractions of this form, with the other two being defined by swapping pairs of primed and unprimed indices in turn. Finally, we see that mixed contractions may also contribute, for example, We can see that there are six terms of this form that contribute only one δ factor. These are δ αβ , δ αα , δ αβ , δ βα , δ ββ , δ α β . We thus obtain that for the contribution from Gaussian contractions, δ 2 G (∞), Similarly, for the non-Gaussian (δ 2 N G (∞)), and mixed (δ 2 M (∞)) contractions we have and respectively. In order to perform the summations over non-interacting indices in Eq. (47) we define course grained averages of observable elements O αα as in Eq. (23). The key assumption in writing (23) is that the average, changes slowly in energy E µ with respect to the width Γ. Similar averages over the initial state, or mixed averages must also be defined, such as Note that, [O αα ] µ can be interpreted as a microcanonical average of the observable O. Now, using Eqs. (23) and (97), we obtain, C Proof of Self-Averaging In addition to the numerical demonstration in Figs. 6 and 7, we are now able to analytically demonstrate that self-averaging, in the sense of Eq. (85), occurs for this model. We have already obtained O(t) V in Section 5 (note that here we assumed self-averaging, and thus simply used the notation O(t) , in the current section we require the self-averaging step to be explicit, so use the angle brackets · · · V whenever self-averaging is performed), and thus, what remains is a calculation of O(t) 2 V . We can do this by writing, from which, we see that the relevant correlation function after self-averaging is c µ (α)c ν (β)c µ (α )c ν (α + n)c µ (α 1 )c ν (β 1 )c µ (α 1 )c ν (α 1 + n 1 ) V . Indeed, the contribution from Gaussian contractions can easily be seen to be identical to that of ∆O(t) 2 V , as these correlators only contract identical interacting indices µ, ν. Furthermore, we see that when non-Gaussian contractions are defined between primed and unprimed interacting indices, µ, ν, there is a reduction in the number of summations. For example, we compare the contribution µνµ ν αβα α 1 β 1 α 1 to that with mixed interacting indices, and observe that the latter contribution is negligible, due to the extra reduced summation, as there are no repeated non-interacting indices within the non-Gaussian contraction. We note that the dominating contribution from non-Gaussian contractions is inevitably from the n, n 1 = 0 term shown above (as this reduced summation does not change the order of the contribution as N O N ). Thus we have a single dominating contribution given by the 4-point correlation functions defined with primed, or unprimed, interaction indices only: Therefore, we can see that the transient component ∆O(t) of the time evolution is indeed self-averaging. Now, we can similarly see that the long-time average is self-averaging via, The ensemble average of this is then, where one can easily see that the terms with n = 0 require an additional contraction, and thus the contribution is on a lower order, with fewer summations over the non-interacting indices. We then see that, which, after self-averaging becomes, where we see once more that the n, n = 0 terms dominate, and that contractions between the primed and unprimed indices induce additional kronecker-delta factors, and thus are negligible. Thus, we have that R = 0 (see Eq. (85)), and we analytically observe self-averaging in this model. We note here that in the derivation of the 4-point correlator, we assume that the only 'interactions' between the eigenstates themselves are those enforced due to the mutual orthogonality of eigenstates via the condition α c µ (α)c ν (α) = δ µν [40]. As these interactions occur pairwise between eigenstates, correlation functions with more than two interacting indices µ, ν contribute an additional δ µν .

D Bound of dynamical term
In this section we bound third term in Eq. (33), showing that it's contribution is negligible. To do so, we proceed by defining, We may now use the relation | i a i | ≤ i |a i |, to write where we have used that, µν µ =ν Λ(µ, α)Λ(µ, β)Λ(ν, α)Λ(ν, β) Now, we see that in the case of w αβ ∼ δ αβ , the bound simply becomes similarly, this is the case for our application studied in the main text, where O αβ ∼ δ αβ . We note that the condition of diagonal w αβ correspond to a reasonable initial state in many experimental situations, since thermalization takes place typically by incoherent exchange of energy in the basis of eigenstates of H 0 . For states with coherences, to bound this quantity we require that the coherences are not large on the off-diagonals defined by α, α + n, in the sense that α w α,α+n O(1), so In this sense, 'special' initial states may be chosen that do not satisfy this bound, but they are highly atypical. Further, we note that the stronger of the two assumptions made on the form of observables, that is that we may write αβ O αβ = α N O n O α,α+n δ β,α+n , is only used above in bounding |A(t)|, the other terms are more general. As noted above, whilst this form is reasonable for generic local variables [41], one may of course be able to build observables that do not match this form, in which case, one may either have an additional contribution due to A(t), or be able to arrive at a similar bound.

E FDT in terms of Thermal Averages
In this section, we show that the FDT may be recast in terms of the thermal averages · · · β , obtained from the a fit to the decay, shown in Sec 5. We begin by re-expressing our finite temperature FDT in terms of the thermal averages, with Z β := ∆E 0 dED(E)e −β(E−E 0 ) , where we have introduced the low-energy cut-off E 0 , which is the energy of our zero temperature pure state. The addition of this cut-off ensures that, for non-zero E 0 , the initial state is not the ground state of the bath at zero temperature.
To include this thermal average, we return to Eq. (70), and note that, We thus obtain, We note that this can be put in the same form as the infinite temperature case, δ 2 O (∞) ∼ Γ −1 , by noting that the quantity depends only on the particular forms of the functions D(E) and Γ(E), and the temperature -importantly, not on N , or on the system-bath coupling strengths (for weak couplings), as both the numerator and denominator share the same dependence in N and the system-bath coupling in the thermodynamic limit. As such, we can write and thus we recover the form of our main result, Eq. (3) of the main text, with χ(N ) = C β For the random matrix case, where D(E) −1 = ω 0 , and Γ(E) = Γ is constant, we also have, as in this case, the thermal average · · · β and unbiased thermal average · · · β , can be seen to be equal.
We can check that, as required, one obtains the infinite temperature limit derived above by sending β → 0 by noting that for the infinite temperature case, we have which, in terms of the infinite temperature thermal average, may be written (noting that lim β→0 Z β = 2N B ), We finally note here that it is the measurement of a thermal average of the decay rate, and not the unbiased average, that obscures the direct measurement of N B , such that in general a FDT of the form of Eq. (120) would be observed experimentally. We thus limit our claim in cases where the energy dependence of the decay rate is important (which, as can be seen from Eq. (40), is implied by a non-exponential decay to equilibrium, due to multiple contributing decay rates) to the experimental verification of the finite size scaling of the Hilbert space dimension, rather than the particular value of N B , without additional assumptions or the numerical calculation of C β .

E.1 Low Temperature FDT
We now turn to the low temperature limit of Eq. (67) for which we expect to obtain the same result as the pure state case of Ref. [41], given by, where D(E α 0 ) is the DOS at the initial state energy E α 0 , which is chosen to be in the bulk of the spectrum, and [∆O so and thus, which is the zero temperature limit, Eq. (121) when W 0 = [∆O 2 αα ]. Which can be seen to be the case for zero temperature as follows. Recalling that W 0 is defined by where, We see that in the zero temperature limit w α ∼ δ αα 0 , and thus, the averages in Eq. (97) contribute to a lower order as the number of summations is reduced for terms with, e.g. w α w β , over terms with, so, as required. We recall that until Eq. (67), no assumptions on the initial state or observable are made other than the ability to define the required microcanonical averages. As such, taking the low temperature limit at this point, as we have done above, does not contradict any assumptions made.