Learning t-doped stabilizer states

In this paper, we present a learning algorithm aimed at learning states obtained from computational basis states by Clifford circuits doped with a finite number $t$ of $T$-gates. The algorithm learns an exact tomographic description of $t$-doped stabilizer states in terms of Pauli observables. This is possible because such states are countable and form a discrete set. To tackle the problem, we introduce a novel algebraic framework for $t$-doped stabilizer states, which extends beyond $T$-gates and includes doping with any kind of local non-Clifford gate. The algorithm requires resources of complexity $\text{poly}(n,2^t)$ and exhibits an exponentially small probability of failure.


Introduction
The task of learning an unknown quantum state usually refers to the ability to build a classical representation of the state that is "close" enough, with respect to a specified measure of distance, to the unknown state. However, it is well known that learning an arbitrary many-body quantum state requires an exponential number of experiments, due to the exponential growth of the Hilbert space with the number of particles.
To be more precise, consider the Hilbert space of $n$ qubits and denote by $d \equiv 2^n$ its dimension. We can define the group of Pauli operators as the collection of all possible products of Pauli matrices applied to each qubit. Pauli tomography [1][2][3][4][5][6] is a widely employed technique for learning a given state $\psi$. Specifically, Pauli tomography entails learning the expectation values $\mathrm{tr}(P\psi)$ for every Pauli string $P$. To achieve a classical reconstruction of the state $\psi$, a total of $d^2$ Pauli operators must be measured, so tomography is generally a resource-intensive task. The primary challenge in achieving efficiency in quantum state tomography arises from our limited knowledge of the state of interest. Nevertheless, there exist different classes of states for which tomography can be performed with remarkable efficiency, e.g. Matrix Product States [2], quantum phase states [7], non-interacting fermionic states [8] and stabilizer states [9]. The latter are obtained from the computational basis state $|0\rangle^{\otimes n}$ by the action of the Clifford group, the subgroup of the unitary group that maps Pauli operators to Pauli operators [9]. Each stabilizer state $|\sigma\rangle$ corresponds to a stabilizer group $G_\sigma$, a subgroup of the Pauli group comprising $d$ mutually commuting Pauli operators $P$ satisfying the condition $P|\sigma\rangle = \pm|\sigma\rangle$, i.e. elements of the stabilizer group $G_\sigma$ stabilize the stabilizer state $|\sigma\rangle$. Thus, to find the tomographic description of $\sigma$, one needs to learn the stabilizer group $G_\sigma$ and the relative phases. Every stabilizer group $G_\sigma$ admits a (non-unique) set of generators $g_\sigma$, i.e. $G_\sigma = \langle g_\sigma\rangle$, with cardinality $|g_\sigma| = n$. Thus, it is sufficient to find a set of $n$ generators and phases to completely characterize $\sigma$. Montanaro [10] showed an algorithm that learns the tomographic decomposition of any stabilizer state with $O(n)$ queries to $|\sigma\rangle$, thus gaining an exponential speed-up for stabilizer states over general tomographic methods [11].
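As an illustration of the defining property of a stabilizer group (its elements are mutually commuting Pauli operators), the following minimal Python sketch checks commutation of Pauli strings. The string encoding over the letters I, X, Y, Z and the function names are assumptions made for this example, and phases are ignored.

```python
from itertools import combinations


def commutes(p: str, q: str) -> bool:
    """Return True if the Pauli strings p and q commute.

    Two Pauli strings commute iff the number of positions where both
    letters are non-identity and different from each other is even.
    """
    if len(p) != len(q):
        raise ValueError("Pauli strings must act on the same number of qubits")
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def is_commuting_set(paulis) -> bool:
    """Check that every pair in a candidate generating set commutes."""
    return all(commutes(p, q) for p, q in combinations(paulis, 2))


# Generators of the stabilizer group of the 2-qubit Bell state (|00> + |11>)/sqrt(2).
bell_generators = ["XX", "ZZ"]
```

Here `is_commuting_set(bell_generators)` holds, whereas a pair such as `"XI"` and `"ZI"` fails the test and therefore cannot belong to the same stabilizer group.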
However, stabilizer states are not universal for quantum computation and cannot provide quantum speed-up of any kind [12]. To make the set of stabilizer states universal for quantum computation, one must use unitaries lying outside the Clifford group [12]. One common choice is to use the single-qubit $T$-gate, defined as $T = \mathrm{diag}(1, e^{i\pi/4})$.

States obtained from $|0\rangle^{\otimes n}$ by the action of Clifford unitaries plus $t$ $T$-gates are usually referred to as $t$-doped stabilizer states [13,14]. Whether Montanaro's procedure could be extended to states beyond the stabilizer formalism, in particular to $t$-doped stabilizer states, has been an open research question since then. There has been a noteworthy attempt in [15], in which the authors found an extension of the stabilizer formalism for states obtained by acting on a stabilizer state with a row of $t$ $T$-gates followed by a Clifford circuit, i.e. T-depth 1 circuits. Such states can be learned with $O(3^t n)$ queries and time $O(n^3 + 3^t n)$. However, their protocol is limited to those states achievable through a specific architecture of Clifford+T circuits.
In this paper, we finally show that it is possible to extend learning algorithms to general states obtained by Clifford+T unitaries. We propose an algebraic framework for $t$-doped stabilizer states by employing concepts from stabilizer entropy [16]. Building upon this structure, we devise two algorithms that aim at learning an unknown $t$-doped stabilizer state exactly. The first one involves sampling from a distribution $\Xi$ obtained by squaring the expectation values of Pauli operators, also known as the characteristic distribution, which can be achieved by Bell sampling on both the state and its conjugate in the computational basis [10]. The second approach relaxes the requirement of accessing samples from the characteristic distribution. Instead, it relies on sampling from a probability distribution $\tilde{\Xi}$ obtained by Bell measurements on two copies of the state at a time, without the need to access its conjugate. A key feature of both algorithms is that, being restricted to the discrete set of states obtained by the action of Clifford+T circuits, they learn an exact tomographic description in terms of Pauli observables.
To enhance clarity, we summarize the results on the sample and computational complexity of both methods below. Since both algorithms have access to samples from a probability distribution on Pauli operators, $\Xi$ and $\tilde{\Xi}$ respectively, in what follows we separate the sample complexity of sampling from such distributions from the sample complexity of the additional Pauli measurements, which we refer to as measurement shots.

Throughout the manuscript, we refer to the coset representatives $h_1, \dots, h_k$ as bad generators of the $t$-doped stabilizer state $\psi_t$. Note that the number $m$ of generators $g_i$ and the number $k$ of bad generators $h_i$ must obey $2^m(k+1) = |S_{\psi_t}|$. Thanks to the structure of $S_{\psi_t}$ in Eq. (2), a $t$-doped stabilizer state can be expressed in the Pauli basis as

$$\psi_t = \frac{1}{d}\sum_{i=0}^{k} \mathrm{tr}(h_i\psi_t)\, h_i \prod_{j=1}^{m}\left(\mathbb{1} + \phi_{g_j} g_j\right), \qquad (3)$$

where $h_0 \equiv \mathbb{1}$. Notice that the operator $\prod_{j=1}^{m}(\mathbb{1} + \phi_{g_j} g_j) = 2^m \Pi_{\psi_t}$ is proportional to the projector $\Pi_{\psi_t}$ onto the stabilizer group $G_{\psi_t}$, which implies that bad generators commute with every $P \in G_{\psi_t}$, as required by hermiticity and positivity of $\psi_t$. Through the decomposition in Eq. (3), we can easily derive a corollary of the results in [18,19] that will be useful later on. Let $D \in C_n$ be a Clifford operator that obeys

$$D\,\Pi_{\psi_t} D^\dagger = |0\rangle\langle 0|^{\otimes m} \otimes \mathbb{1}_{[n-m]},$$

where $\mathbb{1}_{[n-m]}$ denotes the identity on the last $n-m$ qubits. Then, since the $h_i$ are coset representatives and commute with $\Pi_{\psi_t}$, the following decomposition holds:

$$D\,\psi_t D^\dagger = |0\rangle\langle 0|^{\otimes m} \otimes |\phi\rangle\langle\phi|,$$

where $|\phi\rangle$ is a state defined on the last $(n-m)$ qubits. The density matrix associated to $|\phi\rangle$ can be written as $|\phi\rangle\langle\phi| = 2^{-(n-m)}\sum_{i=0}^{k}\mathrm{tr}(h_i\psi_t)\,\tilde{h}_i$, where $\tilde{h}_i \equiv D h_i D^\dagger$. Moreover, the following relation, useful for what follows, holds: $G_{D\psi_t D^\dagger} = D G_{\psi_t} D^\dagger$. To be consistent with the terminology used in Refs. [18,19], we refer to the Clifford operator $D$ as the diagonalizer.
As another consequence of Eq. (3), a direct computation of the stabilizer entropy $M_\alpha(\psi_t)$ returns $M_\alpha(\psi_t) = E_\alpha(\chi_{\psi_t}) - \nu(\psi_t)$, where $\nu(\psi_t) \equiv n - m$ is the stabilizer nullity introduced in [20], while $E_\alpha(\chi_{\psi_t})$ is the $\alpha$-Rényi entropy of the probability distribution $\chi_{\psi_t}$ with support on the cosets $h_i G_{\psi_t}$ and components $\chi_{\psi_t}(h_i) = \mathrm{tr}^2(h_i\psi_t)/2^{n-m}$.

While all the results presented so far are general, let us conclude the section by directing our attention toward $t$-doped stabilizer states resulting from the application of Clifford+T circuits. For $T$-gates, the previously derived bounds can be obtained by substituting $2l$ with $1$. Indeed, $l = 1$ for single-qubit gates, and the additional $1/2$ factor arises when considering diagonal gates, see [17]. As these states form a discrete set, it is expected that the expectation values of Pauli operators take discrete values. Hence, there should exist a finite resolution, strictly dependent on $t$, enabling one to exactly distinguish these values by employing a sufficient number of samples. This intuition is made rigorous in the following discussion. Consider a $t$-doped stabilizer state obtained by the action of $t$ $T$-gates. Then, for every $P, Q \in P_n$, the lower bound of Eq. (5) on $|\mathrm{tr}(P\psi_t) - \mathrm{tr}(Q\psi_t)|$ holds, provided that $\mathrm{tr}(P\psi_t) \neq \mathrm{tr}(Q\psi_t)$. See Appendix C.4 for the proof. Therefore, for $t = O(\log n)$, the expectation values of Pauli operators either coincide or exhibit a polynomial separation. This fact implies that, by employing a sufficient number of samples to measure Pauli expectation values (polynomial in $n$ for $t = O(\log n)$), one can learn $t$-doped stabilizer states exactly. Let us remark again that, contrary to all other statements presented in the section, Eq. (5) does not hold for the broader scenario of doping with $O(1)$-qubit non-Clifford gates; rather, it applies specifically to the case of $T$-gates.
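To make the role of the finite resolution concrete, the sketch below estimates how many single-shot $\pm 1$ Pauli measurements are enough to pin an expectation value down to within half of a given resolution $\delta$, using Hoeffding's inequality. This is an illustrative bound under stated assumptions, not the constant used in the paper: the function name, the choice of Hoeffding's inequality, and the target failure probability are all assumptions of this example.

```python
import math


def shots_for_resolution(delta: float, failure_prob: float) -> int:
    """Number of +/-1 measurement shots so that the empirical mean deviates
    from the true expectation value by less than delta/2, except with
    probability at most failure_prob.

    Hoeffding's inequality for outcomes in [-1, 1] gives
        P(|mean - mu| >= eps) <= 2 * exp(-N * eps**2 / 2),
    which is solved here for N with eps = delta / 2.
    """
    if not 0 < delta <= 2:
        raise ValueError("delta must lie in (0, 2]")
    if not 0 < failure_prob < 1:
        raise ValueError("failure_prob must lie in (0, 1)")
    eps = delta / 2
    return math.ceil(2 * math.log(2 / failure_prob) / eps**2)
```

The shot count grows as $1/\delta^2$, so a resolution that shrinks exponentially in $t$ translates into a number of shots that grows exponentially in $t$, and remains polynomial in $n$ when $t = O(\log n)$.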

Learning algorithm from $\Xi_{\psi_t}$ samples
In the previous section, we explicitly revealed the algebraic structure of a generic $t$-doped stabilizer state. Let us now provide an algorithm able to learn a $t$-doped stabilizer state. While the majority of our results can be generalized to states doped by $l$-local non-Clifford gates (with $l = O(1)$), we focus our discussion on the $T$-gate case in order to employ the finite resolution result in Eq. (5). In particular, a state $\psi_t$ doped with a number $t$ of $T$-gates is characterized by the $m$ generators $g_i$ with their phases $\phi_{g_i}$, and by the $k$ bad generators $h_i$ with their expectation values $\mathrm{tr}(h_i\psi_t)$. Therefore, learning a classical description of a $t$-doped stabilizer state requires the knowledge of $2m + 2k \le 2n + 2 \times 4^t$ objects, and for $t = O(\log n)$ we have $2m + 2k = O(\mathrm{poly}(n))$.
Let us assume we are able to sample from the distribution $\Xi_{\psi_t}$. Later on, in Section 4, we discuss in which situations efficient sampling from $\Xi_{\psi_t}$ is possible by querying the $t$-doped state $\psi_t$. We first aim to learn a basis for the stabilizer group $G_{\psi_t}$, i.e. $m$ generators $g_i$. Sampling from $\Xi_{\psi_t}$, we get a Pauli operator $P \in G_{\psi_t}$ with probability $\sum_{P \in G_{\psi_t}} \Xi_{\psi_t}(P) = 2^m/d \ge 2^{-t}$, and $n+m$ random elements of $G_{\psi_t}$ identify a set of generators with failure probability at most $2^{-n}$. Indeed, determining a basis for $G_{\psi_t}$ is equivalent to determining a basis for the $m$-dimensional subspace $T$ of $\mathbb{F}_2^{2n}$; the probability that $n+m$ random samples from $T$ lie in a given $(m-1)$-dimensional subspace of $T$ is $2^{-m-n}$ and, by the union bound, the probability that these samples lie in one of the $2^m$ $(m-1)$-dimensional subspaces of $T$ is at most $2^{-n}$. We conclude that $O(2^t(n+m))$ samples from the probability distribution $\Xi_{\psi_t}$, plus $O(2^t(n+m)M)$ additional measurement shots, are sufficient to learn all the generators $g_1, \dots, g_m$ and phases $\phi_{g_i}$ with exponentially small failure probability.

At this point, let us learn the $k$ bad generators $h_1, \dots, h_k$. Suppose the algorithm has already found $l$ out of $k$ bad generators, say $h_1, \dots, h_l$. To find the $(l+1)$-th bad generator, we keep sampling from $\Xi_{\psi_t}$. Let $\pi_l$ be the probability that the sampled Pauli operator $P$ does not belong to $G_{\psi_t} \cup h_1 G_{\psi_t} \cup \dots \cup h_l G_{\psi_t}$. To find a useful lower bound to $\pi_l$, we exploit the unit purity of the state $\psi_t$. In Appendix C, we obtain such a bound in terms of $h_{\min} := \min_{h_i} |\mathrm{tr}(h_i\psi_t)|$. However, a learner needs an efficient way to check whether a sampled Pauli operator belongs to $G_{\psi_t} \cup h_1 G_{\psi_t} \cup \dots \cup h_l G_{\psi_t}$. Let us provide a simple algorithm for this check. If $P$ belongs to this set, then for some $i \in \{0, \dots, l\}$ there exists $h_i$ such that $h_i P \in G_{\psi_t}$. Thus the problem reduces to checking whether there exists an $h_i \in \{h_0, \dots, h_l\}$ such that $h_i P \in G_{\psi_t}$. The latter can be solved by adding $h_i P$ to a generating set of $G_{\psi_t}$ and performing Gaussian elimination over $\mathbb{F}_2^{2n}$. The task requires $O(n^3)$ steps and has to be repeated $l$ times, meaning that the full check requires $O(n^3 l)$ computational steps, and so to learn all the $k$ bad generators one needs $O(n^3 k^2)$ steps. In Appendix B, we provide a slightly more efficient algorithm relying instead on the notion of diagonalizer [18,19], where the number of computational steps to verify the containment of one bad generator is $O(n^2 k)$, thus reducing the computational steps to learn all the bad generators to $O(n^2 k^2)$. The total number of samples scales as $O(2^t h_{\min}^{-2}\log(k+1))$.

Measuring $|\psi_t \otimes \psi_t^*\rangle$ in the Bell basis $|P\rangle$ returns samples from $\Xi_{\psi_t}$. Therefore a single query to $|\psi_t\rangle$ and its conjugate $|\psi_t^*\rangle$ suffices to obtain one sample from the distribution $\Xi_{\psi_t}$.
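The containment check described above (deciding whether $h_i P \in G_{\psi_t}$ for some known coset representative $h_i$, via linear algebra over $\mathbb{F}_2^{2n}$) can be sketched as follows. This is a minimal illustration under stated assumptions: Pauli operators are encoded as $2n$-bit integers with phases dropped, all function names are hypothetical, and the sketch keeps an echelon basis of $G_{\psi_t}$ and reduces each candidate against it rather than rerunning a full Gaussian elimination per query.

```python
def pauli_to_bits(p: str) -> int:
    """Encode a Pauli string (letters I, X, Y, Z) as a 2n-bit integer:
    the n x-bits followed by the n z-bits. Phases are dropped."""
    x = z = 0
    for c in p:
        x = (x << 1) | int(c in "XY")
        z = (z << 1) | int(c in "ZY")
    return (x << len(p)) | z


def build_basis(vectors) -> dict:
    """Echelon basis of the F_2-span of the vectors, keyed by leading bit."""
    basis = {}
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]  # eliminate the leading bit and keep reducing
    return basis


def in_span(v: int, basis: dict) -> bool:
    """Reduce v against the echelon basis; it lies in the span iff it reaches 0."""
    while v:
        lead = v.bit_length() - 1
        if lead not in basis:
            return False
        v ^= basis[lead]
    return True


def in_known_cosets(p: str, generators, bad_generators) -> bool:
    """True if P lies in G or in h_i G for one of the bad generators found so far.
    Multiplication of Paulis (up to phase) is XOR of their bit encodings."""
    basis = build_basis(pauli_to_bits(g) for g in generators)
    vp = pauli_to_bits(p)
    representatives = [0] + [pauli_to_bits(h) for h in bad_generators]
    return any(in_span(vp ^ h, basis) for h in representatives)
```

For instance, with the hypothetical generators `["ZZI", "IZZ"]` and the single known bad generator `"IIX"`, the operator `"ZZX"` is recognized as belonging to a known coset, while `"XII"` is not and would be recorded as a new coset representative.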
At this point, the careful reader may wonder how the learner can prepare the conjugate state $|\psi_t^*\rangle$ in order to use the algorithm. There are several answers to this. First of all, an important situation in which learning is relevant is the study of ground states of quantum many-body systems. An example is provided by stabilizer Hamiltonians [21] perturbed by local impurities, which exhibit $t$-doped stabilizer eigenstates, as shown in [22]. Generically, as long as time-reversal symmetry is not broken, these are real states, and thus $|\psi_t^*\rangle = |\psi_t\rangle$. Moreover, in many scenarios, the state $|\psi_t\rangle$ can be thought of as prepared by a quantum circuit. In this case, the task of preparing $|\psi_t^*\rangle$ is straightforward. Indeed, it is sufficient to replace the gate $S = \mathrm{diag}(1, i)$ with $S^* = \mathrm{diag}(1, -i)$ and the $T$-gate with $T S^*$ to obtain $U_t^*$, and thus construct $|\psi_t^*\rangle$ from $|0\rangle^{\otimes n}$.
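The gate substitution above can be checked numerically. The short sketch below verifies that $T S^*$ equals the complex conjugate of $T$ for the diagonal matrices defined in the text; diagonal gates are represented by their diagonal entries, and the helper names are assumptions of this example.

```python
import cmath

# Diagonal gates are stored as tuples of their diagonal entries.
S = (1, 1j)
T = (1, cmath.exp(1j * cmath.pi / 4))


def conjugate(gate):
    """Entry-wise complex conjugate of a diagonal gate."""
    return tuple(complex(a).conjugate() for a in gate)


def diag_mul(a, b):
    """Product of two diagonal gates."""
    return tuple(x * y for x, y in zip(a, b))


def is_close(a, b, tol=1e-12):
    """Entry-wise comparison up to numerical tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


S_STAR = conjugate(S)
T_STAR = conjugate(T)

# T S* = diag(1, e^{i pi/4} * (-i)) = diag(1, e^{-i pi/4}) = T*
assert is_close(diag_mul(T, S_STAR), T_STAR)
```

Gates with real matrix elements in the computational basis, such as the Hadamard and CNOT gates, are unchanged by conjugation, which is why only $S$ and $T$ need to be replaced when the circuit is compiled into such a gate set.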
In any case, while distilling the conjugate state is a hard task in general, the states under examination in this paper for which learning is efficient, i.e. $t$-doped stabilizer states with $t = O(\log n)$, have a fine structure, and known hardness proofs fail to capture the hardness of distilling $|\psi_t^*\rangle$ from samples of $|\psi_t\rangle$. For example, the hardness proof in Ref. [23] uses pseudorandom quantum states, and it is known that pseudorandom quantum states contain $\omega(\log n)$ many $T$-gates [24]. Indeed, from the knowledge of the algebraic structure in Eq. (3), sampling from the characteristic distribution is straightforward: it is sufficient to roll a die with $k+1$ faces, with outcomes $h_i$ and probabilities $\chi_{\psi_t}(h_i)$ (defined above in Section 2), and to sample a Pauli operator $P \in G_{\psi_t}$ uniformly through the $m$ generators. As a result, one samples a Pauli operator $h_i P$ according to $\Xi_{\psi_t}$. This procedure is efficient for $k = O(\mathrm{poly}(n))$, i.e. for $t = O(\log n)$.
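The two-step sampling procedure just described (pick a coset according to $\chi_{\psi_t}$, then pick a uniform element of $G_{\psi_t}$ as a random product of generators) can be sketched in a few lines. This is a minimal illustration assuming the algebraic structure is already known; Pauli operators are encoded as integers over $\mathbb{F}_2^{2n}$ with phases dropped, so that multiplication becomes XOR, and the function name and argument layout are hypothetical.

```python
import random


def sample_characteristic(generators, representatives, chi, rng):
    """Draw one Pauli operator (as a bit-encoded integer) from the
    characteristic distribution of a state with known structure.

    generators      -- independent generators g_1..g_m of G (bit-encoded)
    representatives -- coset representatives h_0..h_k, with h_0 = 0 (identity)
    chi             -- coset weights chi(h_0)..chi(h_k), summing to one
    rng             -- a random.Random instance
    """
    # Step 1: choose a coset h_i G with probability chi(h_i).
    h = rng.choices(representatives, weights=chi, k=1)[0]
    # Step 2: uniform element of G, i.e. include each independent
    # generator in the product with probability 1/2.
    g = 0
    for gen in generators:
        if rng.getrandbits(1):
            g ^= gen
    return h ^ g
```

Because the characteristic distribution is constant on each coset $h_i G_{\psi_t}$, choosing the coset with weight $\chi_{\psi_t}(h_i)$ and then a uniform group element reproduces $\Xi_{\psi_t}$; the cost per sample is $O(m + k)$ elementary operations.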
Therefore, for $t = O(\log n)$, there is no reason to believe that sampling from the distribution $\Xi_{\psi_t}$, with queries limited to $|\psi_t\rangle$ alone and without knowing the algebraic structure of Eq. (3) (i.e. bypassing the a priori learning of the $t$-doped stabilizer state), is hard, and the question remains an ongoing subject of research. Remarkably, in Ref. [25], an algorithm is presented that is capable of sampling from the characteristic distribution for a class of states that includes, but is not limited to, states with $t = O(\log n)$ and low entanglement.

Learning algorithm without access to $\Xi_{\psi_t}$
We discussed a simple and intuitive learning algorithm that works whenever one has access to samples from the distribution $\Xi_{\psi_t}$. However, as we saw in Section 4, known methods for sampling from $\Xi_{\psi_t}$ require query access to the $t$-doped stabilizer state and its conjugate in the computational basis, which is not always available in real scenarios. In this section, we present an algorithm similar in spirit to the one in Section 3, but we relax the hypothesis of having samples from $\Xi_{\psi_t}$.
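Just as measuring $|\psi_t \otimes \psi_t^*\rangle$ in the Bell basis $|P\rangle$ returns samples from $\Xi_{\psi_t}$, measuring $|\psi_t \otimes \psi_t\rangle$ in the Bell basis returns samples from

$$\tilde{\Xi}_{\psi_t}(P) = \frac{1}{d}\,|\langle\psi_t|P|\psi_t^*\rangle|^2. \qquad (10)$$

Let us show that samples from $\tilde{\Xi}_{\psi_t}$ are still sufficient to learn the stabilizer group $G_{\psi_t}$ and, ultimately, the $t$-doped stabilizer state $\psi_t$ exactly.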
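To see this, it is useful to show how the algebraic structure changes from $\psi_t$ to $\psi_t^*$. Naively, from Eq. (3), we can write the density matrix associated to $\psi_t^*$ as

$$\psi_t^* = \frac{1}{d}\sum_{i=0}^{k}\mathrm{tr}(h_i\psi_t)\, h_i^* \prod_{j=1}^{m}\left(\mathbb{1} + \phi_{g_j} g_j^*\right).$$

To understand the algebraic structure of $\psi_t^*$, it is useful to look back at the simple case of stabilizer states, i.e. $t = 0$. As shown by Montanaro in Ref. [10], given a stabilizer state $|\sigma\rangle$, its conjugate $|\sigma^*\rangle$ can be obtained with a Pauli rotation as $|\sigma^*\rangle = P_\sigma|\sigma\rangle$ for some Pauli operator $P_\sigma$. Similarly, since the projector $\Pi_{\psi_t}$ in Eq. (3) onto the stabilizer group $G_{\psi_t}$ can be completed to represent a pure stabilizer state (by completion of the stabilizer group $G_{\psi_t}$), it is immediate to see that there exists a Pauli operator $P_{\psi_t}$ such that (see Appendix D) $\Pi^*_{\psi_t} = P_{\psi_t}\Pi_{\psi_t}P_{\psi_t}$, where $\Pi^*_{\psi_t} \equiv 2^{-m}\prod_{j=1}^{m}(\mathbb{1} + \phi_{g_j} g_j^*)$. The joint probability that two samples from $\tilde{\Xi}_{\psi_t}$ give a product Pauli operator belonging to $G_{\psi_t}$ admits a lower bound, whose proof is in Appendix D. Therefore, repeating the reasoning outlined in Section 3 after Eq. (6), we conclude that $O(2^{6t}(n+m))$ samples from $\tilde{\Xi}_{\psi_t}$, plus $O(2^{9t} n(n+6t))$ additional measurement shots, are sufficient to learn all the generators $g_1, \dots, g_m$ and phases $\phi_{g_i}$ with exponentially small probability of failure $O(n 2^{-n})$.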
Having learned the generators $g_i$ and relative phases $\phi_{g_i}$, it is possible, through standard tableau manipulation techniques (see Ref. [18]), to construct and distill the diagonalizer $D$ in Eq. (4) using $O(n^2)$ computational steps and up to $O(n)$ Clifford gates [26]. Applying $D$ to the $t$-doped stabilizer state $|\psi_t\rangle$, one gets $D|\psi_t\rangle = |0\rangle^{\otimes m} \otimes |\phi\rangle$. At this point, one is left to learn the tomographic description of $|\phi\rangle$ in terms of the Pauli operators $\tilde{h}_i \equiv D h_i D^\dagger$. To achieve this task, it is sufficient to individually measure each Pauli operator $\tilde{h}_i$ for $i = 1, \dots, k$. By virtue of Eq. (5), $O(2^{2bt})$ samples are sufficient to estimate each expectation value $\mathrm{tr}(h_i\psi_t)$ exactly, and thus to learn the expectation values of the $k$ bad generators $h_1, \dots, h_k$ one needs $O(4^t 2^{2bt})$ total measurement shots. In summary, the above algorithm learns an exact tomographic description of a $t$-doped stabilizer state using $O(2^{6t} n)$ samples from the distribution $\tilde{\Xi}_{\psi_t}$ (which in turn can be obtained by measuring two copies of $\psi_t$ in the basis $|P\rangle$), $O(2^{9t} n(n+6t))$ measurement shots, and $O(n^2)$ additional computational steps, and it fails with probability $O(n 2^{-n})$.
The careful reader might have noticed similarities between the procedure outlined above and the subroutine known as Bell difference sampling [27], where one considers two samples from the distribution $\tilde{\Xi}_{\psi_t}$ and, subtracting the results, obtains Pauli operators $P \in G^\perp_{\psi_t} = \{P \in P_n \mid [P, G_{\psi_t}] = 0\}$, whose knowledge allows one to reconstruct the stabilizer group $G_{\psi_t}$, as shown in [28]. There are two key insights here. First, the derivation outlined above descends solely from the algebraic structure of $t$-doped stabilizer states developed in this paper, which provides a simple proof of the results. Second, the algorithm outlined above and the Bell difference sampling routine exhibit similarities, yet they are not identical. Specifically, the above algorithm exclusively accommodates pairs of Pauli operators, denoted as $P$ and $P'$, whose product resides within $G_{\psi_t} \subset G^\perp_{\psi_t}$. Hence, it does not need any classical postprocessing. This fact allows for the complete and precise determination of the stabilizer group $G_{\psi_t}$ with overwhelming probability, albeit at the expense of an exponential increase in computational cost with respect to $t$, a scaling that Bell difference sampling does not possess.

Concurrent work
During the finalization of this manuscript, we became aware of two works [28,29] sharing several similarities with ours, presenting an efficient approach for learning $t$-doped stabilizer states. This section aims to compare the algorithms. The algorithms presented separately in [28] and [29] are quite similar. Hence, for simplicity, we focus on comparing the one from [28] with ours.
The algorithm in [28] has a significant advantage: it uses Bell difference sampling to approximate the stabilizer group $G_{\psi_t}$ associated with the $t$-doped stabilizer state $|\psi_t\rangle$ using $O(n)$ resources, avoiding exponential scaling with the number $t$ of non-Clifford gates. However, it only learns an approximate version of the stabilizer group, which might not suit cases where an exact description of a $t$-doped stabilizer state is required. For $T$-gates, exact learning is desirable since these states are countable and form a discrete set. Whether the algorithm of [28] can be employed to learn the exact stabilizer group in light of the finite resolution proven in Eq. (5) is not clear. The algorithm presented in Section 5 of this paper, leveraging the algebraic structure introduced here, can precisely learn the stabilizer group $G_{\psi_t}$ from copies of $|\psi_t\rangle$, with an exponentially small probability of failure. However, this precise learning comes at the cost of exponential resource scaling with the number of $T$-gates. Compared to the algorithm in [28], our algorithm's primary strength lies in its exact learning ability for states created by Clifford+T circuits, enabled by Eq. (5). Both algorithms in Sections 3 and 5, with different methods whose limitations have been explored above, achieve the exact learning of the algebraic structure of $t$-doped stabilizer states displayed in Eq. (3). However, it is important to note that these states are not the only ones with polynomially many bad generators, and therefore our algorithms do not cover the larger class of states learnable by the algorithm in Ref. [28], which, in this regard, is more general. Overall, we conclude that the two algorithms exhibit distinct ranges of applicability and employ different techniques to achieve the same task, each with its own set of strengths and limitations.

Conclusions
In conclusion, we presented an efficient and general algorithm that can learn a classical description of states obtained from the computational basis by the action of the Clifford group plus a few $T$-gates. To address this challenge, we introduced a new algebraic structure for $t$-doped stabilizer states that is of independent interest and plays a crucial role in the learning algorithm.
Looking ahead, there are several open questions. First, we ask whether the algebraic structure presented in this paper can be instrumental in solving tasks beyond quantum state tomography for $t$-doped stabilizer states, e.g. in computing quantities such as entanglement or magic. Second, we would like to know whether the knowledge of both the generators and the bad generators can be used to synthesize a Clifford+T circuit able to construct the state $|\psi_t\rangle$ from $|0\rangle^{\otimes n}$. Indeed, note that the task is conceivable for two reasons: 1) the complexity (i.e. the number of gates) of Clifford+T circuits scales as $O(n^2 + t^3)$ (see footnote 2), as a consequence of the results of [18,19,26]; 2) given a $t$-doped stabilizer state, one can learn exactly the Clifford circuit $D$, reducing the task of learning the circuit description to $u_t$ only. Lastly, the estimated resources are computed for the worst case. We ask whether the average-case complexity for the exact learning of $t$-doped stabilizer states doped with $T$-gates is more favorable in terms of the exponential scaling in $t$.

Footnote 2: By Theorem 2 of [18], a Clifford+T circuit $C_t$ with $t \le n$ has the decomposition $C_t = C_1(\mathbb{1} \otimes u_t)C_2$, with $C_1, C_2 \in C_n$ and $u_t$ a $t$-doped Clifford circuit on $t$ qubits consisting of at most $t$ $T$-gates. From [26], any Clifford circuit can be implemented with at most $O(n^2)$ gates. Thus, $\#(C_i) = O(n^2)$ for $i = 1, 2$ and $\#(u_t) = O(t^3)$, and consequently the total complexity is $O(n^2 + t^3)$.

"Bell sampling from quantum circuits" (2023). arxiv:2306.00083.

A Technical preliminaries
In this section, we will introduce some technical preliminaries that may prove beneficial to the reader.
A.1 Pauli group, Clifford group, and stabilizer states

The Pauli group $P_n$ is the $n$-fold tensor product of the single-qubit Pauli group $P_1$, which consists of the elements $\mathbb{1}, X, Y, Z$, each multiplied by a phase factor $\pm 1, \pm i$, where $X$, $Y$ and $Z$ are the standard single-qubit Pauli matrices. Let us introduce the quotient group $P_n/\{\pm 1, \pm i\}$ of the Pauli group by its phases. In the main text, we refer to the quotient group as the Pauli group. The Clifford group $C_n$ on $n$ qubits, a subgroup of the unitary group, is defined as the normalizer of the Pauli group $P_n$:

$$C_n := \{C \in U(d) \mid C P C^\dagger \in P_n \ \ \forall P \in P_n\}.$$

With the notions of Pauli and Clifford group, one can define a stabilizer state. Given a Pauli operator $P \in P_n$, we say that $P$ stabilizes the state $|\psi\rangle$ if $P|\psi\rangle = \pm|\psi\rangle$. A state $|\sigma\rangle$ is a stabilizer state if it is stabilized by a group $G$ of $d$ commuting Pauli operators. Equivalently, one can define a stabilizer state as $|\sigma\rangle \equiv C|0\rangle^{\otimes n}$ with $C \in C_n$. It is worth noting that a stabilizer state can be written as an equal-weight superposition of the Pauli operators that stabilize it,

$$\sigma = \frac{1}{d}\sum_{P \in G}\sigma_P P,$$

where $\sigma$ is the density matrix associated to $|\sigma\rangle$ and $\sigma_P = \pm 1$ is the phase associated to $P$. Since $G$ is an Abelian group, there exists a set of generators $S \equiv \{g_1, \dots, g_n\}$ such that $G = \langle S\rangle$. Thus, given the set of generators $S$ associated with $G$, the state $\sigma$ can be expressed in terms of the generators $g_i \in S$ as

$$\sigma = \frac{1}{d}\prod_{i=1}^{n}\left(\mathbb{1} + \phi_{g_i} g_i\right),$$

where $\phi_{g_i}$ are the $\pm 1$ phases associated with the generators $g_i$.

A.2 Bell Sampling
In this section, we review the concept of Bell sampling [10].

The potential failure of this process occurs when all the $K$ samples lie in a subspace of $T$ with dimension less than $n$. The probability that all the obtained samples fall within a given such subspace is $2^{-K}$. By applying the union bound, we find that the probability of these samples lying in one of the $2^n$ $(n-1)$-dimensional subspaces of $T$ is at most $2^{-K+n}$. To mitigate this probability of failure, one can choose $K = 2n$, resulting in an exponentially vanishing probability of failure. With the set of generators for $G_\sigma$ obtained, one can learn the phases by making a single-shot measurement of each of the learned Pauli generators.
In cases where access is limited to $|\sigma\rangle \otimes |\sigma\rangle$, rather than $|\sigma^*\rangle \otimes |\sigma\rangle$, it is still possible to learn the stabilizer state through a similar procedure. First, it is worth noting that for a stabilizer state $|\sigma^*\rangle = P_\sigma|\sigma\rangle$, where $P_\sigma$ is a Pauli string consisting of $Z$ and $\mathbb{1}$ operators. Consequently, the probability of sampling the Pauli operator $P$ can be rewritten as $|\langle\sigma|P P_\sigma|\sigma\rangle|^2/2^n$. Although $P$ does not necessarily stabilize $|\sigma\rangle$, the Pauli operator $P P_\sigma$ does; otherwise, the sampling probability would be zero. Thus, when two Pauli operators, $P_1$ and $P_2$, are sampled, their product stabilizes $|\sigma\rangle$. This can be demonstrated straightforwardly by noting that both $P_1 P_\sigma$ and $P_2 P_\sigma$ stabilize $|\sigma\rangle$, and their product, $P_1 P_\sigma P_2 P_\sigma$, remains a Pauli operator that stabilizes $|\sigma\rangle$. Furthermore, it is proportional to $P_1 P_2$, with a proportionality factor $\pm 1$ determined by the commutation relations between $P_\sigma$ and $P_2$.
Thanks to this property, instead of sampling K times, one only needs to sample K + 1 times to determine the set of generators for G σ , while maintaining the same failure probability.
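The counting argument above ($K+1$ samples from a common coset yield $K$ elements of the stabilizer group) can be sketched as follows. This is a minimal illustration, assuming Pauli operators are encoded as integers over $\mathbb{F}_2^{2n}$ with phases dropped, so that the product of two Pauli operators becomes the XOR of their encodings; the function name is hypothetical.

```python
def products_with_reference(samples):
    """Multiply every sample by the first one (XOR in the binary encoding).

    If all K + 1 samples lie in the same coset of the stabilizer group,
    the K returned products all lie in the stabilizer group itself.
    """
    if not samples:
        return []
    reference = samples[0]
    return [s ^ reference for s in samples[1:]]


# Toy example: group G = {0, 3, 5, 6} (span of 6 and 3), coset shift c = 8.
shift = 8
coset_samples = [shift ^ 6, shift ^ 3, shift ^ 5, shift ^ 0]
group_elements = products_with_reference(coset_samples)
```

The resulting `group_elements` can then be fed to any $\mathbb{F}_2$ Gaussian elimination routine to extract a set of generators, exactly as in the case where samples from the stabilizer group itself are available.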

B Efficient verification of containment
In this section, we provide an alternative algorithm, based on the notion of diagonalizer, to efficiently verify the containment of a Pauli operator $P$ in the set $G_{\psi_t} \cup h_1 G_{\psi_t} \cup \dots \cup h_l G_{\psi_t}$. Let $Z_n \subset P_n$ be the subgroup of the Pauli group generated by $\{\sigma^z_i\}_{i=1}^{n}$, with $\sigma^z_i = \mathrm{diag}(1, -1)$ acting on the $i$-th qubit. Then let $D \in C_n$ be a Clifford circuit such that $\tilde{G}_{\psi_t} \equiv D G_{\psi_t} D^\dagger \subset Z_n$. Notice that $D$ can be found from a set of generators of $G_{\psi_t}$ in time $O(n^2)$ [18,19]. Define $\tilde{h}_i \equiv D h_i D^\dagger$, whose computation requires $O(n^2 l)$ steps. To check whether $P \notin G_{\psi_t} \cup h_1 G_{\psi_t} \cup \dots \cup h_l G_{\psi_t}$, it is sufficient to check that $D P D^\dagger \notin \tilde{G}_{\psi_t} \cup \tilde{h}_1 \tilde{G}_{\psi_t} \cup \dots \cup \tilde{h}_l \tilde{G}_{\psi_t}$, which, thanks to the local support of the $\sigma^z_i$, can be easily achieved by a qubit-by-qubit check in time $O(nl)$. Consequently, the final runtime to verify whether a Pauli operator $P$ belongs to $G_{\psi_t} \cup h_1 G_{\psi_t} \cup \dots \cup h_l G_{\psi_t}$ is $O(n^2 k)$.

C Proofs

C.1 Disjoint cosets

In this section, we prove the disjointness of the cosets. Consider $P \in S_{\psi_t}$ such that $P \notin G_{\psi_t}$; then $\mathrm{tr}(P g \psi_t) = \pm\mathrm{tr}(P\psi_t)$, and so $P g \in S_{\psi_t}$ for all $g \in G_{\psi_t}$. To prove disjointness, consider two Pauli operators $P_1, P_2 \in S_{\psi_t}$ such that $P_1, P_2 \notin G_{\psi_t}$. If $P_1 G_{\psi_t} \cap P_2 G_{\psi_t} \neq \emptyset$, there would exist $g_1, g_2 \in G_{\psi_t}$ such that $P_1 g_1 = P_2 g_2$. We would then get $P_1 \in P_2 G_{\psi_t}$, thanks to the group property of $G_{\psi_t}$, and thus $P_1 G_{\psi_t} \equiv P_2 G_{\psi_t}$. Hence two cosets either coincide or are disjoint.
The last step before proving the statement comes from Lemma 1, which ensures the existence of a Pauli operator $\tilde{P}$ such that $|\langle\phi|\tilde{P}|\phi^*\rangle| \ge 2^{-(n-m)}$. Therefore, let us prove the statement, i.e. that the probability that Bell measurements sample a Pauli operator $P$ such that $P P_{\psi_t} D^\dagger \tilde{P} D \in G_{\psi_t}$ is greater than the bound in Eq. (57). For fixed $P_{\psi_t}$ and $\tilde{P}$, the probability that two sampled Pauli operators $P, P'$ obey the property in Eq. (57) is $1/2^{6t}$, and the probability that the product $P P' \in G_{\psi_t}$ is, by a simple shift by $P_{\psi_t} D^\dagger \tilde{P} D$, the one in Eq. (50).
Algorithm 2. Having access to samples from $\tilde{\Xi}_{\psi_t}(P) = d^{-1}|\langle\psi_t|P|\psi_t^*\rangle|^2$, the algorithm that learns an exact classical description of a $t$-doped stabilizer state requires $O(2$ […]

[…] denote by $P_n$ the Pauli group on $n$ qubits and by $C_n$ the Clifford group. We define $t$-doped Clifford unitaries, denoted as $C_t$, as unitary operators comprised of Clifford unitaries $C \in C_n$ plus $t$ $l$-qubit non-Clifford gates with $l = O(1)$. Let $|\psi_t\rangle$ be a $t$-doped stabilizer state, i.e. a state obtained as $|\psi_t\rangle = C_t|0\rangle^{\otimes n}$, and let $\psi_t$ be its density matrix. Define $S_{\psi_t} := \{P \in P_n \mid \mathrm{tr}(P\psi_t) \neq 0\}$, the set of Pauli operators with nonzero expectation over $|\psi_t\rangle$. For a stabilizer state ($t = 0$), $|S_{\psi_0}| = d$. Further, one can define the probability distribution $\Xi_{\psi_t}$ with support on $S_{\psi_t}$ and components $\Xi_{\psi_t}(P) = \mathrm{tr}^2(P\psi_t)/d$. We denote the stabilizer entropy [16] of the state $\psi_t$ as $M_\alpha(\psi_t) = S_\alpha(\Xi_{\psi_t}) - n$, which corresponds to the Rényi entropy $S_\alpha(\Xi_{\psi_t})$ of order $\alpha$ of the probability distribution $\Xi_{\psi_t}$ up to an offset $n$. As we shall see, the stabilizer entropy can be operationally interpreted as the entropy of the tomography in the Pauli basis.

[…] $\equiv \log_2|G_{\psi_t}|$, such that every element $Q \in G_{\psi_t}$ can be obtained as a product of $g_1, \dots, g_m$, with relative phases $\phi_{g_1}, \dots, \phi_{g_m}$ defined by $g_i|\psi_t\rangle = \phi_{g_i}|\psi_t\rangle$. Since there are at most $|S_{\psi_t}| \le d\,2^{2lt}$ Pauli operators with nonzero expectation over $\psi_t$ and at least $d/2^{2lt}$ of those stabilize $|\psi_t\rangle$, there exist Pauli operators $h_1, \dots, h_k \in S_{\psi_t}$ (with $k \le 4^{2lt} - 1$) such that the set $S_{\psi_t}$ can be written as the union of disjoint left cosets (see Appendix C.1)

$$S_{\psi_t} = G_{\psi_t} \cup h_1 G_{\psi_t} \cup \dots \cup h_k G_{\psi_t}. \qquad (2)$$
[10]efore, after O(2 t ) samples one gets a Pauli operator sampled uniformly from G ψt with high probability.To check whether a sampled Pauli operator P stabilizes |ψ t ⟩ is sufficient to measure M times P .If all the measurement outcomes are equal, then P ∈ G ψt with failure probability at most(1 − 2 −t )(h max /2 + 1/2) M (see Appendix C),where h max ≡ max hi | tr(h i ψ t )|.We remark that the above procedure also reveals the stabilizing phase ϕ P corresponding to P ∈ G ψt .Following the reasoning introduced by Montanaro[10], the extraction of m + n random Pauli operators ∈ G ψt suffices to identify a set of generators g 1 , . . ., g m −2min (γ + log(k + 1)) for γ being the Euler's constant.Therefore using O 2 t h −2 min log(k + 1) samples from Ξ ψt , plus O(n 2 k 2 ) computational steps, one learns the k bad generators h 1 , ..., h k .To evaluate h min and h max , let us recall that the state |ψ t ⟩ is obtained from the computational basis state |0⟩⊗n by a Clifford circuit polluted with t single qubit T -gates, that generate a discrete set of states, which ultimately implies the existence of a finite resolution δ t ≡ min | tr[(P − Q)ψ t ]| for tr(P ψ t ) ̸ = tr(Qψ t ).From Eq. 
(5), one hasδ t = Ω(2 −bt ) for b ≃ 2.27.Therefore O(2 2bt) measurement shots are sufficient to determine the expectation value tr(h i ψ t ) exactly.In addition, one has h min = Ω(2 −bt ) and h max < 1 − O(2 −bt ).In summary, to learn k bad generators and the corresponding expectations, the learner employs O(2 (2b+1)t log(k + 1)) samples from Ξ ψt , O(k2 2bt ) measurement shots, and O(n 2 k 2 ) additional computational steps.Let us count the total number of resources to learn a t-doped stabilizer state employing the algorithm proposed above.Set m ≤ n, k < 4 t , b < 3 and M = 2 3t+1 (n + t).The total number of samples from the probability distribution Ξ ψt is O(2 t n + t2 5t ) plus O(4 t n 2 ) additional computational steps.The total number of measurement shots is O(2 7t + 2 4t n(n + t)) and the algorithm fails with probability at most O(n2 −n ).Therefore for t = O(log n), the algorithm learns the tomographic description of a t-doped stabilizer state of Eq. (3) with polynomial resources and overwhelming probability.In Section 3, we provided an algorithm that learns a t-doped stabilizer state with poly(n, 2 t ) resources.In particular, the algorithm uses O(2 t n + t2 5t ) samples from the probability distribution Ξ ψt with elements Ξ ψt (P ) = tr 2 (P ψ t )/d.Let us discuss the connection between queries to the unknown state |ψ t ⟩ and samples from the distribution Ξ ψt .Having query access to the t-doped stabilizer state |ψ t ⟩ and its conjugate (in the computational basis) |ψ * t ⟩, one can easily achieve the task via Bell sampling.Define the Bell basis on two copies of Hilbert space H as |P ⟩ ≡ 1l ⊗ P |1l⟩ with |1l⟩ = 2 −n/2 2 n i=1 |i⟩ ⊗ |i⟩ and P ∈ P n .Then, measuring |ψ t ⊗ ψ * t ⟩ in the basis |P ⟩ is equivalent to sample from the distribution Ξ ψt : Π * ψt ≡ 2 −m m j=1 (1l+ϕ gj g * j ).In other words, analogously to the case of stabilizer states, the projectors onto the stabilizer groups Π ψt , Π * ψt , of ψ t and ψ * t respectively, can be obtained from one 
another by the application of a Pauli operator $P_{\psi_t}$. Hence, the state $|\psi_t^{*\prime}\rangle := P_{\psi_t}|\psi_t^*\rangle$ has the same stabilizer group $G_{\psi_t}$ as $\psi_t$, by construction. As explained in Section 2, from the projector $\Pi_{\psi_t}$ onto the stabilizer group $G_{\psi_t}$, one can define a diagonalizer $D$. Since the stabilizer groups are the same, $D$ diagonalizes $|\psi_t\rangle$ and $|\psi_t^{*\prime}\rangle$ at the same time, as $D|\psi_t\rangle = |0\rangle^{\otimes m} \otimes |\phi\rangle$ and $D|\psi_t^{*\prime}\rangle = |0\rangle^{\otimes m} \otimes |\phi^*\rangle$, cf. Eq. (4). By simple manipulations, we can thus rewrite the probability in Eq. (10) so that it depends only on the nonstabilizer part $|\phi\rangle$ of the $t$-doped stabilizer state $|\psi_t\rangle$:
$$\frac{1}{d}\left|\langle\phi|\tilde{P}|\phi^*\rangle\right|^2, \qquad \tilde{P} \equiv \langle 0^{\otimes m}|D P P_{\psi_t} D^\dagger|0^{\otimes m}\rangle. \qquad (13)$$
It is easy to see that this probability is different from zero only if $P P_{\psi_t} \in D^\dagger(\mathbb{1}_m \otimes \tilde{P})D\,G_{\psi_t}$ for some $\tilde{P} \in \mathbb{P}_{n-m}$ acting on the last $(n-m)$ qubits. Hence, the samples return Pauli operators $P \in \{D^\dagger(\mathbb{1}_m \otimes \tilde{P})D\,G_{\psi_t} \mid \tilde{P} \in \mathbb{P}_{n-m}\}$ with the probability in Eq. (13). As in the case of stabilizer states, to gain knowledge of the stabilizer group $G_{\psi_t}$, it is sufficient to obtain two Pauli operators $P, P'$ belonging to the same left coset, $P, P' \in D^\dagger(\mathbb{1}_m \otimes \tilde{P})D\,G_{\psi_t}$, and consider the product $P P' \in G_{\psi_t}$. At this point, the joint probability that two samples give a product Pauli operator belonging to $G_{\psi_t}$ is lower bounded by
This set of states corresponds to the 2-qubit Bell basis. This equivalence between Pauli operators and states can be extended to the $n$-qubit case, because the $n$-qubit Pauli group is constructed as the $n$-fold tensor product of the 1-qubit Pauli group. Consequently, we can tensor 2-qubit states $|P\rangle$ to establish an equivalence between the $n$-qubit Pauli group and the $2n$-qubit Bell basis. In this context, a generic basis element can be written as $|P\rangle \equiv (\mathbb{1} \otimes P)|\mathbb{1}\rangle$, where $|\mathbb{1}\rangle$ denotes the $2n$-qubit maximally entangled state, defined as $|\mathbb{1}\rangle = 2^{-n/2}\sum_{i=1}^{d} |i\rangle \otimes |i\rangle$. Consider the state $|\psi^*\rangle \otimes |\psi\rangle$: it is not difficult to show that measuring it in the Bell
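basis returns outcome $P$ with probability $\operatorname{tr}^2(P\psi)/d$, with $\psi$ the density matrix associated to the state $|\psi\rangle$. The probability to obtain $P$ is given by $|\langle P|\psi^* \otimes \psi\rangle|^2$:
$$|\langle P|\psi^* \otimes \psi\rangle|^2 = \langle\mathbb{1}|(\mathbb{1} \otimes P)[\psi^* \otimes \psi](\mathbb{1} \otimes P)|\mathbb{1}\rangle = \frac{1}{d}\operatorname{tr}(P\psi P\psi) = \frac{\operatorname{tr}^2(P\psi)}{d},$$
where we used the property of the maximally entangled state that $(O \otimes \mathbb{1})|\mathbb{1}\rangle = (\mathbb{1} \otimes O^T)|\mathbb{1}\rangle$, and that the state $\psi$ is pure, so that $\psi P \psi = \operatorname{tr}(P\psi)\,\psi$. If, instead of $|\psi^*\rangle \otimes |\psi\rangle$, one only has access to $|\psi\rangle \otimes |\psi\rangle$, the probability of sampling $P$ is given by $|\langle\psi|P|\psi^*\rangle|^2/2^n$.
Through Bell sampling, one can learn a stabilizer state $|\sigma\rangle$ if provided access to $|\sigma^*\rangle \otimes |\sigma\rangle$. In this process, Bell basis measurements are performed $K$ times, and the resulting measurement outcomes yield a set of strings denoted as $T$. This set $T$ consists of $2n$-bit strings, each identifying a Pauli operator $P$ such that $P \in G_\sigma$, where $G_\sigma$ is the stabilizer group of $|\sigma\rangle$. Finding a basis of $n$ elements for the span of $T$ is equivalent to finding a set of generators for the stabilizer group $G_\sigma$.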
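The last step, extracting generators from the sampled strings, can be sketched as Gaussian elimination over GF(2). The following Python snippet is a minimal illustration under stated assumptions, not the procedure used in this work: the helper names (`pauli_to_bits`, `gf2_reduce`, `gf2_basis`, `sample_stabilizer_group`) are hypothetical, Pauli signs are ignored, and the Bell measurement is replaced by a classical stand-in that draws uniformly random elements of $G_\sigma$ from known generators, which is what the probability $\operatorname{tr}^2(P\sigma)/d$ gives for a stabilizer state.

```python
import random


def pauli_to_bits(label):
    """Encode a Pauli string such as 'XZIY' as a 2n-bit integer (x-part, then z-part)."""
    n = len(label)
    x = z = 0
    for ch in label:
        x = (x << 1) | (ch in "XY")  # X and Y carry an x-bit
        z = (z << 1) | (ch in "ZY")  # Z and Y carry a z-bit
    return (x << n) | z


def gf2_reduce(v, pivots):
    """Reduce v against the pivot table; the result is 0 iff v lies in the span."""
    while v:
        top = v.bit_length() - 1
        if top not in pivots:
            return v
        v ^= pivots[top]  # clear the leading bit using the matching pivot
    return 0


def gf2_basis(vectors):
    """Return {leading bit: vector}, a GF(2) basis for the span of the inputs."""
    pivots = {}
    for v in vectors:
        r = gf2_reduce(v, pivots)
        if r:
            pivots[r.bit_length() - 1] = r
    return pivots


def sample_stabilizer_group(generator_bits, shots, rng):
    """Classical stand-in for Bell sampling (assumption): uniform elements of the group."""
    samples = []
    for _ in range(shots):
        p = 0
        for g in generator_bits:
            if rng.getrandbits(1):
                p ^= g  # multiplying Paulis (up to sign) is XOR of their bit strings
        samples.append(p)
    return samples


# Illustration: 3-qubit GHZ state, stabilized by XXX, ZZI and IZZ.
generators = [pauli_to_bits(s) for s in ("XXX", "ZZI", "IZZ")]
T = sample_stabilizer_group(generators, shots=64, rng=random.Random(0))
basis = gf2_basis(T)
```

With $K$ samples, the recovered basis has at most $n$ elements, and every element lies in the span of the true generators. The sign of each generator is not determined by this step and would require measuring the corresponding Pauli operator separately.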