Improved upper bounds on the stabilizer rank of magic states

In this work we improve the runtime of recent classical algorithms for strong simulation of quantum circuits composed of Clifford and T gates. The improvement is obtained by establishing a new upper bound on the stabilizer rank of $m$ copies of the magic state $|T\rangle=\sqrt{2}^{-1}(|0\rangle+e^{i\pi/4}|1\rangle)$ in the limit of large $m$. In particular, we show that $|T\rangle^{\otimes m}$ can be exactly expressed as a superposition of at most $O(2^{\alpha m})$ stabilizer states, where $\alpha\leq 0.3963$, improving on the best previously known bound $\alpha \leq 0.463$. This furnishes, via known techniques, a classical algorithm which approximates output probabilities of an $n$-qubit Clifford + T circuit $U$ with $m$ uses of the T gate to within a given inverse polynomial relative error using a runtime $\mathrm{poly}(n,m)2^{\alpha m}$. We also provide improved upper bounds on the stabilizer rank of symmetric product states $|\psi\rangle^{\otimes m}$ more generally; as a consequence we obtain a strong simulation algorithm for circuits consisting of Clifford gates and $m$ instances of any (fixed) single-qubit $Z$-rotation gate with runtime $\text{poly}(n,m) 2^{m/2}$. We suggest a method to further improve the upper bounds by constructing linear codes with certain properties.

In this work we improve the runtime of recent classical algorithms for strong simulation of quantum circuits composed of Clifford and T gates. The improvement is obtained by establishing a new upper bound on the stabilizer rank of m copies of the magic state |T = √ 2 −1 (|0 + e iπ/4 |1 ) in the limit of large m.
In particular, we show that |T ⊗m can be exactly expressed as a superposition of at most O(2 αm ) stabilizer states, where α ≤ 0.3963, improving on the best previously known bound α ≤ 0.463. This furnishes, via known techniques, a classical algorithm which approximates output probabilities of an n-qubit 'Clifford + T' circuit U with m uses of the T gate to within a given inverse polynomial relative error using a runtime poly(n, m)2 αm . We also provide improved upper bounds on the stabilizer rank of symmetric product states |ψ ⊗m more generally; as a consequence we obtain a strong simulation algorithm for circuits consisting of Clifford gates and m instances of any (fixed) single-qubit Z-rotation gate with runtime poly(n, m)2 m/2 . We suggest a method to further improve the upper bounds by constructing linear codes with certain properties.
Recently proposed classical algorithms for simulating quantum circuits dominated by Clifford gates [2,3,4,5] have an exponential scaling only in the number of non-Clifford gates in the circuit. Such algorithms are based on representing the output state |Ψ = U |0 n of an n-qubit circuit U as a superposition where {|Φ i } are n-qubit stabilizer states, each of which requires only O(n 2 ) classical bits to store. Given such a representation we can compute an amplitude x|U |0 n corresponding to a given basis state x ∈ {0, 1} n by computing a weighted sum of the k amplitudes x|Φ i , each of which can be computed in polynomial time via the stabilizer formalism. It was shown in Ref. [4] that we can also approximate a measurement probability p(y) = |y y| out |Ψ 2 for an output register containing w ≤ n qubits and y ∈ {0, 1} w to within a given inverse polynomial relative error using a runtime linear in k and polynomial in the number of qubits. Simulations of this type are advantageous if a stabilizer decomposition Eq. (1) can be found with k 2 n . The stabilizer rank χ(Ψ) is defined as the minimum possible k over all decompositions of the form Eq. (1). Let us suppose that Ψ is the output state of a circuit U composed of Clifford gates and at most m T gates. In this case it is shown in Ref. [3] that χ(Ψ) ≤ χ(T ⊗m ) where |T = 2 −1/2 (|0 + e iπ/4 |1 ) is a single-qubit magic state. In this way the stabilizer rank of magic states χ(T ⊗m ) directly relates to the complexity of computing high-precision (exact, or inverse polynomial relative error) approximations of amplitudes and probabilities of Clifford+T circuits. Though it is not our focus in this paper, we note that one may also consider a less stringent weak simulation task where the goal is to approximately sample from the output distribution of the quantum circuit to within a given inverse polynomial total variation distance. Ref. [4] defines an approximate stabilizer rank and shows how it is similarly related to the runtime of weak simulation.
Computing the stabilizer rank of a given n-qubit state appears to be intractable. The main difficulty is that the number of stabilizer states grows as 2 Θ(n 2 ) , and searching for a decomposition of the form Eq. (1) by exhaustively considering k-tuples of n-qubit stabilizer states is out of the question, even for very small n and k. As discussed above, we are interested in upper bounding the stabilizer rank of magic states |T ⊗m as this captures the classical simulation cost for Clifford+T circuits with m T gates. The fact that this is a product state has been exploited in previous work to bound its stabilizer rank. Indeed, for any two states φ 1 , φ 2 we have the trivial sub-multiplicativity property . Previous works have upper bounded χ(T ⊗m ) by constructing low-rank decompositions of |T ⊗c for small c and then using this sub-multiplicativity. For example, the fact that |T ⊗2 can be decomposed in terms of two stabilizer states; implies χ(T ⊗m ) = O(2 m/2 ). This bound was improved in Ref. [3] by constructing stabilizer decompositions of |T ⊗c for c ≤ 6 using a heuristic simulated-annealing based computer search. The best decomposition found in Ref. [3] uses 7 stabilizer states to decompose |T ⊗6 , implying that χ(T ⊗6 ) ≤ 7 and therefore χ(T ⊗m ) ≤ O(2 αm ) where α = log 2 (7)/6 ≤ 0.468. The equality χ(T ⊗6 ) = 7 was conjectured to hold, suggesting that further improvement requires low-rank decompositions of 7 or more copies of |T . Ref. [6] shows that χ(T ⊗12 ) ≤ 47, which further reduces the upper bound on the exponent α to roughly 0.463. Though it is not our focus in this work, we note that existing lower bounds on the stabilizer rank of magic states are unsatisfying: the best known unconditional lower bound is χ(T ⊗m ) = Ω(m) [7]. On the other hand this quantity must increase exponentially with m unless #P-complete problems can be solved in sub-exponential time on a classical computer (see, e.g., Refs. [8,9,10]). In this paper, we describe a new method of constructing low-rank stabilizer decompositions of |T ⊗m and obtain the improved upper bound χ(T ⊗m ) ≤ O(2 αm ) with α = log 2 (3)/4 ≤ 0.3963. Unlike previous works, our strategy does not involve bounding χ(T ⊗c ) for some small c and then using sub-multiplicativity. Rather, we construct low-rank stabilizer decompositions of the state |T ⊗m by contracting certain entangled cat states in the magic basis. Along the way, we establish that χ(T ⊗6 ) ≤ 6, disproving the conjecture that it is equal to 7. We generalize our upper bound to the other species of magic states (see Fig. 1), as well as to other symmetric product states. Finally, we discuss a generalization of our method which is based on a family of quantum states corresponding to linear codes. Figure 1: The stabilizer octahedron, whose vertices are the six single qubit stabilizer states. The two types of magic states, |T , |F , correspond to the edges and faces of this octahedron, respectively [11]. It was conjectured in Ref. [3] that χ(T ⊗m ) = χ(F ⊗m ).

Magic cat states
For each m ≥ 1, define the m-qubit cat state in the magic basis where |T ⊥ ≡ Z|T . Similar to the states |T ⊗m , the stabilizer rank of |cat m is non- Furthermore it is easy to see that the stabilizer rank of this magic cat state is within a factor of 2 of the stabilizer rank of |T ⊗m . In particular, we have The upper bound follows because |cat m = is a stabilizer projector which does not increase the stabilizer rank. The lower bound in Eq. (4) follows by using the fact that |T T | = 1 2 (I + A) for a single-qubit Clifford A 1 , and therefore |T ⊗m = √ 2(|T T |⊗I)|cat m ∝ |cat m +(A⊗I)|cat m where the right-hand-side has stabilizer rank of at most 2χ(cat m ).
We shall take advantage of the fact that the stabilizer rank of the m-qubit magic cat state is less than that of |T ⊗m for small m. For m = 2, 6 we find the following stabilizer decompositions where we defined the stabilizer states 2 From these decompositions we see that 1 Namely A = e −iπ/4 SX where X is the usual Pauli matrix and S = diag(1, i). 2 A state is a stabilizer state if and only if its expansion in the computational basis is of the form where A is an affine subspace of F n 2 and l : F n 2 → F2 and q : F n 2 → F2 are linear and quadratic functions, respectively [12,13]. The latter inequality already disproves the conjecture that χ(T ⊗6 ) = 7. Indeed, using Eq. (4) we infer Following the strategy from Ref. [3] we may then use sub-multiplicativity of the stabilizer rank to obtain χ(T ⊗m ) = O(2 αm ) with α = log 2 (6)/6 ≤ 0.4308. But we can do better by using Eqs. (6) in a different way. Suppose we act with the stabilizer state cat 2 | on one qubit from each tensor factor in |cat 6 |cat 6 . The resulting state is cat 2 | 6,7 |cat 6 |cat 6 ∝ |cat 10 .
By expanding each state |cat 6 as a superposition of 3 stabilizer states and using the fact that cat 2 | is a stabilizer state, we obtain a stabilizer decomposition of |cat 10 with only 9 terms. Combining this with Eq. (4) we infer χ(T ⊗10 ) ≤ 18 and therefore χ( We can obtain further improvements by contracting many stabilizer |cat 2 states with many copies of |cat 6 to construct stabilizer decompositions of larger magic cats |cat m . In the limit of large m this gives the exponent α = log 2 (3)/4 ≤ 0.3963. To see this, in Fig. 2 we start with a chain of length containing a |cat 6 state at each site, and we contract nearest neighbours using cat 2 |. After the contraction the state of the remaining qubits is (up to normalization) |cat 4 +2 . Therefore a stabilizer decomposition of |cat 4 +2 exists using 3 stabilizer states. Using Eq. (4) we have Therefore one can decompose |T ⊗m using O(2 αm ) stabilizer states, where α = log 2 (3)/4. The above argument can be readily generalized to the other (Clifford-inequivalent) single-qubit magic state |F = cos(β)|0 Table 1: Known values of the stabilizer rank of |T ⊗m and |cat m for small m. The top row includes the bounds from Refs. [3,5] for m = 2, 3, 4, 5, 7 and the bound obtained from Eq. where |ψ i are the stabilizer states Therefore χ(cat 2 (F )) = 1 and χ(cat 6 (F )) ≤ 3, and so the above method used to construct stabilizer decompositions of |T ⊗m can be applied directly to |F ⊗m . We summarize the preceding discussion in the following theorem.
As we have already discussed, this provides the fastest currently known classical algorithm for estimating amplitudes or output probabilities of Clifford+T circuits as a function of the number of T gates. It is an interesting question whether or not Theorem 1 can be used to accelerate classical algorithms for other computational problems which are less directly tied to quantum computing 3 .
A natural question is whether the bound log 2 (3)/4 can be improved using our techniques. As we show in the Appendix, the stabilizer rank of |cat 6 is exactly 3, so an improvement using the above method requires the use of other cat states. In Table 1 we list the known values of χ(cat m ) for small m, none of which provide a better upper bound than the one in Theorem 1. Ultimately it might be possible to improve the bound by finding low rank decompositions of |cat m for larger m. Doing so may be slightly easier than finding decompositions of |T ⊗m since we always have χ(cat m ) ≤ χ(T ⊗m ). Note that the exponent α = log 2 (3)/4 in Theorem 1 is only attained asymptotically, i.e. in the limit of an infinitely long chain.

Theorem 2. For every angle θ we have χ(R
We use a chain as in Fig. 3, where the length (that is, the number of upper/gray circles) is given by = 5t + 1 for some t ∈ N (which is the number of lower/turquoise circles). The state after the contraction is |cat 24t+6 (R θ ) . By decomposing each |cat 6 (R θ ) in terms of 4 stabilizer states we obtain a decomposition of |cat 24t+6 (R θ ) using 4 6t+1 stabilizer states. Acting on the first qubit by |R θ R θ | gives the state |R θ ⊗(24t+6) , and increases the number of stabilizer terms in the decomposition by at most a factor of four (since |R θ R θ | can be written as sum of 4 Paulis). Hence χ(R ⊗(24t+6) θ ) ≤ 2 12t+4 , which implies the statement.
The stabilizer decompositions in Theorem 2 can be used to simulate quantum circuits consisting of Clifford gates and Z-rotations by a fixed angle θ. Indeed, if U consists of Clifford gates and m instances of a diagonal gate exp (iθZ/2) then one can write the output state U |0 ⊗n in the form where U is an n+m-qubit Clifford circuit obtained by replacing each instance of exp (iθZ/2) with an injection gadget that consumes an ancilla qubit initialized in the state |R θ (see for example, Section 2.3.1 of Ref. [5]). Using the stabilizer decomposition of |R θ ⊗m from Theorem 2, we obtain an expression for the output state of the circuit as a superposition of at most O(2 m/2 ) stabilizer states. We can then directly compute amplitudes x|U |0 n using runtime O(2 m/2 poly(n, m)). We can also estimate marginal probabilities to within an inverse polynomial relative error using the same asymptotic runtime via the "norm estimation" technique described in Ref. [4].
It is in fact possible to extend the exponential scaling in Theorem 2 to arbitrary symmetric product states. This is a consequence of the following property of the symmetric subspace of m qubits. In light of Theorem 2 the above lemma implies the following bound.
Combining this with Theorem 2 proves the statement.

Magic code states
We can extend the method described above by replacing the magic cat state by a more general entangled state of the form where |0 = |T , |1 = |T ⊥ , and L ⊆ F m 2 is a linear space of dimension k. Note that |cat m is recovered by taking L to be the m-bit repetition code. As with the cat states, every |L has stabilizer rank upper bounded by χ(T ⊗m ), since |L can be obtained from |T ⊗m using only Pauli measurements and postselection where Z(v) := Z v 1 ⊗ · · · ⊗ Z vm and {v i } ⊆ F m 2 is any basis of L. Theorem 4. Suppose there exist polynomials p and q and a constant γ > 0 such that Then for every linear code L ⊆ F m 2 of block length m and dimension k < m/2 we have Proof. Let G ∈ F k×m 2 be a generator matrix of L, that is, a binary matrix whose rows span L. Up to re-ordering the qubits we may assume that G is in the form for a binary matrix B of size k by (m − k). As L is spanned by the rows of G, the constraints x ∈ L, x 1 = x 2 = · · · = x k = 0 are uniquely satisfied by x = 0 m . We infer that which in turn implies χ(T ⊗m−k ) ≤ χ(T ⊗k )χ(L). Repeating the contraction in Eq. (16) in parallel times gives χ(T ⊗ (m−k) ) ≤ χ(T ⊗ k )χ(L ⊗ ). Letting get large and using Eq. (14) yields 2 γ (m−k) ≤ f ( )2 γ k χ(L ⊗ ), for some polynomial f ( ). Taking the binary log, rearranging, and using sub-multiplicativity of the stabilizer rank we get The second term vanishes for large which completes the proof.
The upper bound from Theorem 1 is recovered from Eq. (15) by setting L to be the sixbit repetition code (so that |L = |cat 6 ). More generally, to use Eq. (15) we need an upper bound on χ(L) which is significantly better than the trivial upper bound χ(L) ≤ 2 m−k . The latter upper bound is obtained by noting where {u i } is any basis of L ⊥ , and we define the Clifford operators A := e −iπ/4 SX and A(u) = A u 1 ⊗ · · · ⊗ A um . (To see this, note that |0 ⊗m ∝ x |x , A|0 = |0 , and A|1 = −|1 ).
Theorem 4 requires k < m/2, which is somewhat restrictive. Relaxing this requirement we can easily find magic code states with low stabilizer rank. For example, every Reed-Muller code R a,b (of degree a on b variables) corresponds to a stabilizer state if the degree is large enough. This is true since for any code L we can express |L in the computational basis as |L ∝ x∈L ⊥ e iπ|x|/4 |x , and if b/(b − a − 1) ≥ 4 then |x| = 0 mod 8 for all x ∈ R ⊥ a,b [15]. In this case we have |R a,b ∝ x∈R ⊥ a,b |x , and hence it is a stabilizer state. Another example is the magic code state associated with the 8-bit first order Reed-Muller code. Since R 1,3 = R ⊥ 1,3 and |x| = 4 for all x ∈ R 1,3 except the all-zero and all-one binary strings, we have the stabilizer decomposition hence χ(R 1,3 ) = 2 (one can easily show that |R 1,3 is not a stabilizer state).
In the preceding examples the code dimension k exceeds the threshold k < m/2 required by Theorem 4, and so one cannot obtain an upper bound on χ(T ⊗n ) that way. We leave it as an open question whether the bound in Theorem 1 can be improved using this formulation in terms of linear codes, i.e. whether there exists a linear code L of block length m and dimension k < m/2 such that log 2 (χ(L))/(m − 2k) < log 2 (3)/4. Finally, we have χ(cat 7 ) ≤ χ(cat 8 ) ≤ 6 which follows from cat 2 | 4,5 |cat 4 |cat 6 ∝ |cat 8 .
The left hand side is easily seen to have rank at most 6 since cat 2 is a stabilizer state and χ(cat 4 ) ≤ 2 and χ(cat 6 ) ≤ 3.
Next let us discuss the matching lower bounds for 2 ≤ m ≤ 6. Clearly there is nothing to show for m = 2 since all nonzero states have stabilizer rank at least 1.
To establish the lower bounds for m = 3, . . . , 6 it will be convenient to define the Pauli spectrum [16] of an m-qubit state ψ as the multiset where P m is the set of m-qubit Pauli operators. Note that all elements of the Pauli spectrum of a stabilizer state are either +1, −1, 0. Moreover, we have PS(C|ψ ) = PS(|ψ ) for any Clifford unitary C.
To show that χ(cat 4 ) ≥ χ(cat 3 ) ≥ 2 we need only show that cat 3 is not a stabilizer state. To see this we note that P S(cat 3 ) contains elements which are not +1, −1, 0. In particular, cat 3 |X ⊗ X ⊗ I|cat 3 = 1/2 and therefore |cat 3 is not a stabilizer state.
In the remainder of this appendix we show that χ(cat 6 ) ≥ χ(cat 5 ) ≥ 3. We use the fact that rank-2 states can be brought into a canonical form using the following lemma.
Proof. Let D be a Clifford such that D|φ 1 = |0 n and consider the stabilizer state D|φ 2 .
Let H be the single-qubit Hadamard on qubit j ∈ [n] and write H(y) = n j=1 (H j ) y j for any binary string y ∈ {0, 1} n . Since D|φ 2 is a stabilizer state it can be written in the so-called CH-form [5] as for some r, s ∈ {0, 1} n and some Clifford U c such that U c |0 n = |0 n . We can furthermore find a permutation of the qubits P such that P H(r)|s = |x ⊗ H |r| |y for some binary strings x ∈ {0, 1} n−|r| , y ∈ {0, 1} |r| . First consider the case in which φ 1 |φ 2 = 0. Then x = 0, and one can directly confirm that C = (I ⊗ Z(y))P (U c ) † D satisfies Eq. (20) with a = 0 and b = n − |r|.
Next suppose φ 1 |φ 2 = 0. In this case x = 0 and therefore we may choose the permutation P such that x = 1x for some (n − |r| − 1)-bit string x . We then choose C = (V ⊗ Z(y))P (U c ) † D where V is a product of CN OT gates controlled on the first bit of x such that V |x = |10 n−|r|−1 . Again we can directly confirm Eq. (20) with a = 1 and b = n − |r| − 1. Proof. Below we shall use the following fact about the Pauli spectrum of cat 5 : | cat 5 |P |cat 5 | ∈ {0, 1/4, 1/2, 1} for all P ∈ P 5 Since there are Paulis for which | cat 5 |P |cat 5 | / ∈ {0, ±1}, we know that cat 5 is not a stabilizer state.
Suppose that cat 5 can be expressed as a superposition of two stabilizer states. Then by Lemma 2 there is a Clifford C such that for some a ∈ {0, 1}, 0 ≤ b ≤ n, and γ ∈ C. The fact that |cat 5 is not a stabilizer state implies |γ| > 0. We can furthermore assume that |γ| ≤ 1, if necessary by applying a Clifford that permutes |0 n and |1 a 0 b + n−a−b then dividing by |γ|.
Recall from Eq. (22) that each of these three Pauli expectation values must be in the set {0, 1/4, 1/2, 1}. Solving for r, θ which satisfy this constraint we get For each of the four states Eq. (23) corresponding to the above choices of γ we computed the Pauli spectrum using a computer and compared with that of |cat 5 . We found that these multisets were different (in particular, the number of Paulis with expected value equal to zero is 710 for each of the four candidate states, but 782 for |cat 5 ) and therefore we conclude that cat 5 cannot be written as a superposition of two orthogonal stabilizer states. Next let us suppose (to reach a contradiction) that cat 5 can be written as a superposition of two nonorthogonal stabilizer states, i.e., the case a = 0. Again we may take b = 1, so C|cat 5 = 1 (1 + |γ| 2 + Re(γ)/4) 1/2 (|0 5 + γ|0+ 4 ) .
Thus we have three candidate states Eq. (24) corresponding to the above three choices of γ. For each of them we computed the Pauli spectrum using a computer and compared with that of |cat 5 . We found that these multisets were different and therefore reach a contradiction. We conclude that |cat 5 cannot be written as a superposition of two stabilizer states.