Fast Stabiliser Simulation with Quadratic Form Expansions

This paper builds on the idea of simulating stabiliser circuits through transformations of quadratic form expansions. This is a representation of a quantum state which specifies a formula for the expansion in the standard basis, describing real and imaginary relative phases using a degree-2 polynomial over the integers. We show how, with deft management of the quadratic form expansion representation, we may simulate individual stabiliser operations in $O(n^2)$ time matching the overall complexity of other simulation techniques [arXiv:quant-ph/0406196, arXiv:quant-ph/0504117, arXiv:1808.00128]. Our techniques provide economies of scale in the time to simulate simultaneous measurements of all (or nearly all) qubits in the standard basis. Our techniques also allow single-qubit measurements with deterministic outcomes to be simulated in constant time. We also describe throughout how these bounds may be tightened when the expansion of the state in the standard basis has relatively few terms (has low 'rank'), or can be specified by sparse matrices. Specifically, this allows us to simulate a 'local' stabiliser syndrome measurement in time $O(n)$, for a stabiliser code subject to Pauli noise -- matching what is possible using techniques developed by Gidney [arXiv:2103.02202] without the need to store which operations have thus far been simulated.


Introduction
Quantum computation, in general, is expected to be impossible to efficiently simulate with conventional ('classical') computers. That is: for an idealised quantum circuit of one-and two-qubit gates and single-qubit measurements, it is expected that there is no randomised classical algorithm which can (either exactly or with even a modest margin of error) sample from the output distribution of the measurement outcomes, in time which scales polynomially in both the number of qubits and the number of operations. This, together with quantum algorithms providing speed-ups over the best known classical algorithms [5,6] raises the prospect of significant computational advantage through building quantum computers. However, this also makes it difficult to test prototype quantum computers.
Fortunately, there is an important subclass of quantum circuit which can be efficiently simulated: stabiliser circuits. These are circuits in which the measurements are restricted to projective single-qubit measurements onto the eigenstates of the Pauli matrices, and the unitary operations are restricted to the Clifford group -those unitary operators U , for which U P U † is a 'Pauli operator' (a tensor product of ±I, ±X, ±Y , and ±Z) if P is also a Pauli operator. Such a Pauli operator P expresses a measurable property of an n-qubit state |ψ , e.g., expressing some correlations between hypothetical Pauli measurements on different qubits if |ψ is a +1-eigenstate of P . In this case, U P U † represents a similar measurable property of the state U |ψ . This observation forms the basis of the 'stabiliser formalism' [7], which has proven extremely fruitful for the development of techniques for quantum error correction [8][9][10]. It also lay behind the original proof of the Gottesman-Knill theorem [7], which is that a stabiliser circuit can be simulated by a classical algorithm in polynomial time, when acting on a standard basis state as input (or any other stabiliser state |ψ , that is, a state which can be characterised as a +1-eigenvector of n independent commuting Pauli operators). This involves maintaining a stabiliser tableau: an array which describes a list of Pauli operators S 1 , S 2 , . . . , S n , characterising |ψ as the unique state which is a +1-eigenstate of each S k . Implicit in the original result of Ref. [7] is that, for a stabiliser circuit on n qubits, (a) each single-qubit or two-qubit Clifford gate can be simulated in time O(n) by transformations of individual columns of the tableau, and (b) measurements may be simulated in time O(n 3 ) by reduction to Gaussian elimination. Simulation of stabiliser circuits, with an eye to improving performance, remains an active field of research. Approaches to doing so, apart from the stabiliser formalism, do exist. For instance, Anders and Briegel [2] use the fact that every stabiliser state is equivalent up to a single-qubit Clifford operation to a 'graph state' as the basis for techniques to simulate stabiliser operations in time O(n 2 ). Bravyi et al. [11,Section 4.1] also provide techniques to simulate stabiliser operations, using a format which slightly generalises the stabiliser formalism, requiring O(n 2 ) time to simulate a stabiliser operation in the worst case. Another way in which we may represent stabiliser states (and some other states apart from stabiliser states) is through an for some integer 0 ≤ r ≤ n, a function f : {0, 1} r → {0, 1} n which may be interpreted as a linear or affine transformation mod 2, Q a polynomial of degree at most 2, and C a normalising constant. Extending the terminology of Ref. [12] slightly, we call a form such as in Eqn.
(2) a quadratic form expansion. Examples of these expansions are essentially as old as the stabiliser formalism in the context of error correction [13]; and if generalised to express unitary operators as well as states, have proven useful in the study of circuits of Clifford operations [14][15][16], measurement based quantum computation [12], and equality testing for unitary circuits [17]. In particular, implicit in the observations of van den Nest [15], and the techniques of Amy [17] are polynomial-time algorithms to simulate stabiliser circuits. Despite the availability of other simulation methods, the stabiliser formalism (including minor variations) remains the de facto standard for reasoning about stabiliser circuits. This is partly due to an elaboration described by Aaronson and Gottesman [1], who described improved simulation techniques resulting in an O(n 2 ) bound to simulate measurements. Note, however, that these methods do not track global phase factors, which may be important for applications which leverage the simulation of stabiliser circuits to perform more difficult simulation tasks, as in the work of Bravyi et al [11].
As an arbitrary stabiliser state on n qubits requires at least 1 2 n 2 bits to represent [18,Corollary 21], a careful choice of data structure is needed to simulate arbitrary stabiliser operations in time asymptotically less than O(n 2 ). Failing this, we may consider techniques which permit better performance for certain operations or under certain conditions. For instance, the result of Anders and Briegel [2] is motivated by the ability to represent singlequbit operations and measurements on product states in O(1) time, and representing other operations in time O(d 2 ), for a parameter which may only be bounded by d ∈ O(n) in the worst case but which in some cases may be substantially smaller. Similarly, previous work of ours [19] describes techniques to simulate specialised Clifford circuits for concurrent entanglement swapping in limited quantum architectures -involving only Pauli and controlled-NOT gates, but X-eigenstate preparations and measurementswith a representation of the state as a quadratic form expansion as in Eqn. (2). 1 In that work, the unitary gates could be performed in O(r) time, and measurements in time O(nr 2 ), where r ≤ n governs the sum index as in Eqn. (2). Guan and Regan [20] describe techniques to evaluate outcome probabilities for total measurements following a unitary Clifford circuit, which would be advantageous when either the Clifford circuit contains very few Hadamard gates, or the number M of Clifford gates in the circuit scales is bounded by M ∈ O(n 2/(ω−1) ). 2 However, a more natural specialisation is presented by Gidney [4], motivated by the use case of simulating error correction procedures. There, he describes a refinement of the algorithm by Aaronson and Gottesman [1] to simulate any 'deterministic measurement' (of a multi-qubit observable P , for which |ψ happens to be an eigenvector) in time O(n).
In this article, we describe explicit techniques to simulate stabiliser operations in O(nr) ⊆ O(n 2 ) time, on n-qubit stabiliser states represented by quadratic form expansions, where r (the 'rank' of the quadratic form expansion) again governs the sum index as in Eqn. (2). This meets the same general asymptotic bound as the stabiliser formalism, and may be made more efficient for states that happen to have fewer terms when expanded over the standard basis, as in our earlier work [19]. (Our results mainly concern 'weak' simulation, which is to say sampling from the distribution of measurement outcomes: see the remarks towards the end of Section 2.1.) Furthermore, our techniques allow for significant improvements in run-time, in particular cases where the function f and the quadratic form Q(x) may be represented by sparse data structures. Remarkably, our techniques allow us to compute the outcomes of deterministic single-qubit measurements (in the X-, Y -, or Z-basis) in constant time. Together with partial use of our tighter run-time bounds for sparse data structures, this allows us to simulate deterministic measurements of 'local' multi-qubit Pauli operators 3 in O(n) time, a bound which again may be tightened for particular Pauli observables to be measured or when the function f is represented by a sparse data structure. This matches the performance of techniques presented by Gidney [4] for deterministic Pauli measurements, and suggests that our techniques may be well-suited to simulation of encoded stabiliser circuits.
The structure of the article is as follows. In Section 2 we briefly introduce the stabiliser operations of interest, describe quadratic form expansions, and present the format for quadratic form expansions which is used by our techniques. Section 3 describes data structures and helpful techniques and subroutines to manage quadratic form expansions (with technical details deferred to the Appendices), and the computational model which we assume for bounding the run-time complexity. Section 4 then describes the procedures to simulate stabiliser operations on quadratic form expansions together with their run-time bounds. In particular, Section 4.6 summarises the run-times of our simulation techniques for a number of individual stabiliser operations. Our main results concerning the asymptotic run-time to simulate circuits and other procedures involving multiple stabiliser operations are presented in Section 5. (While our results mainly concern 'weak' simulation, Sections 5.1 and 5.2 do contain some results which bear on 'strong' simulation, i.e., computing the probability of specific outcomes for a given set of measurements.) Finally, we conclude in Section 6 with a discussion of how our results relate to other simulation methods or results connected to path-sums, and the potential applications of our techniques.

Preliminaries
In this Section, we broadly describe the stabiliser circuit model, and introduce quadratic form expansions as a simple approach to simulating them. This will serve to show how quadratic form expansions can be used pedagogically, as an alternative to the stabiliser formalism, to demonstrate the Gottesmann-Knill Theorem. It also serves as the starting point to describe the more involved technical contributions of this article.
We commonly use bold-face letters such as x to represent column vectors, whose transpose x = [ x 1 x 2 · · · ] is a row-vector. In particular, we use e j to denote some vector whose only non-zero entry is a 1 in position j; the length of the vector will be clear from context. We use diag(a) to denote a matrix with coefficients a 1 , a 2 , etc. on the diagonal and 0 elsewhere.

Stabiliser circuits
We consider stabiliser circuits consisting of one-or two-qubit Clifford gates and Pauli measurements. The 'Clifford gates' consist of operations such as These are the eigenstates of the Z, X, and Y operators, and we call them the Z-basis, the X-basis, and the Y -basis, respectively. These states may all be mapped to the standard basis {|0 , |1 } by single-qubit operations, so that single-qubit Z-basis measurements and Clifford gates suffice to simulate all Pauli measurements. However, we will prefer to simulate these measurements without such reduction to Z-basis measurements.
We consider stabiliser circuits where measurements may occur in the middle of the circuit, and where operations that are performed after any measurement may depend on the measurement outcome. The problem in which we are mainly interested is weak simulation [15,22] of such stabiliser circuits. This is the problem of sampling from the output distribution of the measurements from a stabiliser circuit -or, in other words, to simulate the behaviour of a stabiliser circuit with measurements. This is to be set against 'strong simulation', which is the task of evaluating the probability of some of some particular result for a subset of the measurement outcomes. While we are interested in principle in strong simulation as well (and presenting in Sections 5.1 and 5.2 some results concerning strong simulation), our main objective is to describe efficient algorithms to sample from the output distribution of measurements of stabiliser circuits.

Quadratic form expansions
It does not require great mathematical sophistication to use the stabiliser formalism. However, the meaning behind stabiliser tableaus and transformations of them is slightly indirect for people first encountering them. It is also less easy to motivate to those who will not also need them for work in quantum error correction or related topics. As an alternative, we present quadratic form expansions as a means of more directly representing stabiliser states, and simulating the effects of stabiliser operations on them.
We consider normalised state-vectors which can be represented as follows: where C ∈ C * combines a normalising factor and possibly a global phase, r ≥ 0 is an integer, A is an n × r matrix (the 'expansion matrix'), b an n × 1 vector, and Q : Z r → Z is a function which determines either a real or an imaginary relative phase. (For the 'degenerate' case r = 0, the sum is just a single term consisting of |b .) Both A and b have coefficients over {0, 1}, and the expression Ax ⊕ b represents the reduction of Ax + b modulo 2. (Naturally, the expression in the exponent of i may be evaluated modulo 4; our techniques allow us to work gracefully with both forms of modular arithmetic at the same time.) This describes |ψ as a superposition, where the terms of the superposition are either purely real or purely imaginary functions of a vector x which determines the term of the superposition. When the expansion matrix A has rank r as a matrix modulo 2, then this representation serves to describe the correlations (or lack of correlations) in the outcomes of potential single-qubit standard basis measurements. For instance: • The outcomes of a Z-basis measurement on three qubits in the state 1 √ 2 (|000 + |111 ) would yield the same result. This state has a quadratic form expansion with normalisaion C = 2 −1/2 , rank r = 1, a matrix A = [ 1 1 1 ] of shape 3 × 1, and Q(x) = 0, b = 0. This describes how the Z-basis measurement outcomes for all of the qubits are characterised by a single bit x ∈ {0, 1}.
Thus the expansion matrix A describes, usually in a non-unique way, a coherent superposition over different standard basis terms, each obtained from some bit-string x ∈ {0, 1} r . If we were to replace the relative phase i Q(x) by a more general function exp(iπQ(x)) for Q : Z r → R, where Q is a polynomial with real coefficients and degree at most 2, this would be a slight generalisation of what is called a quadratic form expansion [12] for |ψ . 5 In principle, if one does not impose any bound on the length r of the summation index, one may represent an arbitrary quantum state in this way: see Refs. [12,17] for more details. Even if the relative phases are restricted to i Q(x) for an arbitrary function Q : Z r → Z, this representation could be used to express any state generated by a Clifford+T circuit (see for instance the discussion on page 36 in Section 6). Instead, we consider the case where Q : Z r → Z arises from a symmetric integer matrix Q by the equation We then refer to Q as a Gram matrix for Q. Essentially as a corollary of any one of Refs. [12,[14][15][16], in the case that 0 ≤ r ≤ n, where A has rank r as a matrix modulo 2, and where C = 2 −r/2 , the result of the expression Eqn. (5) is a normalised stabiliser state; and conversely, any normalised stabiliser state can be represented under such constraints. We introduce the convention of using a symmetric matrix Q to govern the relative phases, which allows us to represent the imaginary phases from S gates and the sign phases from CZ gates on the same footing. For instance, we may represent the state |ψ 1 = (I ⊗ S) |++ = 1 2 |00 + i |01 + |10 + i |11 ) and the state |ψ 2 = CZ |++ = 1 2 |00 + |01 + |10 − |11 ) both by quadratic form expansions, as In each case, we take C = 1 2 , A = I 2 , and b = 0, and set Q to the appropriate 2 × 2 matrix in the imaginary exponent. Note that, as Q is symmetric, relative phases which involve distinct variables x j and x k must arise from two contributions x j x k + x k x j in Q(x) and thereby represent a phase (−1) x j x k rather than i x j x k .
In the discussion above, the case where the expansion matrix A has rank r is significant: it implies in particular that distinct terms of the quadratic form expansion are orthogonal. In this case, a quadratic form expansion representing a normalised state would satisfy C = ω · 2 −r/2 , for ω ∈ C a phase factor. (We discuss the importance of the rank of A to simulation techniques below.) Our techniques also allow us to constrain the value of this global phase to a power of τ = √ i = exp(iπ/4), for states obtained by simulating stabiliser operations on states for which ω = 1. In order to simplify discussion of simulation techniques for stabiliser circuits, except where otherwise specified, any reference to a "quadratic form expansion" from this point on should be understood to refer to an expression which represents a normalised stabiliser state, for integer matrices A and Q where furthermore Q is symmetric, and where Ax ⊕ b denotes the reduction modulo 2 of the integer vector Ax + b.

On the role of rank in simulation
The number r ≥ 0 of columns required for the expansion matrix is related to the number of bits required to generate a particular term of a quadratic form expansion. To this point, we have not imposed any restrictions on r, though we have alluded to the importance of the case where r is the rank of A considered as a matrix modulo 2.
The importance of this property is as follows. Let null A represent the nullspace of A considered as a matrix modulo 2 (i.e., the set of vectors x for which Ax ≡ 0 mod 2). Then, let rank A = r − dim(null A) be its rank as a matrix modulo 2. For each y ≡ Ax mod 2, we also have y ≡ A(x + δ) mod 2 for each δ ∈ null A. As a result, each term |y in a quadratic form expansion is one out of 2 dim(null A) colinear terms, which in some cases might interfere destructively. If r = rank A, each term |y in a quadratic form expansion is in fact orthogonal to the others, so that there is no destructive interference of outcomes. As a result, to determine whether a measurement outcome on all qubits is possible, it is sufficient to determine whether it is present. However, if r > rank A, then it is in principle necessary to consider how the terms constructively or destructively interfere to determine whether a measurement outcome is possible.
For the above reason, it is practically motivated to consider quadratic form expansions in which r = rank A -and we will often refer to r as the rank of the quadratic form expansion for this reason. 6 This raises the question of how the rank r may change under simulation of different operations.
Note that because the Pauli operations of Eqn. (1) and the Clifford gates CX, CZ, and S of Eqn. (3) are 'monomial' (having at most one non-zero entry per row/column), it is not difficult to show that they may be simulated by a transformation of a quadratic form expansion which does not change the number of columns r of the expansion matrix. In particular, the Pauli operators and the diagonal operations S and CZ may be simulated with no changes to the expansion matrix at all; and CX may be simulated by a simple row-operation on A. To contrast, as the Hadamard gate does not map standard basis states to other standard basis states, simulating it will require changes to the value of r, either to increase or decrease it, to ensure that it corresponds to the rank of the expansion matrix. (Similar remarks apply for X-and Y -basis measurements.) As a result of the fact that H is self-inverse, and the way in which quadratic form expansions emphasise the role of the standard basis, this requires an analysis of when a Hadamard gate leads to destructive interference of terms. This ultimately requires that we solve a system of equations to determine whether such destructive interference takes place. If we do not constrain the value of A in any way, this will in practise require Gaussian elimination, at a cost of O(nr 2 ) ⊆ O(n 3 ). As this does not compare favourably to the O(n 2 ) time to simulate any elementary stabiliser operation using the stabiliser techniques of Aaronson and Gottesman [1], one may consider what techniques would allow one to avoid the worst-case performance of Gaussian elimination.

Our contribution
In this article, we describe techniques to simulate stabiliser operations using quadratic form expansions where r = rank A, in time O(nr) ⊆ O(n 2 ). This is done by maintaining A in a particular form which makes it easy to certify that A has rank r, and in so doing essentially amortise the cost of Gaussian elimination. We then describe techniques to simulate stabiliser operations using quadratic form expansions, more explicitly than has been done in the related literature [12,[14][15][16][17] and with a clear asymptotic analysis. The summary of the asymptotic run-times of each of our subroutines is provided in Section 4.6.
The headline complexity of O(n 2 ) to simulate stabiliser operations with quadratic form expansions, matches that of existing techniques [1,2,11]. This obscures important differences in how efficiently individual operations may be simulated. For instance, stabiliser-based techniques [1,11] can simulate each of the Clifford gates of Eqn. (3) in time O(n), while our techniques variously require O(r 2 ) or O(nr). (When r ∈ O( √ n), at least those operations which may be simulated in time O(r 2 ) may be asymptotically as efficient or more efficient than in the stabiliser formalism; though we do not expect that it will be easy in general to determine that r is bounded in this way for a given stabiliser circuit.) The principal advantage provided by quadratic form expansions is that they lend themselves to sparse matrix representations. Thus, the bounds O(r 2 ) and O(nr) themselves obscure substantially tighter bounds, that hold when the expansion matrix A and the Gram matrix Q are sparse. These tighter bounds allow us to describe procedures to simulate deterministic measurements of local stabiliser operators in time O(n).
We give a careful account of the complexity of each of our subroutines, in terms of the number of non-zero coefficients in the rows and columns of A and Q. This, in combination with the fact that our techniques for maintaining rank involves maintaining at least r columns in A with precisely one non-zero coefficient, may be expected to provide a significant advantage for simulations when the stabiliser circuit in question has enough structure to certify that the states may be represented by such 'sparse' quadratic form expansions.

Managing quadratic form expansions
In this Section, we describe a number of techniques and procedures which will serve to simplify our account of our simulation techniques for quadratic form expansions, and of the run-time analysis for those simulation techniques when the quadratic form expansion has a sparse expansion matrix A and Gram matrix Q. We do so by describing certain constraints on quadratic form expansions, and ways of transforming quadratic form expansions which may be involved in simulating certain stabiliser operations. We then describe a list of helper subroutines which encapsulate these results (and also Lemmas 2 and 3) for use in procedures to simulate stabiliser operations.

Principal row forms
Following the observations of Section 2.3, we wish for Eqn. (8) to represent the decomposition of |ψ without repeating any standard basis terms, and in particular, so that it does not contain terms which cancel. To this end we impose the condition that the expansion matrix A has rank r.
As we simulate operations by modifying the matrix A, we must determine how to maintain the invariant of A having rank r. In particular, we must bound the number of columns of A above by n at the end of each simulated operation. At the same time, we wish to avoid performing Gaussian elimination when the rank of A is in question, with a worst-case performance of O(nr 2 ) ⊆ O(n 3 ). Avoiding this cost is one of the main results of Aaronson and Gottesman [1], who do so by roughly doubling the amount of stored data. For quadratic form expansions, we may instead avoid Gaussian elimination by imposing constraints on the form of A.

Definition 1.
A matrix A with shape n × r is in principal row form if each e k occurs at least once as a row of A; that is, if there is a map p : {1, . . . , r} → {1, . . . , n} such that e p(k) A = e k for each 1 ≤ k ≤ r. We call such a map p, a principal index map for A; a row j = p(k) for 1 ≤ k ≤ r is a principal row of A.
For A in principal row form, each column Ae c of A is the only column which has a nonzero entry in row p(c), by construction. It then follows that the columns are linearly independent, so that A has rank r. The choice of principal row p(c) which stores a given vector e c may be non-unique for a given A; it is enough for our purposes to indicate one such row for each 1 ≤ c ≤ r. The role of the principal index map is important enough that we include it in the data describing a quadratic form expansion, as described in Section 3.2.
In simulating operations on a quadratic form expansion, we must occasionally perform operations to the matrix A, which could take it out of principal row form. We must then do additional work to prevent A from being put out of principal row form, or to put it back into principal row form, by performing suitable column operations and changes to the index map p. In Section 3.6, we describe some procedures to assist with maintaining A in principal row form.

Data structures and programming model
A procedure to simulate a stabiliser operation on an n-qubit state |ψ , which is given as a quadratic form expansion, will in practice mean a procedure which acts on a tuple E = (n, r, g, Q, A, b, p) specifying that quadratic form expansion. These consist of the parameters required to specify a quadratic form expansion as in Eqn. (8), together with a principal index map p for the expansion matrix A.
We are particularly interested in the case of quadratic form expansions, where the matrices A and Q may be sparse. That is, we suppose that there are integers s, t, w ≥ 0 which bound from above, respectively, the number of non-zero coefficients in each row of A, in each column of A, and in each row/column of Q.
The vector b may be stored straightforwardly as an array or buffer of n bits, and the principal index map we suppose to be represented as an integer array. 7 However, motivated by the setting where s, t, w n, our techniques rely on the matrices A and Q being stored using a sparse matrix structure. In particular, for either of these matrices: • We suppose that a record of the number of non-zero entries in any row or column j is maintained, and that it is possible to iterate over the non-zero entries in such a row or column (in order), omitting any zero coefficients in doing so.
• We also suppose that the data structure allows constant-time insertion or deletion of non-zero entries in between two given non-zero entries.
(This could be achieved with an array of pointers to nodes, which themselves form a listlike structure along the rows, columns, and diagonal.) We suppose also that an explicit Figure 1: A schematic representation of some quadratic form expansion, in which the expansion matrix A and Gram matrix Q are stored with sparse data structures, and where A is in principal row form with some index map p. Black dots represent the locations of non-zero entries of Q, A, and b. Each row/column of A and of Q are represented in a way which stores only the non-zero entries of each row and column (and all of the diagonal elements of Q). Some rows of A may be zero, in which case the corresponding qubit is in some fixed standard basis state |b j ; in contrast, each column of A is non-zero. In particular, each column 1 ≤ c ≤ r of A has a designated 'principal row' j = p(c) which stores the row-vector e c , and is therefore non-zero only in column c. (While we depict these rows differently above for the sake of clarity, the principal rows would be stored in the same way as the other rows.) The management of the 'principal index map' p is central to our results. In some cases, the choice of principal row may be arbitrary: in the example above, one could equally well select p(4) to be the final row of A instead of the row indicated. In other cases, there is only one possible choice of principal row for some column c, as when there is only one row which stores e c (as with c = 1 above); sometimes this row is also the only row in which column c itself is non-zero (as with c = 3 above). In the course of simulating operations on a quadratic form expansion, it may be necessary to transform a row which is designated as a principal row j = p(c) for some column c. When this happens, we may attempt to select a new row to act as the principal row for column c; in some cases this may require a transformation A → A through column operations, to produce a row j of A which is equal to e c . entry is stored for each element on the diagonal of Q, whether or not that entry is nonzero, allowing constant-time access to and modification of diagonal entries. A schematic representation of these data structures is shown in Figure 1.
Our results pre-suppose a machine with random access memory (RAM), where memory accesses, comparisons of bit-values, and comparisons of integers can all be performed in constant time. This model accurately reflects the complexity of the comparison and arithmetic operations which are performed on much of the data, which may be represented by integers which are bounded above by a constant. While our model neglects theoretical poly log(n) factors for memory accesses and arithmetic on integers in general, these polylog factors may in practise also be bounded by constants (e.g., when simulating circuits on n < 2 64 qubits), and would in any case be swamped by overheads arising from realising these operations on realistic computer architecture (e.g., cache misses and memory word size).
For the sake of brevity, our procedures do not act on the tuple E = (n, r, g, Q, A, b, p) explicitly. Instead, we suppose that E is taken implictly an argument to any procedure to modify the quadratic form expansion which it represents, and may be modified as a side effect. In particular, our results concern the complexity of simulating operations on a quadratic form expansion in place, which is in effect to say operating on the quadratic form expansion without making a copy of any of its data. (In particular, it should be considered to be passed by reference rather than by value.) As a consequence, the original values stored in any data structure are not available after modification, unless it has been explicitly copied elsewhere.

Mixed modulus arithmetic
To maintain the expansion matrix A in principal row form, it will occasionally be necessary to perform column operations on A, of a sort that correspond to a change in variables of the index x. Such a change of variables will involve a corresponding transformation of the Gram matrix Q -somehow taking into account the fact that while Ax ⊕ b may be evaluated modulo 2, we cannot treat Q as merely being defined modulo 2 (as this would obscure the difference between relative phases of i and −i).
First, we note that while Q cannot be reduced modulo 2, the majority of its coefficients can be reduced modulo 2, precisely because Q is also symmetric: , for a matrix ∆ with only even coefficients and which is zero on its diagonal, then x Q x ≡ x Q x (mod 4); and conversely.
We prove this (easy) result in the Appendices (Lemma 17, p. 41). Then, in an imaginary exponent, we may evaluate the matrix Gram matrix Q modulo 4, but in particular also reduce any off-diagonal coefficient of a Gram matrix in an imaginary exponent mod 2. This fact may be used to reduce the number of non-zero coefficients in sparse matrix representations of Q.
It will also prove useful to occasionally perform a change of variables on the vector x, corresponding to some particular transformation of the expansion matrix A. This is important enough to warrant an explicit result concerning such changes of variables, requiring careful treatment of the mixed-modulus arithmetic: Lemma 3. Let 0 ≤ r ≤ n be integers, A an n × r integer matrix, Q a symmetric r × r integer matrix, b ∈ {0, 1} n , and g ∈ Z. Let E be a unimodular integer matrix, 8 and A ≡ AE (mod 2) and Q ≡ E QE (mod 4). Then where in particular the index of summation on each side of the equation ranges over Proof. This result rests on some number-theoretic results which we defer to the Appendix. Apart from this, we may prove the result as follows. If we let C = E −1 , we then have For y = Cx ∈ Z r , letỹ be its residue modulo 2. While y is only equivalent toỹ modulo 2, this is in fact enough (Lemma 16, p. 41) to show thatỹ Q ỹ ≡ y Q y (mod 4). Furthermore, as E and C are invertible integer matrices, they are invertible modulo 2 as well, so that ỹ ∈ {0, 1} r ∃x ∈ {0, 1} r :ỹ ≡ Cx (mod 2) is equal to the set {0, 1} r itself. Thus, we have the Lemma then holds, by relabeling the index of summation on the right-hand side from y to x.
These results form the basis of simple subroutines, described in Section 3.6 and defined in Appendix B, to allow us to work more easily with the two forms of modular arithmetic involved in quadratic form expansions.

Fixing index bit values
On occasion, we may wish to assign a fixed value to some particular bit of x ∈ {0, 1} r in a quadratic form expansion. For instance, in the case of a Z-basis measurement, we may wish simply to select for only those terms in which a particular value z ∈ {0, 1} for some bit x k is realised. 9 More subtly, when simulating a Hadamard operation, it may become necessary to isolate those terms for which Ax ⊕ b |ψ = 0 by fixing the value of some bit x k . For convenience, we suppose that it is the final bit x r which we wish to fix (to fix any other bit, we may first perform a simple change of variables as described in Lemma 3). Given a state |ψ as in Eqn. (8), consider the (possibly subnormalised) vector |ψ obtained by selecting only those terms such that x r = z for some constant z ∈ {0, 1} : To see how this expression for |ψ may be simplified, we may consider a block structure for Q, If we let Q =Q + 2z diag(q ), let a ∈ {0, 1} n be the final column of A, let A be the matrix consisting of the first r − 1 columns of A (omitting the final column, a), and let where g := g + 2 z u. Note that for any j = p(k) for which 1 ≤ k < r, row j of A is a truncated version of row j of A, omitting only the final (zero) coefficient, so that e p(k) A = e k . Then A is also in principal row form, with index map also given by p (albeit restricted to inputs 0 ≤ k ≤ r−1). Thus, to simulate fixing x r = z, and in particular to reduce the number of columns involved in A, it suffices to compute A , b , Q , and g as above. Note that, due to our convention of treating normalisation as being closely connected to the rank, we do not have a general way of representing or otherwise accounting for the additional factor of 1 √ 2 on the right-hand side of Eqn. (15). Note also that the number r−1 of columns of the matrix A , and thus the dimension of the summation index, may now differ from the integer r stored as the rank. Any simulation technique which relies on the above analysis, must independently account for these details to ensure that the result is a quadratic form expansion describing a normalised state.

Eliminating columns which are entirely zero
In transforming quadratic form expansions, we may temporarily produce an expansion as in Eqn. (12) in which the matrix A is not full rank. When this occurs in our analysisparticularly, in describing the simulation of Hadamard operations and measurements in the X-and Y -eigenbases -one of the columns of A is in fact entirely zero. This affords us the opportunity to reduce the number of columns of A by one, or possibly even by two, as we describe here.
We depart slightly from the usual assumption that the state |ψ is represented as in Eqn. (8). Suppose that A has rank r, but has shape n × (r+1) with some column c which is entirely zero; and that For our simulation techniques, this only occurs when A is almost in principal row form, in that it is equipped with a function which is nearly a principal index map p : We may simplify our analysis by performing a change of variables, by defining A (1) = AE and Q (1) = E QE, for the permutation matrix E = I ⊕ (e c ⊕ e r+1 )(e c ⊕ e r+1 ) which serves to swap columns c and r+1 of A. If we let b (1) = b to keep the superscript indices in lock-step, then by Lemma 3 we have If we define p : {1, . . . , r} → {1, . . . , n} so that p (c) = p(r+1) and p (k) = p(k) for k = c, then this is again almost a principal index map for A (1) , except in that there is no row in A (1) which is equal to e r+1 (or indeed which is even non-zero in column r+1).
Let A (2) be the matrix consisting of the first r columns of A (1) (omitting the final zero column): then A (2) is in principal index form, as it has shape n × r and as e p (c) A (2) = e c for each 1 ≤ c ≤ r. We may isolate the sum over the index x r+1 in Eqn. (17) into a scalar factor, as follows. As Q (1) is symmetric, we can define an r × r symmetric matrix Q (2) , a vector q ∈ Z r , and integer u so that We may then re-express the exponent of the phase in Eqn. (17) as To simplify the parenthesised sum, we must consider two cases: one for u ∈ {1, 3}, and one for u ∈ {0, 2}.
• Suppose u = 2d + 1 for d ∈ {0, 1}. Then for various values of x, the scalar expression in parentheses in Eqn. (20) has the form 1 ± i = √ 2 τ ±1 . Specifically, we have where recall that q x is in principle an integer, but that q x ⊕ d is the reduction of q x + d modulo 2. We may simplify this further by considering how to represent q x ⊕ d, as an expression modulo 4. To do so, we define a binary operation ' * ' for mod 2 matrix multiplication of integer matrices A and B, 10 where A * B is the matrix that results when the coefficients of the matrix product AB are projected to {0, 1}: This allows us to explicitly denote when a matrix multiplication is to be reduced modulo 2, in a context where other arithmetic is not being performed modulo 2. For instance, we have q * x ∈ {0, 1}. We may then expand this into a matrix expression modulo 4, using the fact that y 2 ≡ 0 (mod 4) for y even, and y 2 ≡ 1 (mod 4) for y odd: Then, also using the fact that we have Using this, we then have Then we may rewrite Eqn. (20) to obtain Then for various values of x, the scalar expression in parentheses in Eqn. (20) is either 0 or 2, depending on whether x is orthogonal (mod 2) to q. Because of this, it is of particular interest whether q = 0: we may easily show that this is the case for a simulation of stabiliser operations on normalised states.
-If q = 0 and u = 2, it would follow that in fact |ψ = 0. This vector cannot be represented under the conditions of the expansion matrix A being in principal row form.
-For q = 0 and u = 0, |ψ would instead be super-normalised but otherwise adequately represented by the data Q (2) , A (2) , and b (2) . Note that we could in principle perform further operations to maintain the representation with A in principal row form, though the super-normalisation would not be faithfully represented by the value of the rank r.
As our interest is in the case that the state |ψ represented at the input is normalised, neither of these cases should arise in the course of a stabiliser circuit simulation. 11 We may suppose that a well-defined subroutine simply stops (possibly setting some warning flag) if it discovers that q = 0, which it can test in time O(1) using the sparse data structure for Q.
Given that |ψ is a normalised state, then we have q = 0: using the sparse data structure of Q (1) , we may find an index 1 ≤ ≤ r for which u = Q (1) ,r+1 = 1 in O(1) time. By performing appropriate column operations to the row-vector q , we may reduce it to e ; a further column-swap would allow us to map this row-vector to e r .
That is to say, if we let R = I − e (q − e ) and R = I ⊕ (e ⊕ e r )(e ⊕ e r ) , we have q R R = e r . As both R and R are invertible as integer matrices, so is R := R R . (Note that in fact R = I in the case = r.) Continuing from Eqn. (20), we have where we use Lemma 3 for the last equality. 12 Note that the {0, 1}-matrix A (3) ≡ A (2) R (mod 2) may again not be in principal row form: while the rows p (k) of A (2) will be unaffected by right-multiplication by R for k = , we have e p ( ) A (2) R = e R R ≡ q R , where q may have more than one non-zero coefficient which are merely permuted by the action of R . However, if we define the map is nearly in principal row form (with index map p ), except that it may lack a row with the vector e r . However, we will now see that the r th column of A may be eliminated anyway.
We may simplify the expression in square brackets on the right-hand side, following the analysis of Section 3.4 for expressions of the form of Eqn. (12). In particular, the factor of √ 2 cancels against the factor of 1 √ 2 on the right-hand side of Eqn. (15). Applying the appropriate transformations will yield for suitably defined phase exponent g , Gram matrix Q (4) , and vector b (4) , and where A (4) differs from A (3) by omitting column r. In particular, A (4) has shape n × (r−1), so that the restriction of p to {1, . . . , r−1} is a principal index map for A (4) ; then A := A (4) has rank r − 1.
In each case above, we compute a suitable quadratic form expansion, with a corresponding matrix A in principal row form, and with an accompanying principal index map.

Subroutines
The preceding Sections each motivate simple subroutines, to assist in the transformation and maintenance of quadratic form expansions, in which the matrix A is in principal row form. We summarise these subroutines here, together with their run-times, for use in the simulation procedures for stabiliser circuits. Below, s ≥ 0 is an upper bound on the number of non-zero entries in any row of A; t ≥ 0 is an upper bound on the number of non-zero entries in any column of A; w ≥ 0 is an upper bound on the number of non-zero entries in any row/column of Q; and subscripted versions of these variables (such as s j , t c , etc.) refer to the number of non-zero entries in specific rows or columns of these matrices.

ReindexSwapColumns(k, c)
For integers 1 ≤ c, k ≤ r, update a quadratic form expansion by performing a change of variables corresponding to swapping columns c and k of A. This runs in

ReselectPrincipalRow(j, c)
For integers 1 ≤ c ≤ r and 0 ≤ j ≤ n, attempt to find a row j * = j of A, such that A j * ,c = 0, to serve as a new principal row. If no such row is found, the stop without modifying the quadratic form expansion. If j > 0 is the only row in which column c is non-zero, this halts in time where s j * is the minimum number of non-zero entries in a row j * = j for which A j * ,c = 1.

FixFinalBit(z)
Perform a transformation on the quadratic form expansion, consistent with fixing the value of x r = z, reducing the rank in doing so. This runs in time

ZeroColumnElim(c)
Eliminate (one or two) redundant columns from the matrix A of a quadratic form expansion, given that A has r+1 columns but rank r, and that column c in particular is entirely zero. (This will in general involve other non-trivial changes to A.) This runs in time O(tw + w 2 ) ⊆ O(nr).
Despite the fact that these procedures do not take any explicit arguments which contain the data E = (n, r, g, Q, A, b, p) representing the quadratic form expansion, each procedure described above should be understood to potentially modify some or all of those parameters. The implementations of these subroutines are simple, and are described in pseudocode in Appendix B, along with a run-time analysis for each.

Simulating Clifford operations
In this Section, we describe how to simulate Clifford operations on quadratic form expansions, using the data structures and subroutines described in Section 3. We describe their run-time complexity, in terms which take advantage of bounds on the number of non-zero coefficients in the rows and columns of the Gram matrix Q and the expansion matrix A. We show, independently of any sparsity conditions, that each operation may be simulated in time O(nr): a summary of the run-times of each procedure is provided in Section 4.6. We also show that single-qubit X-, Y -, or Z-basis measurements with deterministic outcomes may be similated in O(1) time.
In the following, we let 0 ≤ s, w ≤ r be (respectively) upper bounds on the number of non-zero entries in any row of A or row/column of Q, 0 ≤ t ≤ n − r + 1 be an upper bound 13 on the number of non-zero entries in any column of A, and subscripted versions of these (such as s j , t k , etc.) represent the number of non-zero entries in particular rows or columns.

Pauli operations
We may easily represent the effect of Pauli operations on quadratic form expansions. To represent an X j operation, we may simply flip the coefficient b j of b, thereby realising X j on each term of the quadratic form expansion. To represent a Z j operation, we use the fact that Z j |z = (−1) z j |z = (−1) e j z . Letting a j = e j A represent the j th row of A, and using the fact that a j x = x diag(a j )x for x ∈ {0, 1} r , we have This allows us to simulate Z j on a quadratic form expansion by for g = g + 4b j and Q = Q + 2 diag(a j ). To represent a Y = iXZ operation, we may simply add 2 to the exponent of τ to incorporate the leading imaginary scalar, and then simulate Z followed by X. We summarise these three transformations by the procedures SimulateX(j) and SimulateZ(j), described in Figure 2. Proof. These follow from the run-time bounds for the procedures SimulateX, SimulateY, and SimulateZ. The run-time bound of O(1) for SimulateX is trivial, and the bound for SimulateY and SimulateZ is governed by the time required to modify the diagonal of Q by adding the non-zero coefficients of row j of A. 13 Because there are r principal rows, each of which is non-zero in exactly one column, each column of A has at least r − 1 zero coefficients in it when r > 0. (For r = 0, there are no non-zero coefficients in A at all.) As a point of interest, we note that as s, w ≤ r, it follows that we may bound st, wt ≤ 1 4 n 2 + 1 2 n + 1 4 .

SimulateX(j)
Simulate the effect of an X j gate.

SimulateZ(j)
Simulate the effect of a Z j gate.

SimulateY(j)
Simulate the effect of a Y j gate.

Hadamard operations
Our techniques to simulate a Hadamard operation on a quadratic form expansion involve modifications to the matrix A. 14 The complexity of these modifications to A depend on its principal row form structure. While the analysis of this operation involves a significant amount of case analysis in principle, we may describe it more simply using the results of Section 3.5.
We represent the Hadamard operator using the commonplace formula for its coefficients, This is itself a essentially a quadratic form expansion, in the terminology of Ref. [12]. The image of the standard basis state |Ax ⊕ b under the operator |z k| j ⊗ I is non-zero, only if k = e j (Ax ⊕ b). If this is the case, the state in the j th tensor factor of |Ax ⊕ b is simply |k ; the effect of the operator |z k| is to replace this with |z . Let K j = I ⊕ e j e j represent the map which acts on vectors and matrices by left-multiplication, to zero out row j. We may then describe the effect of H j on a standard basis state |Ax ⊕ b as where A = K j A and b = K j b. Applying this representation straightforwardly to a quadratic form expansion as in Eqn. (8) yields: We may condense this expression as follows, incorporating the vector x and bit z into a single summation index y = [ x 1 · · · x r z ] . If we extend Q to an (r+1)×(r+1) matrix Q with an additional row and column which is entirely zero, and letã = [ A j,1 · · · A j,r 0 ] be row j of A extended by a further zero coefficient, we have x Qx + 2(e j Ax)z + 2b j z = y Q y + y r+1ã y + y ã y r+1 + 2b j y r+1 = y Q y + y (e r+1ã +ã e r+1 )y + 2b j y r+1 = y Q y , (37) where we define Q := Q + e r+1ã +ã e r+1 + 2b j e r+1 e r+1 . Then, if we let A = A ; e j be the matrix obtained by adjoining the vector e j as an additional (r+1) st column to the matrix A , we have Thus we may simulate the Hadamard by expanding the Gram matrix to an (r+1) × (r+1) matrix with a new row and column computed from b and from row j of A; then zeroing out that row of A; then extending A by a new column consisting of the vector e j . If we were to impose no constraints on the number of columns of the expansion matrix A , this would suffice to produce a representation of H j |ψ . However, it may be that A does not have rank r+1, so that it is not in principal row form as we require for our analysis of measurements. We consider the following cases.
• If j is not a principal row of A, then the matrix A differs from A only in row j, and not in any principal row of A. Extending this matrix to A only adds a further 0 coefficient to each of the principal rows, and sets row j to e r+1 . Then A actually is in principal row form in this case, and we may extend p to obtain a principal index map for A by setting p(r + 1) = j.
• If j = p(c) for some column 1 ≤ c ≤ r, we may reduce our analysis to the case where j is not a principal row -provided that we can choose an alternative row to act as a principal row for column c. We may attempt to do this by invoking ReselectPrincipalRow(j, c). If afterwards p(c) = j, then A has been transformed so that j is not a principal row, and may proceed as above.
If instead ReselectPrincipalRow(j, c) does not change the value of p(c), it must be the case that j is the only row in which column c is non-zero. Then row j of A would be e r+1 rather than e c , and column c would be entirely zero. If we set p (r+1) = j, then A is 'nearly' in principal row form, precisely in the sense considered in Section 3.5: we have e p (k) A = e k for all 1 ≤ k ≤ r+1 so long as k = c, and column c of A is zero. Because Eqn. (38) expresses a normalised state, we also know that the procedure described on page 15 will be able to find the nonzero coefficient in the Gram matrix, if this is needed to produce an expansion matrix in principal row form. 15 Therefore, to update the quadratic form expansion for the operation |ψ = H j |ψ as in Eqn. (38), it then suffices to invoke ZeroColumnElim(c) in this case to obtain an equivalent representation in which the corresponding matrix A has fewer columns, and in particular is full rank. Figure 3 presents a procedure SimulateH, which summarises the remarks above. Steps 1 and 2 serve to check whether j is a principal row, and to attempt to choose an alternative 15 In fact, we can identify the location of one such coefficient analytically. Using the definitions following shortly after Eqn. (35), and recalling in this case that j is the principal row for column c, we have Q er+1 =ã + 2bjer+1 = ec + 2bjer+1. Then Q c,r+1 = Q r+1,c = 1; if we were to swap row c and row r+1 (and similarly for the columns) of Q , we would obtain a matrix Q such that Q c,r+1 = Q r+1,c = 1.

SimulateH(j)
Simulate the effect of an H j gate.
1. If j is a principal row of A, let 1 ≤ c ≤ r be such that j = p(c); otherwise let c = 0.

Modify
A by updating A j,k ← 0 for each 1 ≤ k ≤ r; then extend A by adjoining the column vector e j and update p(r+1) ← j.

Modify
Q by extending by one row and one column, where the extra row is set toã and the extra column is set toã; then update Q r+1,r+1 ← 2b j .
7. If c > 0, call ZeroColumnElim(c); otherwise update r ← r + 1. principal row if necessary; Steps 3-6 represent the transformations described in the analysis above for when either j is not a principal row, or when Step 2 succeeds in choosing a new principal row in place of row j. In the latter case, Step 7 increases the rank parameter r; otherwise, if j was a principal row which could not be re-selected, we invoke ZeroColumnElim(c) as described. Step 4 involves adding a single non-zero entry to a new column of A and assigning p(r+1) ← j: using the sparse data structure for A, this takes time O(1).
Step 5 extends Q by a row and a column, each containing at most s j + 1 non-zero entries, requiring only time O(s j ) using the sparse data structure for Q.
Step 6 takes O(1) operations, and finally Step 7 calls ZeroColumnElim at most once, which requires O(tw + w 2 ) operations. The time complexity is then dominated by Steps 2 and 7, which is therefore O(st + sw + tw + w 2 ); as these run-time costs are only incurred when j is a principal row of A, the time complexity is O(s j ) when j is not a principal row of A.

Diagonal Clifford operations
Without much difficulty, we can show that the operators S and CZ may be simulated on a quadratic form expansion as in Eqn. (8) just by modification of the scalar g and the Gram matrix Q.
We first consider the CZ operation. On an individual standard basis term |z , we have CZ j,k |z = (−1) z j z k |z = (−1) (e j z)(e k z) |z . Let a j = e j A and a k = e k A be the j th and k th rows of A. Then, on a standard basis state |Ax ⊕ b , we obtain: Using the fact that u v = v u for vectors u, v ∈ {0, 1} r , we have 2(a j x)(a k x) = x (a j a k + a k a j )x ; and using the fact that a j x = x diag(a j )x and a k x = x diag(a k )x for x ∈ {0, 1} r , we then have where g := g + 4b j b k and Q := Q + a j a k + a k a j + 2 diag(b k a j + b j a k ) . That is, we can simulate CZ by adding terms to the Gram matrix which may be easily computed from rows of A and coefficients of b; in particular, given that Q is symmetric, Q is as well.
To simulate the effect of S = |0 0| + i |1 1| on a basis state |Ax ⊕ b , the outcome of a matrix multiplication which is evaluated modulo 2 in a ket may come to affect an expression evaluated modulo 4 in an imaginary exponent. To do this, we use the same operation * which we defined in Eqn. (22), apply the formula of Eqn. (23) to express it as a matrix operation modulo 4, and also use the formula of Eqn. (24) for parities of bits. We may then analyse the S gate similarly to the CZ gate. On an individual standard basis term |z , we have S j |z = i z j |z = i (e j z) |z : then, again letting a j = e j A, we have We may then describe the representation of S j on a quadratic form expansion, by where g := g + 2b j and Q := Q + (1 − 2b j )a j a j . We then simulate S j by again adding symmetric terms to the Gram matrix which may be easily computed from rows of A and coefficients of b.

SimulateS(j)
Simulate the effect of an S j gate.

SimulateCZ(j, k)
Simulate the effect of a CZ j,k gate. Proof. In SimulateCZ(j, k), the vectors a j and a k can simply be read from rows j and k of A in Step 1 in time O(s j + s k ). Note that a j a k and a k a j are each non-zero in precisely s j s k entries.
Step 2 then modifies the Gram matrix Q in time O(s j s k ). In Step 3, we call ReduceGramRowCol(h) for those indices 0 ≤ k ≤ r corresponding to non-zero entries either of a j or a k . This has complexity O(w h ) for a maximum of s j + s k values of k, taking O(s j w + s k w) time in total.
Step 4 modifies the value of g, in constant time; the total complexity of O(s j s k + s j w + s k w) then follows. 16

Controlled-NOT operations
The way in which we simulate CX operations, with control qubit h and target qubit j for h = j, depends on whether or not j corresponds to a principal row of A. In the case that it does not, we may simulate it in a straightforward way, by noting that on standard basis states we have where E j,h = I + e j e h acts on vectors via left-multiplication by adding row h into row j (and similarly for matrices). Thus we have where A = E j,h A is obtained from A by adding row h into row j modulo 2, and b 16 The run-times of SimulateS(j) and of SimulateCZ(j, k) can be easily be sharpened to O(s 2 j ) and O(sjs k ) respectively, by modifying them to only reduce the entries where Q and Q differ. We use the definitions presented in Figure 4 to simplify the overall presentation of our results.

SimulateCX(h, j)
Simulate the effect of a CX h,j gate. If j = p(c) for some column c of A, then A will not be in principal row form unless row h of A happened to be entirely zero. However, as E h,j is invertible and A is full rank, then A will also be full rank; and it will only fail to be in principal row form in that there is no row which contains e c . Precisely because A is full rank, column c of A will not be entirely zero -in particular, either A j,c = 1 (as in the case that A h,c = 0), or row h is itself not a principal row and A h,c = 1. In either case, there is at least one row 1 ≤ j * ≤ n, for which row j * of A could be made into a principal row for column c (yielding a matrix A in principal row form) by suitable column operations, without disturbing the other principal rows. We may find such a row j * , and perform the appropriate change of variables, by simply invoking the subroutine ReselectPrincipalRow(0, c) -in particular, setting the first argument to 0 to allow the possibility of selecting j * = j if this is the only row (or the row with the smallest number of non-zero entries) such that A j * ,c = 1. Figure 5 presents a procedure SimulateCX(h, j) which summarises the above analysis. Using the run-time bound for ReselectPrincipalRow described in in Lemma 21, we may easily show the following: Proof. We may perform Step 1 in time O(1). Step 2 involves iterating over the s h indices 0 ≤ k ≤ r for which A h,k = 0, performing O(1) operations on each iteration; and Step 3 can also be performed in time O(1). In Step 4, which is only invoked if j was a principal row of the expansion matrix A at the input, we invoke ReselectPrincipalRow(0, c), which runs in time O((s h + 1)t + (s h + 1)w), using the fact that the number of non-zero entries in row j at this step is at most s h + 1.

Pauli basis measurements
As quadratic form expansions emphasise the decomposition of a state in the standard basis, this leads to simpler procedures to simulate Z-basis measurements compared to X-or Ybasis measurements: Z-basis measurements may only decrease the number of columns of A, whereas X-and Y -basis measurements have an analysis which is closer to that of the Hadamard operation. However, these operations all can be done particularly quickly when the qubit being measured is disentangled from the other qubits -for instance, in those cases where the measurement outcome is in principle deterministic.
We note that, in principle, one may simulate an X-or a Y-basis measurement by performing a suitable single-qubit unitary U (respectively: U = H or U = S † H), followed by a Z-measurement, followed by U † . If our aim were only to demonstrate that these operations could be simulated in O(n 2 ) time, this would suffice. However, one might wish to simulate these measurement operations without necessarily reducing them to Z-basis measurements, if avoiding that reduction could provide a savings in operations. We find that this is possible (see Lemma 10 below), using an analysis which builds on the similarity between simulating these measurements and simulating the Hadamard transformation. For this reason, we treat each of the Pauli measurements on an equal footing.

Simulating measurements with deterministic outcomes
As we maintain the expansion matrix A in principal row form, none of the terms in Eqn. (8) may cancel. Then, a Z-basis measurement on qubit j is deterministic if qubit j is in the some particular state |β (for a bit β ∈ {0, 1}) in every term of the quadratic form expansion -and in particular, does not depend on any bit of the summation index x. This occurs if and only if row j of A is entirely zero, which can be determined in time O (1). If this is the case, the outcome will be β = b j , which again can be computed in time O(1).
Remarkably, we may show that X-and Y -basis measurements with deterministic outcomes can also be simulated in time O(1). Consider a state |ψ , in which either an X-basis measurement or a Y -basis measurement on some qubit j would have a determinstic outcome. Such measurements cannot be deterministic if qubit j is in a Z-eigenbasis state, so they can be deterministic only if row j of A has a non-zero entry. Furthermore, the outcome of such a measurement can only be deterministic if there is no entanglement between qubit j and any other qubits. In particular: • There can be no correlations between the outcomes of Z-basis measurements on qubit j and any other qubits. This implies that it must be possible to reindex the sum by column operations on A, so that e j is a column of A: that is, there must be a solution to the set of equations e j = Av for some v ∈ {0, 1} r . However, for A in principal row form, row p(k) of Av will be non-zero for any k such that v k = 1. It follows that e j = Av is only solvable if v has exactly one non-zero entry: that is, if e j is itself a column of A. Furthermore, as only row j of A is non-zero in column c, this implies that row j is a principal row with j = p(c).
We may determine whether this is the case in constant time, by determining whether row j of A has exactly one non-zero entry, finding the column c in which that nonzero entry occurs, and then checking whether column c has exactly one non-zero entry.
• Given that the above holds, the state of qubit j in each term of the quadratic form expansion is |x c for some 1 ≤ c ≤ r, and only other variables x k for k = c are involved in the state of the other qubits. For qubit j to be unentangled from the others, the relative phases of the terms in the expansion must be a product of the relative phases for the state of qubit j, and the relative phases for the state of the other qubits, with no contribution from cross-terms x c x k . By Lemma 2, this is equivalent the off-diagonal coefficients in row/column c of Q all being even.
As our subroutines all evaluate the off-diagonal terms modulo 2, it suffices to check whether all coefficients in row/column c of Q are zero, except possibly Q c,c . We may determine this in constant time by checking whether row c of Q has at most one non-zero coefficient, and (if one such coefficient exists) whether that coefficient is on the diagonal.

SimulateMeasZ(j)
Simulate a Z-basis measurement on qubit j, storing the result in a bit β produced as output.
1. If row j of A is zero, set β ← b j and stop; otherwise set β to a uniformly random bit-value and proceed.
2. Find 1 ≤ k ≤ r such that A j,k = 0, minimising the number of non-zero entries in column k of A.
3. Call ReindexSwapColumns(k, r). MakePrincipal(r, j). Together, these conditions suffice to show that qubit j is in one of the states |+ , |+i , |-, or |-i . Which of these states it is in, is determined by whether (−1) b j Q c,c = 0, 1, 2, or 3 respectively (corresponding to a relative phase of +1 = i 0 , or i = i 1 , or −1 = i 2 , or −i = i 3 ); this may also be determined in time O(1). By construction, much of this analysis generalises to the case where the qubit to be measured is in a single-qubit stabiliser state, unentangled with any others, even if the measurement outcome is not deterministic: it will be in an eigenstate of some Pauli operator {X j , Y j , Z j }. Whether this is the case may be determined in time O(1); a random outcome and an update to the quadratic form expansion can then be computed in constant time as well.

Simulating Z -basis measurements
To describe the effect of a measurement with a random outcome β, we perform a change of variables so that row j is a principal row -and so that in particular, the value of qubit j in any term in the superposition depends only on the bit x r . (We may attempt to do so in a way to minimise the amount of work required.) We then fix the value of that particular bit, using the subroutine FixFinalBit(β ⊕ b j ) from Section 3.4 -where we take the argument β ⊕ b j to over-write the value of b j with β. Figure 6 presents an algorithm SimulateMeasZ(j), which supplements the analysis of deterministic Z-basis measurements with the operations described above. Proof.
Step 1 takes O(1) operations: this is all that is required in the deterministic case.
Step 2 iterates over the non-zero entries of row j of A, performing simple integer comparisons and assignments in each iteration, taking O(s j ) operations in total. Steps 3, 4, and 5 invoke the subroutines ReindexSwapColumns, MakePrincipal, and FixFinalBit, using O(t + w), O(s j t + s j w) and O(t + w) operations respectively. The total complexity is then dominated by Step 4, which has run-time O(s j t + s j w).

X -and Y -basis measurements with random outcomes
For the sake of convenience, we define |x β = 1 √ 2 |0 + i 2β |1 and |y β = 1 √ 2 |0 + i 2β+1 |1 as notation for the eigenstates of X and Y respectively, where β ∈ {0, 1}. To simulate an X-or Y -basis measurement on qubit j which has a random outcome, we may generate a measurement outcome β ∈ {0, 1} uniformly at random, and then determine the effect of applying the projectors Let |ψ (X) β = √ 2 |x β x w | j ⊗ I |ψ denote the state after applying an X-basis measurement on qubit j of |ψ , given that the outcome β ∈ {0, 1} is not deterministic (where the factor of √ 2 is to renormalise the state); similarly, let |ψ (Y ) β = √ 2 |y β y β | j ⊗ I |ψ denote the state after applying an Y -basis measurement on qubit j of |ψ , given that the outcome β ∈ {0, 1} is not deterministic. Following a similar analysis as for the Hadamard: if we let K j = I ⊕ e j e j , and define A = K j A and b = K j b, we may show using the operation * as defined in Eqn. (22), and applying the formula u ⊕ v = u + v − 2uv to coefficient j of (A * x ⊕ b) in the imaginary exponent. Let a = e j A: we may then simplify the imaginary exponents in Eqns. (46), and in particular factor out constant terms which only contribute global phase factors, to obtain We may again condense the indices of summation by defining y = [ x 1 · · · x r z ] , and letting A = A ; e j be the matrix obtained by adjoining the vector e j as an additional (r+1) st column to the matrix A . Letã = [ A j,1 · · · A j,r 0 ] be an extension of a by a further zero coefficient, and let Q be an (r+1) × (r+1) matrix which extends Q with an additional row and column which is entirely zero. We then define Gram matrices so that we may express the imaginary exponents in Eqns. (47) as x Qx + (2β +1)z+(2β +1)(2b j −1)(a * x) Then, if we let g (X) := g − 4βb j and g (Y ) := g − (4β + 2)b j , we may re-express Eqns. (47) as (50) As with simulating the Hadamard, A may not be in principal row form. Furthermore, as A is constructed in the same way as in the analysis of the Hadamard operation, we may treat this possibility in exactly the same way: • If j is not a principal row of A, we may apply the above analysis without modifications, and extend p to obtain a principal index map for A by setting p(r+1) = j.
• If j = p(c) for some column 1 ≤ c ≤ r, we may attempt to reduce to the case where j is not a principal row, by invoking ReselectPrincipalRow(j, c). If afterwards p(c) = j, then the expansion A has been transformed so that j is not a principal row, and may proceed as above. Otherwise, if ReselectPrincipalRow(j, c) does not change the value of p(c), it must be the case that j is the only row in which some column c of A is non-zero, row j of A would be e r+1 rather than e c , and column c would be entirely zero. We may then setting p (r + 1) = j, and invoke ZeroColumnElim(c) to transform the quadratic form expansion into a form where the corresponding matrix A is in principal row form. Figure 7 presents algorithms SimulateMeasY(j) and SimulateMeasX(j), which supplement the analysis of deterministic measurements in Section 4.5.1 with the operations described in the analysis above. The treatment of deterministic measurements is in each case captured by the first three steps, which determines whether j is a principal row j = p(c), attempts to reselect it if so (thereby testing whether the column c has more than one non-zero entry), and then tests whether the conditions for qubit j to be unentangled from the others. Apart from Step 3, and the modifications made to the Gram matrix Q, both procedures are essentially identical to each other and to SimulateH(j) as presented in Figure 3. Simulate an X-basis measurement on qubit j, storing the result in a bit β produced as output.
1. If j is a principal row of A, let 1 ≤ c ≤ r be such that j = p(c); otherwise let c = 0.

Modify
A by updating A j,k ← 0 for each 1 ≤ k ≤ r; then extend A by adjoining the column vector e j and update p(r+1) ← j.
6. Modify Q by extending by one row and one column, initially set to zero.

SimulateMeasY(j)
Simulate a Y -basis measurement on qubit j, storing the result in a bit β produced as output.
1. If j is a principal row of A, let 1 ≤ c ≤ r be such that j = p(c); otherwise let c = 0.

Modify
A by updating A j,k ← 0 for each 1 ≤ k ≤ r; then extend A by adjoining the column vector e j and update p(r+1) ← j.
6. Modify Q by extending by one row and one column, initially set to zero.
10. If c > 0, call ZeroColumnElim(c); otherwise update r ← r + 1.  • SimulateMeasX(j) checks whether the outcome is deterministic, which occurs when Q c,c = 2u for u ∈ {0, 1}. If so, it sets β = u as the measurement outcome; otherwise it selects a random measurement outcome, and updates Q c,c ← 2β to represent the corresponding post-measurement state.
• SimulateMeasY(j) checks whether the outcome is deterministic, which occurs when Q c,c = 2u + 1 for u ∈ {0, 1}. If so, it sets β = u as the measurement outcome; otherwise it selects a random measurement outcome, and updates Q c,c ← 2β + 1 to represent the corresponding post-measurement state.
In each case above, the procedures then stop, having taken O(1) time. If instead Q does have non-zero off-diagonal elements, both procedures instead select a random measurement outcome β, and the procedure continues.
In Step 4, we initialise a vectorã from row j of A, taking time O(s j ); Step 5 then modifies row j of A by setting all of its entries to 0, except for an entry 1 in a new (r + 1) st column, which may be done at the same time as Step 4. In Step 6, both procedures extend Q by an additional row and column of zeroes, which may in principle be done without any modification of the sparse data structure storing Q. Then Steps 7 and 8 realise different updates to the Gram matrix according to the measurement type: • SimulateMeasX(j) adds up to s j + 1 entries to the diagonal, and then reduces them modulo 4, taking time O(s j ); • SimulateMeasX(j) adds two terms to Q, together having precisely s 2 j + 1 non-zero entries. It then invokes ReduceGramRowCol(k) on all rows 1 ≤ k ≤ r in which those terms have non-zero entries, taking time O(s j w).
After this, Step 9 takes time O(1). If row j was not initially a principal row of A, or if Step 2 succeeded in selecting a new principal row, then Step 10 simply updates r in time O(1); otherwise, column c is now zero and must be removed using ZeroColumnElim(c), taking time O(tw + w 2 ).
In the above, if j was not a principal column, the two procedures SimulateMeasX(j) and SimulateMeasX(j) respectively take time O(s j + 1) and O(s 2 j + s j w + 1). Note that, if row j of the initial expansion matrix is entirely zero, this is O(1) in both cases, matching the run-time for the case that qubit j is initially in an X-or Y -basis state. If instead j is a principal row of the original expansion matrix A, and qubit j is entangled with some other qubits in |ψ , the run-times of these subroutines takes an additional amount of time which is domiated by Step 10, with respective totals of O(s j + tw + w 2 ) and O(s 2 j + s j w + tw + w 2 ).

Summary of simulation complexities of stabiliser operations
Ignoring refinements which are possible when the expansion matrix A or Gram matrix Q are sparse, we may summarise the run-time bounds for the subroutines presented above as follows: In each case, r is bounded above by n as a result of maintaining the principal row form. If we abandon this requirement, for example to allow controlled-NOT gates to be universally simulatable in time Θ(r), then we must do more work when performing a Hadamard or X-or Y -basis measurement (potentially up to O(n 3 ) to perform Gaussian elimination).

Simulation complexity of composite procedures
We now prove a number of results regarding the complexity of simulating procedures consisting of multiple stabiliser operations.

Simulating stabiliser circuits in general
The above Lemmata allow us to show the following result, which is the most general summary regarding the effectiveness of our techniques for weak simulation of stabiliser circuits: This asymptotic result in principle matches the worst-case performance of each of the stabiliser formalism [1], graph-state-based representations [2], and the phase-sensitive Clifford simulator of Bravyi et al [11]. While each of those other techniques have faster asymptotic performance for certain operations, the worst-case performance described in Theorem 11 also obscures faster performance of different operations under various conditions of sparsity.

Theorem 11. A stabiliser circuit consisting of M stabiliser operations from the set of operations described in Section 2.1, whose initial state is expressed as a quadratic form expansion with an expansion matrix
In common with any weak simulation technique, the result above may easily be 'upgraded' in certain cases to a result for strong simulation (in which we are interested in the probability of obtaining certain outcomes for a subset of the measurement operations). Specifically: suppose that we are interested in the probability that some k of the measurements yield an outcome β ∈ {0, 1} k . Let us say that a measurement on a qubit j is 'of interest' if it is one of the measurements whose outcomes we consider. We may recursively define the 'history' of such a measurement -representing a sort of 'causal past' of the measurement -as: (i ) any operation which is performed on the measured qubit j, prior to that measurement of interest, or (ii ) an operation (possibly classically controlled) which acts on some other qubit j , before an operation which also acts on j and is also in the history of the measurement of interest on j; (iii ) a measurement operation whose outcome is used as a classical control, for an operation in the history of the measurement of interest on j.
If every measurement in the history of some measurement of interest, is itself a measurement of interest, we may compute the probability of the outcome β by sequentially computing the probability that each measurement of interest produces a particular outcome. Without imposing this constraint, the strong simulation problem may in some cases be tractible, but in general is as difficult as strong simulation of a Hadamard+Toffoli circuit, which is #P-hard [22, Theorem 2].

Improved simulation of multiple 'terminal' measurements
Quadratic form expansions allow for more efficient techniques to strongly simulate 'terminal' measurements: that is, measurements on distinct qubits, which are the last measurements of interest to be performed on a state (so that the post-measurement state is not needed to compute any other outcomes). For the sake of simplicity, we may consider the case where all of the measurements to be performed are Z-basis measurements, by reducing X-basis and Y -basis measurements to Z-basis measurements preceded (and followed) by appropriate single-qubit unitaries. We may consider improvements in the run-time for strong simulation of the measurement stage itself, by reduction to solving a system of equations involving the expansion matrix A (or a sub-matrix of A) for the state just prior to measurement. Eqn. (8), on which we perform 0 ≤ k ≤ n measurements in the Z-basis on distinct qubits. We may determine the probability of the outcomes of those measurements yielding a particular string β ∈ {0, 1} k in time O(k 2 n).

Theorem 12. Let |ψ be represented by a quadratic form expansion as in
Proof. Consider the sub-matrix A of A, obtained by restricting to the rows of A corresponding to the qubits whose outcomes we are interested in. This gives rise to a system of k equations in r unknowns A x = (β ⊕ b), which may be solved by Gaussian elimination in time O(k 2 r) ⊆ O(k 2 n). If this system of equations is unsatisfiable, then β is not a possible measurement outcome. Otherwise, the probability with which β occurs is given by 2 −r , where r ≥ 0 is the rank of A (i.e., the number of bits required to determine the outcome β among the possible outcomes on the measured qubits).
We would often expect better performance than O(k 2 n) for strong simulation of the measurements in practise. An elementary observation is that Gaussian elimination in fact takes time O(k 2 r), for the rank parameter r which is itself merely bounded by n; and in fact this may be easily sharpened to O(κkr), where κ = min {k, r}, as κ is a bound on the rank of A . Furthermore, in the case k ≈ n, we would expect the problem to become simpler due to the involvement of principal rows corresponding to some of the measured qubits. The k qubits whose measurement outcomes we are interested in, will include some number 0 ≤ k p ≤ k of qubits corresponding to principal rows of the expansion matrix A. The presence of such rows may be used to speed up the computation of a reduced row-echelon form for A , as the corresponding rows of A have only one non-zero entry to account for in row reduction. 17 The complexity in this case would then be O(k p k +κkr), wherek = k − k p ,r = r − k p , andκ = min {k,r}. When k p = k (i.e., all of the measured qubits correspond to principal rows of A) or when k p = r ≤ k (i.e., when all of the principal rows of A correspond to measured qubits), this leads toκ = 0, providing a potentially significant drop in run-time complexity. In particular: Theorem 13. Let |ψ be represented by a quadratic form expansion as in Eqn. (8), on which we measure all but 0 ≤ ≤ n of the qubits in the Z-basis. We may determine the probability of the outcomes of those measurements yielding a particular string β ∈ {0, 1} n− in time O(n 2 + 2 n).
Proof. Following the analysis above, take k = n− and k p ≥ r− , so thatk = k−k p ≤ n−r, andr = r − k p ≤ . Thenκ ≤ as well. By definition, we have k p k ≤ (n − )r = nr − r; then strong simulation of n − qubits can be done in time O(k p k +κkr) = O(nr − r + 2 n − 2 r) ⊆ O(n 2 + 2 n).
Thus, for circuits with a final round of k parallel Z-basis measurements where k ∈ O(1) or k = n − O( √ n), we may perform a strong simulation of these measurements with an amortised cost of O(n) per qubit, whether or not the outcome is deterministic. 18 In this spirit, we add a further observation on performing a round of near-total measurement, for weak simulation: Proof. For each of the k p columns c, for which the principal row j = p(c) corresponds to a measured qubit, we swap column c with one of the last k p columns in order, using ReindexSwapColumns. Doing this for k p such columns will take time O(k p n) in total.
Having done so, we may simulate the measurements on the corresponding principal rows by selecting measurement outcomes at random and performing FixFinalBit a total of k p times, with run-time cost O(k p n) again. This leaves us with a quadratic form expansion withr = r −k p columns: we may simulate the finalk = k −k p measurements by repeatedly calling SimulateMeasZ, taking time O(kr n). The total run-time of this procedure is then O(k p n +kr n). If k = n − and k p ≥ r − , we then havek ≤ n − r andr ≤ , so that k p n ≤ n 2 andkr n ≤ n 2 .
Thus, for ∈ O(1), we have an amortised cost of O(n) for weak simulation of n − parallel Z-basis measurements, regardless of whether the outcomes are deterministic; more generally, for any 0 < n, we have an amortised cost of O( n) for parallel Z-basis measurements.

Simulation of stabiliser measurements
A particularly useful feature of our techniques is that any single-qubit measurement whose outcome is in principle deterministic, can be simulated in constant time. This is potentially significant for an important use-case of stabiliser circuit simulation: prototyping and simulating error correction procedures under the influence of a Pauli noise model.
The most prominent error correction procedures are stabiliser codes, whose syndrome measurement procedures correspond to measuring multi-qubit Pauli operators as observables, e.g., measuring Z⊗Z⊗Z⊗Z as an observable. In practice, such observable measurements would be performed indirectly, by interacting a 'syndrome qubit' with some k > 1 'code' qubits using Clifford operations, and then measuring the data qubit. These syndrome qubits are initially prepared independently of the other qubits in the computation. Using a quadratic form expansion representation, these syndrome qubits would therefore correspond to rows and columns of A and Q which have a constant number of non-zero elements. By taking account of this fact, we may show that the syndrome measurements could be simulated in time O(kn) on an arbitrary state of the rest of the system (regardless of whether the other rows and columns of A and Q are sparse): Theorem 15. Let |ψ be represented by a quadratic form expansion as in Eqn. (8), which contains at least one 'syndrome' qubit, disentangled from the others, which is set aside for the purpose of facilitating multi-qubit measurements. Suppose that |ψ is a ±1-eigenvector of some multi-qubit Pauli operator P acting on 1 ≤ k ≤ n qubits. Then the outcome of a P -measurement on |ψ can be computed in time O(kn).
We prove this below. In fact, we will demonstrate something stronger: (i ) that a Pmeasurement may be simulated in O(kn) time, but also possibly in O(kr) time or O(k) time in situations which would be easy to identify; and (ii ) that this remains true for simulations of certain fault-tolerant procedures to realise a P -measurement. In a stabiliser code such as surface or colour codes [9,24,25] -in which the stabilisers to be measured are 'local', in the sense that k is bounded by some constant -the above run-time bounds may be tightened further to O(n), O(r), and O(1). Figure 8 demonstrates some typical presentations of syndrome measurement procedures, for a Pauli observable X⊗Y ⊗Z⊗X selected as an example. This measurement is simulated using controlled-X, controlled-Y , and controlled-Z operations to realise phasekicks onto one or more syndrome qubits. (We may decompose a controlled-Y operation as CZ for the purposes of simulation.) These operations involve the syndrome qubits as controls: simulating these operations therefore do not affect the number of non-zero elements in the rows of A which correspond to these syndrome qubits, which thus remain O(1) throughout. Using the analysis of the simulation runtimes of these operations in terms of sparsity parameters, we may then bound the time to simulate an operation CZ a,j by O(s j +w) ⊆ O(r), and to simulate an operation CX a,j by O(t) ⊆ O(n). (These operations may in fact be simulatable in time O(1), depending on whether row j of A happens to be a principal row in the case of CZ a,j operations, or whether row j happens not to be a principal row in the case of CX a,j .) Because S a can be simulated in time O(1) in this case, CY a,j operations can then be simulated in time O(s j + t) ⊆ O(n). Furthermore, the measurements on these qubits each have either . This procedure may be elaborated with a flag qubit [26] to make it fault-tolerant, adding only a constant amount of work for the simulation.
(b) A procedure to measure the syndrome bit using Shor-style syndrome extraction [27,28]. Multiple syndrome qubits are initially prepared in a 'cat' state, involving a single principle row. For example, the preparation of this state could be simulated using non-principal rows representing qubits initially prepared in the state |0 for all but one syndrome qubit; these qubits are entangled with a final qubit prepared in the state |+ , corresponding to a principal row. This preparation time can be simulated in time O(1), as they involve only a constant number of qubits independent of the larger system. We interact these syndrome qubits with the data qubits to realise a distributed eigenvalue estimation procedure: each such interaction may be simulated in time O(1), O(r), or O(n). All but the last of the X-basis measurements involve a non-principal row with only one non-zero element, and so can be simulated in time O(1); the final syndrome measurement produces the syndrome bit, and is therefore a deterministic measurement also simulated in time O(1). If the simulation of the cat state involves Pauli noise, this noise may be countered with further parity checks [21,27] or distillation [29][30][31], each round of which may also be simulated in time O(1) using the same techniques as above. a deterministic outcome, or may be simulated without any reselection of principal rows. As a result, each of the measurements of the syndrome qubits may be simulated in time O(1). The syndrome measurement procedure as a whole then takes time O(kn), dominated by the two-qubit operations involved; and this bound may be loose, depending on the particular Pauli operation being measured and the qubits on which they are being measured.
Proof of Theorem 15. We prove the result for the non-fault-tolerant technique to realise P -measurements by a phase kick with a single syndrome qubit (as illustrated in Figure 8a): similar remarks may be applied to other techniques. Let a be the dedicated 'syndrome' qubit. 19 In O(1) time, we may simulate a procedure which prepares a in the state |+ .
Each of the two-qubit measurements involved in simulating the measurement of P may be simulated in time O(n) by the analysis above; and the final measurement on a itself may be simulated in time O(1), by hypothesis that |ψ is an eigenvector of P . The total run-time is then dominated by the time required to simulate the k two-qubit operations, which is O(kn).
In the case that k is bounded above by a constant, the above allows us to match the asymptotic complexity obtained by Gidney [4], of O(n) to simulate syndrome measurement under a Pauli noise model. Our techniques allow us to do so without needing to record the circuit being simulated (in order to simulate the stabiliser measurement in the Heisenburg picture).
We note that for sufficiently large values of k, our techniques may be expected to be slower in practise than those of Ref. [4], for simulating deterministic Pauli stabiliser measurements as such. This is a result of the fact that our techniques rely on two-qubit unitaries to represent a particular physical realisation of a k-qubit Pauli observable mearsurement. We note that such a realisation is necessary for a sufficiently detailed simulation of a physical procedure to realise such P -measurements. The techniques described by Gidney [4] would also yield O(kn) run-times to simulate certain procedures for stabiliser measurements, such as that in Figure 8a; for any which produce non-deterministic measurement outcomes, such as that represented in Figure 8b, the run-time required by those techniques appear to scale as O(kn 2 ). In this sense, our techniques seem to provide an advantage for simulating proedures which make frequent syndrome measurements in the presence of Pauli noise.
It is plausible that qubits in an error correcting code will have enough structure in their entanglement relations to allow for a relatively sparse representation, allowing for faster simulation than is represented by the unconditional bound of O(n). Because of the importance of syndrome measurement to simulation of error corrected architectures, we conjecture that this is an application for which our techniques will be particularly well-suited.

Discussion
In this paper, we have developed on the notion of a quadratic form expansion, as described in Ref. [12] and informed by presentation of similar path-sum like representations of stabiliser states and stabiliser circuits [13][14][15][16][17]. Procedures to efficiently simulate oneor two-qubit stabiliser circuits on n-qubit stabiliser states using such a representation, are implicit in many of these works. We have presented explicit procedures to simulate such operations in time O(n 2 ), matching the worst-case asymptotic complexity achievable under the stabiliser formalism [1] among others [2,11]. We obtain this result by considering quadratic form expansions which are subject to certain constraints: notably, involving an 'expansion matrix' A in principal row form. Furthermore, the bound of O(n 2 ) is in some cases loose. As we describe in Section 4.6, when the state has 2 r standard basis components for r n, the worst-case complexity to simulate a stabiliser operation is O(nr). For each stabiliser operation, we present still more refined bounds on simulation complexity, when A or the quadratic form can be represented by sparse data structures.
We briefly consider the way in which our techniques fail to extend to Clifford+T circuits (i.e., including a T = √ S = diag(1, τ ) gate for τ = √ i), or Clifford-plus-controlled-S circuits. Consider a representation of states similar to Eqn (8), in which the relative phases are expressed as powers of τ instead of powers of i. This would complicate the analysis of the simulation of the Hadamard operation (or more precisely the procedure analogous to ZeroColumnElim), requiring the analysis of a new case in which some entry of the Gram matrix Q represents an odd power of τ , to extend the analysis of Eqn. (20). There is no algebraic 'coincidence' regarding 1+τ , which is analogous to the equality 1+i = √ 2 · τ , which would allow us to reduce the rank r to simplify a quadratic form expansion involving relative phases which are powers of τ ; more sophisticated (and computationally demanding) techniques would be required. In the case of controlled-S gates, any attempt to simulate them directly on a state as in Eqn. (8) would require that we abandon the constraint that Q is symmetric; however, this complicates the analysis of Eqns.(18)- (20) in ZeroColumnElim as well, as well as any analysis involving a change of variables (as Lemma 3 relies on the symmetry of Q). The existence of such obstacles, is no surprise: the ability to efficiently simulate either of these circuit classes, suffices to simulate arbitrary quantum computations with bounded error [21,32]. However, extensions of our techniques in this direction could yield modest reductions to the simulation complexity of more complex quantum procedures.
We note that the sparsity of the expansion matrix A and the Gram matrix Q (at least in certain rows and columns) is crucial to the usefulness of our techniques for any given application. For a random stabiliser circuit -in which the probability of any one of a Hadamard gate, X measurement, or Y measurement is bounded below by a constant -one may expect the value of the rank parameter r above to approach 1 2 n, and then vary only slightly from this value on average. In this case of r ∈ Θ(n), the bound O(nr) = O(n 2 ) for our techniques to simulate various unitary gates, compares poorly to other simulation techniques [1,2]. However, as r increases, the number of principal rows of A -each of which has exactly one non-zero entry -also increases. By selecting certain qubits to correspond to principal rows of A, one may in some cases achieve significant improvements in simulation time for certain procedures, as we describe in Section 5.3. For structured stabiliser circuits as opposed to randomly generated ones, this may yield significant advantages for simulation. We expect this may be the case for simulations of stabiliser circuits which realise operations involving certain error correction codes.
The run-times of our techniques are broadly similar to those described by Anders and Briegel [2], who describe stabiliser states as the image of graph states |G [33,34] under a tensor product of local Clifford operations (essentially products of H and S). Note that our techniques represent the state |+ ⊗n using a rank of r = n, and expansion matrix A = I n , and a Gram matrix Q = 0. From the way that the Gram matrix Q updates under CZ operations with our techniques, it follows that a quadratic form expansion for a graph state |G is essentially to set r = n, A = I n , and to set Q to the adjacency matrix of G. The degree parameters of Ref. [2] then coincide with the sparsity parameters w for the Gram matrix Q. Quadratic form expansions of the sort of Eqn. (8) may then be considered the image of a graph state under additional S operations, followed by a 'CNOT circuit' -which is to say, a linear isometry of the form |x → |Ax . The principal advantage of our techniques over those of Ref. [2] is in the use of the that isometry, represented by the expansion matrix A, to represent correlations between Z-basis measurement outcomes. This provides us with improved simulation of parallel Z-basis measurements as in Section 5.2, and a way to try to systematically reduce the complexity of operations in structured circuits as suggested by the results of Section 5.3.
Our techniques have certain advantages and disadvantages compared to simulation with the stabiliser formalism, and its elaborations in Refs. [1,4,11]. Those techniques uniformly require O(n) time to simulate unitary operations such as S, H, CZ, and CX, whereas in the case r ∈ Θ(n), our techniques frequently require Θ(n 2 ). Our techniques are no worse than the stabiliser formalism for simulating measurements, as stabiliser-based techniques require time O(n 2 ) in general to perform a weak simulation of a single measurement. More notably, in the case of strong simulation of a few qubits (or of nearly all of the qubits) in the Z-basis, our techniques provide asymptotic improvements over the stabiliser formalism for Z-basis measurements with random outcomes. Furthermore, whereas Gidney's refinement [4] enables O(n)-time simulation of multi-qubit measurements with deterministic outcomes, our techniques can simulate single-qubit measurements with deterministic outcomes in time O(1). As we describe in Section 5.3, our techniques also extends to O(n)-time simulation of fault-tolerant syndrome measurement of 'local' Pauli operators P under Pauli noise models including ones which yield non-deterministic measurement outcomes as part of the process. Thus, our techniques may prove more practical for simulations in which such measurements are frequent. We speculate that the setting of operations on error-corrected qubits may also have enough additional structure to allow simulation using sparse data structures: this may provide further opportunities to improve the simulation complexity of these circuits, using our techniques.
Quadratic form expansions could possibly be regarded as less general than the representation for stabiliser states presented by Bravyi et al. [11]. The sense in which this is the case is as follows. Ref. [11] describes a representation of stabiliser states in the form where ω ∈ C is a scalar, b ∈ {0, 1} n , U H is a unitary realisable by Hadamard operations applied to some of the qubits, and U C is an operator such that U C |0 ⊗n = |0 ⊗n . In our work, the choice of 0 ≤ r ≤ n and an initial expansion matrix of only principal rows and zero rows corresponds to U H ; then the choice of Gram matrix and expansion matrix can be described by an operator U C . By using a representation for U C using the stabiliser formalism, Ref. [11] is able to simulate one-and two-qubit Clifford operations in time O(n). Our techniques are better suited to cases where the expansion matrix A or Gram matrix Q are expected to be close to O( √ n)-sparse, so that unitary gates may be simulated in time O(n), and where measurements are frequent enough that the techniques of Sections 5.2 and 5.3 provide an advantage over the O(n 2 )-time simulation time using the techniques of Ref. [11].
Guan and Regan [20] describe results in strong circuit simulation, which also makes use of what we call a quadratic form expansion, albeit allowing a summation index y ∈ {0, 1} h for h possibly much larger than n, and using a different approach to that used in our analysis to navigate mixed-modulus arithmetic in the quadratic form. Motivated by the connection between computing binary matrix rank and strong simulation, they describe techniques to relate the problem of evaluating p = | 00 · · · 0| C |00 · · · 0 | 2 for a Clifford circuit C (consisting of only CZ gates, S gates, and Hadamard gates) to transformations of the quadratic form. This allows them to compute such probabilities p in time O(M + n+M ω h ), where n the number of qubits on which C acts, M is the number of Clifford gates in C, and M h is specifically the number of Hadamard gates (which contribute to the length h of the summation index y). These results rely on allowing h to grow without an upper bound of n. We may contrast this run-time for strong simulation of a 'total' measurement, to that of Theorem 13, which yields an upper bound of O(M n 2 ) to evaluate p (dominated by the time to construct a quadratic form expansion for the pre-measurement state). The results of Ref. [20] are advantageous in the setting that M ω h M n 2 . If we consider the case M h /M ∈ Θ(1) in which the Hadamard gates make up a significant fraction of the total number of gates in C, this is equivalent to a bound of M ∈ O(n 2/(ω−1) ) on the circuit size. For the known bound ω < 2.3729, this implies that the techniques of Ref. [20] scale asymptotically similarly to ours (or better) to compute p for circuits of size about O(n 1.4568 ).
There is potential for our techniques involving quadratic form expansions, to work well in conjunction with techniques for quantum circuit minimisation or verification. For instance, consider a Clifford circuit expressing an n-qubit unitary U . By computing a quadratic form expansion for the 2n-qubit 'Choi' state (U ⊗ I) |Φ n where |Φ n = 1 √ 2 n x |x |x , we may determine whether or not U is in fact the identity operation. This observation may also be easily extended to circuits with measurement operations, provided that we consider the transformation realised for specific measurement outcomes. In this respect, our techniques may be used as an alternative realisation of a subset of the techniques of Amy [17] to verify quantum circuit equalities. At the same time, our representation of stabiliser states have a certain similarity to a representation of operations by Heyfron and Campbell [35], who use a symmetric order 3 tensor S where our representation uses a symmetric Gram matrix Q, to represent certain operations from the third level of the Clifford hierarchy [36,37]. Perhaps by reconsidering the results of Ref. [35] through a similar careful management of mixed-modulus arithmetic, our techniques may be extended to provide further advances in circuit simplification, for circuits involving controlled-controlled-Z, controlled-S, or T gates. The fact that our techniques represent the global phase of a simulated state, means that they may be also used in conjunction with techniques described in Ref. [11] for extending beyond simulation of stabiliser circuits. Of course, Ref. [11] also provides simulation techniques which extend the stabiliser formalism and track the global phase; it remains to be seen whether there exists applications for which we may use the structure of the circuit to be simulated, to more effectively simulate them using sparse data structures.
Finally, we suggest that quadratic form expansions may be useful pedagogically, for discussions about stabiliser circuits in an educational setting. Most students who first learn of quantum computation will be familiar with the idea of expanding a state in terms of standard basis components, and performing computation within a standard basis vector, both of which are clear features of the quadratic form expansion representation. Entanglement, in the form of states such as 1 √ 2 x |x |x , is also represented explicitly by quadratic form expansions: and while students of physics may find quantum correlations to be adequately represented by correlations of Z and X observables, not all students of quantum computation may find it as easy to reason about observables in this way. This issue also applies more generally to the stabiliser formalism: while the usefulness of the stabiliser formalism is beyond doubt, it is perhaps more conceptually sophisticated than necessary for a first encounter with circuit simulation. In contrast, our techniques may be simplified to be more approachable at an introductory level -for instance, by abandoning the constraint of keeping A in principal row form, and considering simulation techniques requiring O(n 3 ) time using Gaussian elimination to maintain the constraint r = rank(A). This would yield techniques which may be more suitable to teach the simulatability of stabiliser circuits in a first encounter. At the same time, it is possible to describe the limits of extending these techniques to circuits with T or controlled-S gates, by showing how specific steps fail when the relative phase is described as a power of τ = √ i, or when the Gram matrix Q is instead allowed to be replaced with a matrix which is not symmetric. This is a possible avenue to describing efficiently simulatable quantum computations, in a way which may be more approachable at an introductory level.

A Purely number-theoretic Lemmata on Gram matrices modulo 4
We call a function Q : Z r → Z a quadratic form if Q(x) is an integer multivariate polynomial in which each term has degree 2, i.e., if can be expressed as Q(x) = j≤k u j,k x j x k for some coefficients u j,k ∈ Z . When working over a field such as R or Z p for prime p > 2, it is common to consider a symmetric matrix Q such that Q j,j = u j,j , Q j,k = 1 2 u j,k for j < k, and Q j,k = 1 2 u k,j for j > k. The matrix Q is then called the Gram matrix of Q. While Z does not have a multiplicative inverse for 2 (so that not all quadratic forms over Z can be represented in this way), our techniques are concerned with quadratic forms Q over Z, which do happen to arise from a 'Gram matrix' Q in this way. In this case, we may represent Q(x) = x Qx.
As we are exclusively interested in quadratic forms evaluated for vectors x ∈ {0, 1} r , and evaluated as an imaginary exponent i Q(x) to define a relative phase, we may show a few simple but very helpful results about such Gram matrices to support our analysis. In particular, the fact that Q is symmetric means that when considering Q(x) = x Qx mod 4, our analysis in fact benefits from properties which we would normally only expect when evaluating expressions modulo 2: Lemma 16. Let x,x ∈ Z r such that x ≡x (mod 2). Then for any symmetric r × r matrix Q over the integers, x Qx ≡x Qx (mod 4).
Proof. Let v ∈ Z r be such thatx − x = 2v. Then we havẽ This phenomenon in which equivalence mod 2 automatically lifts to equivalence mod 4, extends to a limited extent even to the Gram matrix Q itself: Lemma 17. For r × r symmetric matrices Q and Q over Z, we have x Q x ≡ x Q x (mod 4) for all x ∈ {0, 1} r if and only if Q − Q = 2Γ for some matrix Γ over Z whose diagonal entries are all even.
Proof. Let Q and Q be symmetric r × r matrices over Z; and let ≡ 4 stand for the relation of equivalence modulo 4.
• Suppose that Q − Q = 2Γ for an integer matrix Γ with an even diagonal. Then we have for all x ∈ {0, 1} r .
• Suppose that x Q x ≡ x Q x (mod 4) for all x ∈ {0, 1} r . Then in particular, Q k,k = e k Q e k ≡ 4 e k Q e k = Q k,k ; and It follows that 2(Q − Q) ≡ 0 (mod 4), so that ∆ = Q − Q is a symmetric matrix with only even coefficients and a diagonal which is entirely zero; if Γ = 1 2 ∆, then Γ is an integer matrix with a diagonal which is entirely even.
(We note that the above proof does not rely in a significant way on x having only coefficients in {0, 1}: if we consider x ∈ Z r rather than x ∈ {0, 1} r , the corresponding equivalence also holds.)

ReduceGramRowCol(c)
For an integer 1 ≤ c ≤ r, reduce the coefficient Q c,c modulo 4, and reduce every off-diagonal entry of row and column c modulo 2.
For each 1 ≤ k ≤ r such that Q c,k = 0: • If k = c, reduce Q k,k mod 4; otherwise reduce both Q c,k and Q k,c mod 2. Figure 9: A procedure to simplify the Gram matrix Q in a specific row or column, suitable for when operations have been performed on that row or column.

B Subroutines to transform quadratic form expansions
In Section 3, we describe some techniques to transform quadratic form expansions, and declare subroutines to incorporate these techniques. Here, we describe in pseudocode how these subroutines may be implemented, and describe their run-time bounds in the setting outlined in Section 3.2. Figure 9 defines a simple procedure ReduceGramRowCol(c), taking an argument 1 ≤ c ≤ r indexing a row/column of the Gram matrix Q. It reduces its off-diagonal entries of that row and column modulo 2, and its diagonal entry modulo 4. This subroutine will be suitable to help decrease the number of non-zero entries in that row and column of Q, after some operation has been performed which affects row/column c of Q. Following Lemma 17, this does not change the state which is represented by the quadratic form expansion. We may easily show the following:    The procedure MakePrincipal(c, j) takes a column-index 1 ≤ c ≤ r and a row-index 1 ≤ j ≤ n as arguments, and attempts to perform a change of variables which would make j a principal row with a single non-zero coefficient in column c. It does so in such a way that it does not affect the principal rows h = p(k) for the other columns 1 ≤ k ≤ r with k = c, by only transforming A if column c already has a 1 in row j . If this is the case,

MakePrincipal(c, j)
For integers 1 ≤ c ≤ r and 1 ≤ j ≤ n, if A j,c = 1, perform appropriate changes of variable to transform row j of A to e c , in order to make j a principal row of A.

ReselectPrincipalRow(j, c)
For integers 1 ≤ c ≤ r and 0 ≤ j ≤ n, attempt to find a row j * = j of A, such that A j * ,c = 0, to serve as a new principal row. If no such row is found, the stop without modifying the quadratic form expansion.
1. Find a row index 1 ≤ j * ≤ n such that A j * ,c = 0 and j * = j, for which row j * of A has the fewest non-zero entries. (Let j * = 0 if no such rows exist.) 2. If j * = 0, call MakePrincipal(c, j * ).

Proof.
Step 1 takes time O(1) to determine whether A j,c = 1. Assuming that the procedure does not terminate at that stage, s j , t c ≥ 1.
Step 2 then iterates through s j − 1 column indices k = c, subtracting column c of A from column k (and performing O(t c +w c ) operations) by calling ReindexSubtColumn(k, c) on each iteration. Among other changes, this realises an update A j,k ← 0 for each such k. Note that as the other principal rows h have A h,c = 0 by definition, these rows are unaffected by these column operations.
Step 2 dominates the run-time if w j > 1, performing O(s j (t c + w c )) operations; if s j = 1, it instead performs O(1) operations.
Step 3 also performs O(1) operations, simply updating the value of p(c).
Using the procedure MakePrincipal as a subroutine, we define a procedure ReselectPrincipalRow(j, c), which attempts to select a new principal row for the setting where j = p(c) is a principal row, corresponding to a qubit which is to be subject to an operation such as a Hadamard gate. This procedure is used when a simulated operation would significantly disrupt a principal row for some column c, which we may be able to avoid by selecting an alternative principal row j * . (This is not possible when row j is the only row in which column c contains a non-zero entry, in which case the matrix A and the value of p(c) will be unchanged.) To reduce the amount of work that is required to establish a new principal row, we select the new principal row by minimising the number of non-zero entries which must be cleared to make it a principal row.
Lemma 21. Let 1 ≤ c ≤ r be a column index for an n × r matrix A with a principal index map p, and either j = 0 or j = p(c). Let t c ≥ 0 (respectively, w c ) be the number of non-zero entries in column c of A (respectively, in row/column c of Q).

FixFinalBit(z)
Perform a transformation on the quadratic form expansion, consistent with fixing the value of x r = z, reducing the rank in doing so.

Modify
A by removing column r, and modify Q by removing column/row r.
4. Update r ← r − 1. B.4 Fixing the value of the final index Figure 12 defines the subroutine FixFinalBit(z). This procedure realises the transformations to the quadratic form expansion described in Section 3.4, to represent either a simplification or transformation of a quadratic form expansion in which the final bit of the summation index x ∈ {0, 1} r is set to some constant z ∈ {0, 1}. We note that in the analysis of Section 3.4, the right-hand side of Eqn. (15) features an extra factor of 1 √ 2 in addition to the quadratic form expansion described in square brackets. In some instances, this scalar factor may represent the fact that a measurement which fixes the value of x r , yields the particular outcome z only with probability 1 2 . In other instances, this factor may be canceled out by some other scalar factors which we may apply to |ψ . In each case, this factor of 1 √ 2 can be accounted for in an appropriate way. However, it is left to the programmer to ensure that this is done properly: FixFinalBit(z) does not attempt to represent or otherwise account for that scalar factor.

Lemma 22.
Let t r ≥ 0 (respectfully, w r ) be the number of non-zero entries in the final column of A (respectfully, the final row/column of Q). Then for a bit z ∈ {0, 1}, the procedure FixFinalBit(z) transforms the quadratic form expansion consistent with introducing a factor δ xr,z as in Eqn. (12) and reducing it to the form in square brackets in the right-hand side of Eqn. (15), in time O(t r + w r ).

ZeroColumnElim(c)
Eliminate (one or two) redundant columns from the matrix A of a quadratic form expansion, given that A has r+1 columns but rank r, and that column c in particular is entirely zero. (This will in general involve other non-trivial changes to A.) 1. Call ReindexSwapColumns(c, r+1).

Modify
A by removing column r+1, and modify Q by removing column r+1 and row r+1.
4. If u is odd, update Q ← Q + (u − 2)qq mod 4 and g ← g − u + 2. Otherwise, if u is even: a. If q = 0, then stop; otherwise, find an index 1 ≤ ≤ r such that q = 1.
Step 1 initialises a vector a ∈ {0, 1} n in time O(t r ) by reading column r of A, and initialises a vector q ∈ {0, 1} r−1 and coefficient u ∈ {0, 1} in time O(w r ) by reading column r of Q.
Step 2 modifies A and Q by removing column r (and in the case of Q also row r); this might be done by at the same time as initialising a and q in Step 1, by appropriate operations on the sparse data structures representing A and Q. In Step 3, the update Q ← Q + 2 z diag(q ) may be performed in time O(w r ), the update b ← b ⊕ z a may be performed in time O(t r ), and the update g ← g + 2zu may be performed in time O(1); and similarly for the update r ← r−1 in Step 4. The run-time is thus O(t r +w r ).
B.5 Eliminating columns which are entirely zero Figure 13 defines the subroutine ZeroColumnElim(c), to simplify a quadratic form expansion as described in Section 3.5 by eliminating one or more columns from A, when it has a rank r which is one less than the number of its columns.
but that A has rank only r, and that in particular column c of A is entirely zero; and that p is nearly a principal index map for A, except again in that column c of A is entirely zero. Then the procedure ZeroColumnElim(c) transforms the quadratic form expansion to a form consistent with Eqn. (8), in which A is in principal row form with a corresponding principal index map p, in time O(tw + w 2 ).

Proof.
Step 1 interchanges column c and column r+1 of A, requiring time O(2t + 2w) to do so.
Step 2 initialises the vector q and the scalar u, which may be done in time O(w).
Step 3 modifies A and Q by removing column r+1 (and in the case of Q also row r+1); no explicit modifications may be necessary to the sparse data structure of A (as the column being removed has no non-zero entries), and the modification to Q might be done by appropriate operations on the sparse data structure representing Q while initialising q in Step 2.
If u is odd, then in Step 4 we modify the value of g, and up to w diagonal entries of Q, requiring time O(w 2 ); the total run-time is then O(t + w 2 ). If u is even, we instead perform a number of further operations, which depend on a column index 1 ≤ ≤ r for which q = 1.
Step 4a attempts to find such an , taking O(1) time. We stop the procedure if no such exists. Otherwise, we invoke ReindexSubtColumn(k, ) for up to w − 1 values of 1 ≤ k ≤ r in Step 4b (requiring O(wt + w 2 ) operations in total), invoke ReindexSwapColumns(r, ) in Step 4c (involving O(2t + 2w) operations), and call FixFinalBit(q/2) in Step 4d (involving O(t + w) operations). The total run-time in this case is then O(tw + w 2 ).