The qudit Pauli group: non-commuting pairs, non-commuting sets, and structure theorems

Qudits with local dimension $d>2$ can have structure and uses that qubits ($d=2$) cannot. Qudit Pauli operators provide a very useful basis of the space of qudit states and operators. We study the structure of the qudit Pauli group for any $d$, including composite $d$, in several ways. To cover composite values of $d$, we work with modules over commutative rings, which generalize vector spaces over fields. For any specified set of commutation relations, we construct a set of qudit Paulis satisfying those relations. We also study the maximum size of sets of Paulis that mutually non-commute and of sets that non-commute in pairs. Finally, we give methods to find near-minimal generating sets of Pauli subgroups, calculate the sizes of Pauli subgroups, and find bases of logical operators for qudit stabilizer codes. Useful tools in this study are normal forms from linear algebra over commutative rings, including the Smith normal form, the alternating Smith normal form, and the Howell normal form of matrices. Possible applications of this work include the construction and analysis of qudit stabilizer codes, entanglement-assisted codes, parafermion codes, and fermionic Hamiltonian simulation.

Central to the understanding of qubits is the Pauli group on n qubits, a subgroup of U(2^n) generated by products and tensor products of X = (0 1; 1 0) and Z = (1 0; 0 −1). The Pauli group has myriad uses in quantum theory, including facilitating the description of stabilizer states, error-correcting codes, and quantum errors [19]. The qudit Pauli group, or Heisenberg-Weyl group, plays a similar role in understanding qudits [20]. Here we take the shift and clock operators

X = Σ_{j ∈ Z_d} |j + 1 mod d⟩⟨j|,   Z = Σ_{j ∈ Z_d} ω^j |j⟩⟨j|,   (1)

where ω = e^{2πi/d}, as generators of the n-qudit Pauli group, a subgroup of U(d^n). All matrices X^a Z^b for a, b ∈ Z_d are distinct. If we take the group commutator [A, B] := ABA^{−1}B^{−1} of two single-qudit Paulis, we find [X^a Z^b, X^{a′} Z^{b′}] = ω^{a′b − b′a} I, for identity matrix I. Thus, while qubit Paulis P, Q only either commute, [P, Q] = I, or anticommute, [P, Q] = −I, qudit Paulis can fail to commute in any one of d − 1 distinct ways, [P, Q] = ω^c I for any c = 1, …, d − 1. This is one reason there is interesting structure in the qudit Pauli group that is absent for qubits.

(Contact: Rahul Sarkar, rsarkar@stanford.edu; Theodore J. Yoder, ted.yoder@ibm.com. Rahul Sarkar was funded by the Stanford Exploration Project for the duration of this study.)
In this paper we communicate several structural theorems about sets and groups of qudit Pauli operators. We think it is illustrative to present our results in comparison with the corresponding ideas for qubits, which might be familiar to some readers.
Non-commuting pairs. We say Paulis s_0, …, s_{k−1} and t_0, …, t_{k−1} are non-commuting pairs if, for each i, the pair (s_i, t_i) does not commute, while s_i and t_i commute with all other s_j and t_j for j ≠ i. For qubits (and in fact any prime d), the maximum number of non-commuting pairs on n qudits is simply k = n, for instance (X, Z) on one qubit. However, for composite d, we can have more. For example, d = 6 permits (X^2, Z^2) and (X^3, Z^3) as a collection of k = 2 non-commuting pairs on one qudit. In Section 3, we will show that the maximum number of non-commuting pairs is k = mn, where m is the number of distinct primes in the factorization of d. We use this result to answer some further questions in Sections 5 and 6. The key tool used to obtain this result is Lemma 3.1, which also finds use in Section 5.
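The d = 6 example above is easy to verify mechanically from the relation [X^a Z^b, X^{a′} Z^{b′}] = ω^{a′b − b′a} I. A minimal sketch, encoding X^a Z^b as the exponent pair (a, b) (the encoding and helper name are ours):

```python
d = 6

def comm_exp(p, q):
    # exponent c with [P(p), P(q)] = w^c I, for p = (a, b), q = (a', b')
    return (q[0] * p[1] - q[1] * p[0]) % d

S = [(2, 0), (3, 0)]   # X^2, X^3
T = [(0, 2), (0, 3)]   # Z^2, Z^3

for i in range(2):
    for j in range(2):
        c = comm_exp(S[i], T[j])
        # (s_i, t_i) fails to commute; every cross pair commutes
        assert (c != 0) == (i == j)
```

Elements within S (or within T) commute trivially, being powers of the same operator, so (S, T) is indeed a collection of 2 non-commuting pairs on a single d = 6 qudit.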
Non-commuting sets. A set of Paulis {q_0, q_1, …, q_{h−1}} is non-commuting if [q_i, q_j] ≠ I for every i ≠ j. On a single qubit, a largest non-commuting set is {Z, X, XZ} and, more generally for prime d, a largest set has size h = d + 1. In contrast, on a single d = 6 qudit, the largest non-commuting sets have size h = 12, e.g.

{Z, X, XZ, XZ^2, XZ^3, XZ^4, XZ^5, X^2 Z, X^2 Z^3, X^2 Z^5, X^3 Z, X^3 Z^2}.   (2)

We will show in Section 4 that h = Ψ(d) ≥ d + 1 is the maximum size of a non-commuting set on a single qudit, where Ψ is the Dedekind psi function from number theory. The situation of non-commuting sets on more than one qudit is more complicated, and we discuss some bounds on the size in that setting.
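The 12-element set of Eq. (2) can be checked by brute force: encoding X^a Z^b as the pair (a, b), every two distinct elements must have a non-zero commutator exponent mod 6. A small sketch (the encoding is ours):

```python
from itertools import combinations

d = 6
# the set of Eq. (2), written as pairs (a, b) for X^a Z^b
clique = [(0, 1), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
          (2, 1), (2, 3), (2, 5), (3, 1), (3, 2)]

for (a, b), (s, t) in combinations(clique, 2):
    # [X^a Z^b, X^s Z^t] = w^(sb - ta) I must be nontrivial
    assert (s * b - t * a) % d != 0

# |set| = 12 = Psi(6) = 6 * (1 + 1/2) * (1 + 1/3)
assert len(clique) == 12
```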
We also note here that if one additionally requires the Paulis in the non-commuting set to all have the same group commutator value, the size of the maximum set is 2n + 1, independent of d [21].
Qudits needed to create commutation patterns. Suppose C ∈ Z^{l×l} has zero diagonal, C_ii = 0 for all i, and is anti-symmetric, C = −C^T. These two properties define what is known as an alternating matrix. What is the minimum number of qudits n such that there exists a list of n-qudit Paulis {q_0, q_1, …, q_{l−1}} with [q_i, q_j] = ω^{C_ij} I? In Appendix B of [22] and in [23], it was shown that the number of qubits required is half the rank of C as a matrix over the field F_2. For instance, if C is the 4 × 4 alternating matrix with non-zero entries C_01 = −C_10 = 3 and C_23 = −C_32 = 5, then (reading the entries mod 2) two qubits are necessary and sufficient via the set {X ⊗ I, Z ⊗ I, I ⊗ X, I ⊗ Z}. However, just one d = 15 qudit suffices: {X^3, Z^9, X^5, Z^5}. In Theorem 5.4 of Section 5, we show that in general, the number of qudits is half the minimal number of vectors generating the column space of C as a matrix over Z_d, which can differ from other definitions of matrix rank when d is composite. We also give an efficient algorithm to find a satisfying set of Paulis using a matrix decomposition called the alternating Smith normal form [24]. This approach effectively reduces the problem to the construction of collections of non-commuting pairs, solved in Section 3.
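The commutation pattern achieved by the single d = 15 qudit set {X^3, Z^9, X^5, Z^5} can be tabulated directly: only the pairs (q_0, q_1) and (q_2, q_3) fail to commute, mirroring the two-qubit set. A sketch using exponent pairs (a, b) for X^a Z^b (the encoding is ours):

```python
d = 15
# {X^3, Z^9, X^5, Z^5} as exponent pairs (a, b) for X^a Z^b
q = [(3, 0), (0, 9), (5, 0), (0, 5)]

def comm_exp(p, r):
    # [P(p), P(r)] = w^c I with c = (a'b - b'a) mod d
    return (r[0] * p[1] - r[1] * p[0]) % d

C = [[comm_exp(q[i], q[j]) for j in range(4)] for i in range(4)]

# only the pairs (0,1) and (2,3) fail to commute
for i in range(4):
    for j in range(4):
        assert (C[i][j] != 0) == ({i, j} in ({0, 1}, {2, 3}))
```

Here C[0][1] = 3 and C[2][3] = 5, so the same alternating pattern realized by four Paulis on two qubits is realized on one qudit of dimension 15.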
Minimal and Gram-Schmidt generating sets. Suppose A^{⊗3} is shorthand for A ⊗ A ⊗ A, S = {X^{⊗3}, Z^{⊗3}, ω(XZ)^{⊗3}} is a set of qudit Paulis, and G = ⟨S⟩ is the group generated by S. What is the smallest generating set of G? We answer this question in general in Section 6.1 using the Smith normal form of the matrix over Z_d representing S. For d = 2, ω = −1 and a smallest generating set for this example is {X^{⊗3}, Z^{⊗3}}. However, for d = 6, the last generator contributes a phase unobtainable from the first two generators alone, and so any minimal generating set is larger, e.g. {X^{⊗3}, Z^{⊗3}, ωI^{⊗3}}. A Gram-Schmidt generating set for G is one that can be written as some number of non-commuting pairs (s_i, t_i) and additional elements u_i that commute with all elements of G. If G is the centralizer of a qudit stabilizer group, then the non-commuting pairs in a Gram-Schmidt generating set provide a basis for the logical operators of the stabilizer code, as is done for qubits in [25]. For instance, {X^{⊗3}, Z^{⊗3}, ωI^{⊗3}} is also a Gram-Schmidt generating set for the example G with d = 6, where the first two elements are a non-commuting pair. Our construction of Gram-Schmidt generating sets in Section 6.2 guarantees the minimal number of non-commuting pairs, and again makes use of the alternating Smith normal form.
Subgroups generated by maximum non-commuting pairs. For qubits (and other qudits with prime d), maximum collections of non-commuting pairs generate the full Pauli group. This is clearly the case, for instance, for the maximum collection (X, Z) on a single prime-d qudit. However, for composite d this is no longer necessarily the case. For instance, for d = 12, (X^3, Z^6) and (X^4, Z^4) generate only ⟨ω^2 I, X, Z^2⟩, a proper subgroup of the full single-qudit Pauli group, despite forming a maximum size collection of non-commuting pairs. We show in Section 6.3 that this can only happen when d is not square-free. We also go further and find the size of groups generated by arbitrary generating sets as well.
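The d = 12 example can be confirmed by brute-force closure. Representing ω^j X^a Z^b as a triple (j, a, b), with the product rule induced by Z^b X^a = ω^{ab} X^a Z^b, the four generators close into a group much smaller than the full single-qudit Pauli group of size 12^3 = 1728. A sketch (the triple encoding and closure loop are ours):

```python
d = 12

def mul(g, h):
    # (w^j1 X^a1 Z^b1)(w^j2 X^a2 Z^b2) = w^(j1+j2+a2*b1) X^(a1+a2) Z^(b1+b2)
    j1, a1, b1 = g
    j2, a2, b2 = h
    return ((j1 + j2 + a2 * b1) % d, (a1 + a2) % d, (b1 + b2) % d)

gens = [(0, 3, 0), (0, 0, 6), (0, 4, 0), (0, 0, 4)]  # X^3, Z^6, X^4, Z^4
G = set(gens)
frontier = list(gens)
while frontier:  # close the finite set under multiplication
    new = []
    for g in list(G):
        for h in frontier:
            for prod in (mul(g, h), mul(h, g)):
                if prod not in G:
                    G.add(prod)
                    new.append(prod)
    frontier = new

assert len(G) < d ** 3      # a proper subgroup of the single-qudit Pauli group
assert (0, 1, 0) in G       # contains X
assert (0, 0, 2) in G       # contains Z^2
```

Since every element of a finite set closed under multiplication has finite order, the closure computed above is exactly the generated subgroup.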

Related work
In the remainder of this section, we discuss prior work and similar results in the existing literature, and mention applications and contributions of our results.
One of the important uses of qubit-based quantum computers is the simulation of Hamiltonians of quantum systems, such as atoms or molecules. This is achieved by mapping the algebra of fermions or bosons, depending on whether one wants to simulate a fermionic or bosonic system, to the algebra of qubits. Popular encoding techniques to achieve such a transformation are the Jordan-Wigner transform [26], the Bravyi-Kitaev encoding [27], the Verstraete-Cirac mapping [28], Fenwick trees [29], and ternary trees [30], but other options also exist in the literature [31,32,33,34,35]. One desirable property of this mapping process is to use as few qubits as possible; in fact this has been the focus of the methods described in [31,32,33,35]. Equivalently, this problem is the same as finding a set of Pauli operators using the minimum number of qubits, where the Paulis must satisfy a certain set of commutation relations; for example, in the case of fermionic simulation, the Paulis that the fermionic modes are mapped to must anticommute with one another. This exact question for the qubit case has been answered in [23], and the result there generalizes to all qudits of prime dimension. Here, we cover the case of qudits of composite dimension.
The question of finding Paulis using a minimum number of qubits that satisfy a prescribed set of commutation relations also comes up in the context of entanglement-assisted quantum error-correcting codes (EAQECC) [36,37,38] and quantum convolutional codes [39]. A typical EAQECC is described as [[n, k, d; c]], where n qubits are used to encode k logical qubits with code distance d, using c ebits (or extended qubits). The setup for EAQECC is as follows: we initially have a set of Paulis {p_0, p_1, …, p_{k−1}} on n qubits which do not necessarily commute, and one then seeks to create stabilizer codes out of them by using a set of Paulis {q_0, q_1, …, q_{k−1}} supported on c extended qubits, such that the Paulis {p_0 ⊗ q_0, p_1 ⊗ q_1, …, p_{k−1} ⊗ q_{k−1}} on n + c qubits form a valid set of stabilizer generators (meaning that they must mutually commute). Clearly, if one wants to be efficient with respect to the number of physical qubits used to encode a fixed number k of logical qubits, one wants to minimize the number c of extended qubits used in the process (we assume n is fixed). This leads to the task of finding Paulis {q_0, q_1, …, q_{k−1}} such that [p_i, p_j] = [q_i, q_j] for all 0 ≤ i, j ≤ k − 1, while at the same time minimizing c. One may use the same procedure to construct EAQECCs over qudits; this has been previously considered in [38, Remark 1], and the result there again applies to prime-dimensional qudits only. In the case of quantum convolutional codes, an exactly analogous situation arises: there one seeks Paulis supported on a minimum number m of memory qubits that satisfy a memory commutativity matrix specified by the input-output commutation relations [39, Theorem 2]. In order to generalize these convolutional codes to the qudit case, one needs to be able to answer the same question for any qudit dimension d. Our results in Section 5 achieve exactly this.
Yet another instance where the problem of finding Paulis with specified commutation relations comes up is the setting of parafermion codes. Parafermion codes [21] are the higher-dimensional generalization of Majorana codes [40,41], just as qudit stabilizer codes generalize qubit stabilizer codes. A key part of the translation between Majorana codes and qubit codes is the construction of sets of qubit Paulis with certain commutation relations, matching those of the underlying Majorana operators [22]. We anticipate that a similar translation between parafermion codes and qudit stabilizer codes will involve the constructions of qudit Paulis with specified commutation relations that we provide in Section 5. Many of the results in Sections 3 and 4 set up the necessary machinery to resolve this central question of Section 5.
In [42], Gheorghiu provides a construction of the standard form of a qudit stabilizer group S, i.e. an abelian subgroup of the qudit Pauli group. This is similar to Gottesman's standard form [19] for prime d-dimensional qudits and likewise makes use of Clifford transformations of the group. Our constructions of minimal and Gram-Schmidt generating sets apply to any Pauli subgroup and do not involve applying Cliffords. In the qubit case, such a Gram-Schmidt generating set is useful for improving simulations of stabilizer circuits [43]. We can also efficiently calculate the size of a qudit stabilizer group, and thus the dimension of the stabilized codespace, using the results in Section 6.3. Qudit stabilizer codes have also been studied previously in [44,45] in the context of quantum codes with exotic local dimensions, qudit surface codes, and qudit hypermap codes, and some of our results in Section 6 could be applicable to those works as well.

Preliminaries
In this section, we introduce some background material and some definitions and notation to be used throughout the paper. We also review some necessary facts from ring theory and establish a few basic results that are helpful later on. Throughout the paper, entries of matrices and vectors (both row and column) follow zero-based indexing.

Qudit Pauli group
Consider a collection of n qudits, each with dimension d. Thus each qudit corresponds to a complex d-dimensional Hilbert space C^d with a basis of orthonormal states |0⟩, |1⟩, …, |d − 1⟩. The Pauli operators on one qudit are the invertible operators X and Z from Eq. (1), which we will think of concretely as being given by their d × d matrix representations with respect to the chosen basis. Thus X, Z ∈ C^{d×d}, and notice that X^d = Z^d = (−1)^{d−1}(XZ)^d = I, where I is the d × d identity matrix. We will always use I to denote a context-appropriately-sized identity matrix and specify its shape only when there is a chance for confusion. No smaller number 0 < a < d results in X^a, Z^a, or (XZ)^a even being proportional to I.
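These relations are easy to confirm numerically from the clock and shift matrices of Eq. (1). A sketch for d = 6, with `numpy` assumed available:

```python
import numpy as np

d = 6
w = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0)     # shift: X|j> = |j+1 mod d>
Z = np.diag(w ** np.arange(d))        # clock: Z|j> = w^j |j>
I = np.eye(d)

assert np.allclose(np.linalg.matrix_power(X, d), I)
assert np.allclose(np.linalg.matrix_power(Z, d), I)
assert np.allclose((-1) ** (d - 1) * np.linalg.matrix_power(X @ Z, d), I)
assert np.allclose(X @ Z, w ** -1 * Z @ X)

# no power 0 < a < d of X, Z, or XZ is even proportional to I
for M in (X, Z, X @ Z):
    for a in range(1, d):
        Pa = np.linalg.matrix_power(M, a)
        assert not np.allclose(Pa, Pa[0, 0] * I)
```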
To define Pauli operators on n qudits, start with the disjoint union space Z := ⊔_{n=1}^∞ Z_d^{2n}, and the map P given on v = (a_0, …, a_{n−1}, b_0, …, b_{n−1})^T ∈ Z_d^{2n} by

P(v) := X^{a_0} Z^{b_0} ⊗ ⋯ ⊗ X^{a_{n−1}} Z^{b_{n−1}}.   (3)

Then the set of n-qudit Pauli operators is

P̄_n := {P(v) : v ∈ Z_d^{2n}}.   (4)

A Pauli operator of the form P((u, 0)^T) is called X-type, and one of the form P((0, u)^T) is called Z-type. These are abbreviated as X(u) and Z(u) respectively, where u ∈ Z_d^n specifies just the possibly non-zero entries of v.
The set P̄_n does not form a group, since products of its elements can generate phases. However, the set P_n := {ω^j P(v) : v ∈ Z_d^{2n}, j ∈ Z_d} does form a group with the operation of matrix multiplication, called the Heisenberg-Weyl Pauli group. Defining K = {ω^j I : j = 0, 1, …, d − 1}, we may identify P̄_n in one-to-one fashion with the quotient group P_n/K in the obvious way by ignoring phase factors.
Many of our results arise from interest in commutativity relations of elements of P_n. For this, we start by noticing that XZ = ω^{−1}ZX. Defining the group commutator of two invertible matrices [A, B] := ABA^{−1}B^{−1}, we then have [X, Z] = ω^{−1}I. This implies that for any two elements ω^j P(u), ω^k P(v) ∈ P_n,

[ω^j P(u), ω^k P(v)] = [P(u), P(v)] = ω^{u^T Λ v} I,   where   Λ := (0 −I; I 0),   (5)

where Λ is written in four n × n blocks. The first equality of this equation suggests that it suffices to work with the set P̄_n for studying commutation relations, and this is largely what we will do outside of Section 6, where some more notation will be introduced just for the material in that section. To isolate the phase, we also define the notation

⟨P(u), P(v)⟩_d := u^T Λ v mod d ∈ Z_d.   (6)

Two Pauli operators P(u), P(v) ∈ P̄_n are said to commute if and only if ⟨P(u), P(v)⟩_d = 0, because that implies P(u)P(v) = P(v)P(u).
We now introduce some key definitions that underlie much of our results. First, in Section 3, we determine the maximum size of collections of non-commuting pairs. Definition 2.1. A collection of k non-commuting pairs on n qudits consists of two ordered sets S = {s_0, s_1, …, s_{k−1}} and T = {t_0, t_1, …, t_{k−1}}, where S, T ⊆ P̄_n, such that for all i, j ∈ {0, 1, …, k − 1}, ⟨s_i, s_j⟩_d = 0 and ⟨t_i, t_j⟩_d = 0, and moreover ⟨s_i, t_j⟩_d = 0 if and only if i ≠ j. A collection of k non-commuting CSS pairs is a collection of non-commuting pairs where every element of S is X-type and every element of T is Z-type.
Thus, in a collection of k non-commuting pairs, there are a total of 2k Paulis, and they all commute except for the k pairs (s_i, t_i). An easy consequence of this definition is that all elements of S and T are necessarily distinct from one another and not equal to I.
Later, in Section 5, we will also be concerned with what values the commutators of a collection of non-commuting pairs can take. Definition 2.2. Suppose we are given n qudits and a k-tuple of numbers f := (f_0, …, f_{k−1}) ∈ (Z_d \ {0})^k. We say that f is an achievable non-commuting pair relation on n qudits if there exists a collection of non-commuting pairs S = {s_0, s_1, …, s_{k−1}} and T = {t_0, t_1, …, t_{k−1}} of size k such that ⟨s_i, t_i⟩_d = f_i for all i = 0, …, k − 1. We say that S, T achieves f.
In Section 4, we count the elements of non-commuting sets. Definition 2.3. A non-empty subset S ⊆ P̄_n is called a non-commuting set on n qudits if and only if ⟨p, q⟩_d ≠ 0 whenever p, q ∈ S are distinct.
It should be clear from this definition that a non-commuting subset S ⊆ P̄_n cannot contain I, and that all elements of S must be distinct.

Modules over commutative rings
Our next goal is to provide a short review of some basic results in linear algebra over the commutative ring Z_d with d ≥ 2, for the reader who is unfamiliar with these notions. The main difficulties in working with Z_d and matrices over Z_d arise when d is composite, because then and only then is Z_d a ring that is not a field. We introduce some ring-theoretic terminology here that will find use in our proofs, as well as some well-known canonical matrix decompositions such as the Smith normal form and the Howell normal form, which for certain applications generalize the utility of the singular value decomposition and the reduced row echelon form, respectively, to certain classes of commutative rings. Separately, we dedicate a section (Appendix A) to the alternating Smith normal form, which is central to this paper and provides a symmetry-respecting Smith normal form for matrices with alternating symmetry, to be defined later. The related material is mostly borrowed from [46, Chapters II, III, X, XIII, XV] and [47, Chapters I-V, XIV, XV].

Basic definitions
Generically, we denote a ring by R, and assume, unless stated otherwise, that it is non-zero, commutative, and contains a multiplicative identity 1 and additive identity 0. Largely, we work over the ring of integers Z or the ring of integers modulo d, Z_d. If a, b ∈ R are such that a = bc for some c ∈ R, we say b divides a, and we write b | a. An element x ∈ R is called invertible (or a unit) if it has a multiplicative inverse. The subset of units of a ring forms a multiplicative group called the group of invertible elements. The ring R is called a field if every non-zero element is a unit. R is called an integral domain if ab = 0 for a, b ∈ R implies either a = 0 or b = 0. All finite integral domains, such as Z_d for prime d, are fields by Wedderburn's little theorem [48], but this is not so for infinite rings, like Z. For a ring R that is not an integral domain, a non-zero element x ∈ R satisfying xy = 0 for some non-zero y ∈ R is called a zero divisor. An ideal J is an additive subgroup of R that is also closed under multiplication by R, i.e. J = RJ := {rj : j ∈ J, r ∈ R}. An ideal J ⊆ R is called a principal ideal if it is generated by a single element, i.e. J = xR := {xy : y ∈ R} for some x ∈ R. A commutative ring R is called a principal ideal ring (or PIR) if all its ideals are principal ideals. It is well-known that Z_d is a PIR [49, Chapter 2.2, Ex. 8]. An integral domain that is also a PIR is called a principal ideal domain (or PID). For example Z is a PID, as it can be shown that any ideal of Z is of the form xZ (hence a principal ideal), for some x ∈ Z.
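The fact that Z_d is a PIR can be checked exhaustively for small d: the ideal generated by any pair of elements equals the principal ideal generated by their gcd with d. A brute-force sketch (the helper name is ours):

```python
from math import gcd
from functools import reduce
from itertools import combinations

d = 12

def ideal(gens):
    """All Z_d-linear combinations of gens: the ideal they generate."""
    J = {0}
    changed = True
    while changed:
        changed = False
        for g in gens:
            for r in range(d):
                for j in list(J):
                    v = (r * g + j) % d
                    if v not in J:
                        J.add(v)
                        changed = True
    return J

# every 2-generated ideal of Z_12 is principal, generated by a gcd
for a, b in combinations(range(d), 2):
    g = reduce(gcd, (a, b, d))
    assert ideal([a, b]) == ideal([g])
```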
Given a ring R, a left module over R, or simply a left R-module M, is an abelian group together with an operation • : R × M → M, such that for all a, b ∈ R and x, y ∈ M we have a • (x + y) = a • x + a • y, (a + b) • x = a • x + b • x, (ab) • x = a • (b • x), and 1 • x = x. One can analogously define a right module over R, but any right R-module can be viewed as a left R-module and vice-versa, and for us this distinction will not be needed. Thus when the ring R is understood, we will simply say "a module M". A ring R is also an R-module, with the • operation being the same as ring multiplication. The most important modules we will have to deal with are the Z_d-module Z_d^k and the Z-module Z^k, i.e. the set of all k-tuples of elements of the ring Z_d and Z respectively, where the operation • is ring multiplication applied to each component of the tuple. Given an R-module M, a subset S ⊆ M is called a generating set (over R) of M if any y ∈ M can be expressed as y = Σ_{i=1}^k a_i x_i for some finite k, a_1, …, a_k ∈ R, and x_1, …, x_k ∈ S. We say that M is finitely generated if it has a finite generating set. A non-empty subset S ⊆ M is called linearly independent (over R) if and only if the following condition holds for every finite positive integer k, a_1, …, a_k ∈ R, and distinct elements x_1, …, x_k ∈ S: Σ_{i=1}^k a_i x_i = 0 implies a_1 = ⋯ = a_k = 0. A linearly independent subset S ⊆ M that is also a generating set of M is called a basis of M. We say that an R-module M is a free R-module if M has a basis, or is the zero module. For example, it is easy to show that both Z_d^k and Z^k are finitely generated and free: in both cases the set S = {e_1, e_2, …, e_k} is a basis for these modules, where (e_j)_i = 1 if i = j, else 0.

Minimal generating sets
A commutative ring R with a multiplicative identity element is an invariant basis number ring (see [50, Chapter 4, Definition 2.8] and [51, Section 1A] for a discussion of invariant basis number rings). This means that any finitely generated free R-module M has a unique finite dimension given by the size of any (hence every) basis of M, which we will call the rank of M; by convention, the zero module has zero rank. For a proof of this statement, the reader is referred to [47, Corollary 5.13] and the discussion thereafter. For example, the modules Z_d^k and Z^k discussed above both have rank k. Given an R-module M, a subset M′ ⊆ M is called a submodule if it is also an R-module. A key fact is that a submodule of a free module need not be free; for example, consider the submodule {0, 2} of Z_4, which does not have a basis ({2} is not linearly independent, since 2 · 2 ≡ 0 mod 4). Thus, a priori, the rank of a submodule M′ is not defined.
In order to extend the notion of rank to submodules of a finitely generated free R-module, we need the notion of minimal generators. Given a finitely generated R-module M, a generating set S ⊆ M is called a minimal generating set of M if there does not exist another generating set S′ ⊆ M such that |S′| < |S|. The quantity |S| is called the minimal number of generators of M, and we will denote it as Θ(M). For example, for the submodule {0, 2} of Z_4, the generating set {2} is a minimal generating set, and so Θ({0, 2}) = 1. If M is free, it is a well-known result that Θ(M) equals the rank of M, since R is commutative. So Θ(M) is indeed a generalization of rank to non-free modules.
Our main interest in the minimal number of generators is motivated by the need for a quantity analogous to the column rank of a matrix with entries in a field, in the situation we encounter in this paper where the entries of the matrix belong to a commutative ring R (in particular Z or Z_d). Suppose we have a matrix C ∈ R^{k×t}, and we label each of its t columns as c_0, c_1, …, c_{t−1}. Each column is an element of the R-module R^k, and together they generate a submodule of R^k, namely M_C := {Cv : v ∈ R^t}, the set of all R-linear combinations of the columns. We also note an important special case in the next lemma, still assuming that R is a commutative ring with a multiplicative identity, where the minimal number of generators for a matrix with special structure is known. We thank Luc Guyot for an outline of this proof in [52]. See Appendix B.1 for the proof. Lemma 2.2. Let C ∈ R^{k×t} be a matrix such that each row and column has at most one non-zero element, and suppose one of those non-zero elements is divisible by all the others. Then the minimal number of generators of the R-module M_C, generated by the columns of C, is equal to the number of non-zero elements of the matrix C.

Smith normal form
Let R be a commutative ring with multiplicative identity. Following [47, Chapter 15], R is called an elementary divisor ring if for every k, ℓ ≥ 1 and every C ∈ R^{k×ℓ}, there exist invertible matrices B ∈ R^{k×k} and B̃ ∈ R^{ℓ×ℓ} such that (i) A := BCB̃ is a diagonal matrix, and (ii) a_i | a_{i+1} for all 0 ≤ i < r − 1, where r = min{k, ℓ} and a_i := A_{ii} for every i. The diagonal matrix A is called the Smith normal form (SNF) of the matrix C, and we say that C admits a diagonal reduction. The non-zero diagonal elements of A are called the invariant factors of C.
We recall the following well-known result about the Smith normal form of a matrix with entries from a principal ideal ring R ([47, Theorems 15.9, 15.24]): if a commutative ring R is a principal ideal ring, then for all k, ℓ ≥ 1, every matrix C ∈ R^{k×ℓ} admits a diagonal reduction A := BCB̃, where B ∈ R^{k×k} and B̃ ∈ R^{ℓ×ℓ} are invertible matrices and A is the Smith normal form of C. The Smith normal form A is unique up to multiplication of the diagonal elements of A by units of R.
The following result about the SNF of a matrix and its relation to the minimal number of generators of the module generated by its columns will be useful in Section 6, and follows as a direct consequence of Lemma 2.1 and Lemma 2.2. Lemma 2.4. Let R be a commutative principal ideal ring. Suppose C ∈ R^{k×ℓ} has a Smith normal form A = BCB̃, for invertible matrices B, B̃ and A a diagonal matrix of appropriate shape. Let M_C be the submodule generated by the columns of C. Then Θ(M_C) equals the number of non-zero elements of A.
Choosing an integer n so that k, ℓ = O(n), finding the Smith normal form of a matrix takes Õ(n^{θ+1}) time [53], where Õ hides log factors and 2 < θ ≤ 3 is the exponent of matrix multiplication.
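For small matrices, the invariant factors can be computed transparently (though slowly) from determinantal divisors: the k-th invariant factor is d_k/d_{k−1}, where d_k is the gcd of all k × k minors. This is not the fast algorithm of [53], just a sketch for experimentation over Z; to work over Z_d one can append the columns of d·I and read the result mod d:

```python
from itertools import combinations
from math import gcd

def det(M):
    # integer determinant by cofactor expansion (fine for tiny matrices)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def invariant_factors(A):
    """Invariant factors of an integer matrix: a_k = d_k / d_(k-1),
    where d_k is the gcd of all k x k minors (and d_0 = 1)."""
    m, n = len(A), len(A[0])
    d_prev, factors = 1, []
    for k in range(1, min(m, n) + 1):
        d_k = 0
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d_k = gcd(d_k, det([[A[i][j] for j in cols] for i in rows]))
        if d_k == 0:
            break   # all remaining invariant factors are zero
        factors.append(d_k // d_prev)
        d_prev = d_k
    return factors

assert invariant_factors([[6, 0], [0, 10]]) == [2, 30]   # SNF of diag(6,10)
# by Lemma 2.4, Theta(M_C) = number of non-zero invariant factors
assert len(invariant_factors([[2, 4], [4, 8]])) == 1
```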

Howell normal form
We now introduce the Howell normal form of a matrix with entries in the ring Z_d, which will be useful for us in Section 6. The material here is borrowed from [54,55]. Although we only discuss the Z_d case here, the Howell normal form is also valid for a matrix over a principal ideal ring, and the reader is referred to [53,56] for details. Also note that we present the Howell normal form in terms of column spans of a matrix, instead of row spans as done in the prior literature (of course the two conventions are equivalent).
A matrix H ∈ Z_d^{k×ℓ} is said to be in reduced column echelon form if H satisfies the following properties: (i) If H has r non-zero columns, then its first r columns are non-zero.
(ii) For 0 ≤ i ≤ r − 1, let j_i be the row index of the first non-zero entry of column i. Then j_0 < j_1 < ⋯ < j_{r−1}. (iii) For 0 ≤ i ≤ r − 1, the matrix entry H_{j_i i} is a divisor of d over the integers. (iv) For 0 ≤ i ≤ r − 1 and 0 ≤ t < i, the entry H_{j_i t} satisfies 0 ≤ H_{j_i t} < H_{j_i i}.
If the matrix H only satisfies properties (i) and (ii) above, then it can be made to satisfy properties (iii) and (iv) also by post-multiplying H on the right by an invertible matrix over Z_d. We say that H is in Howell normal form (HNF) if, in addition to properties (i)-(iv), the matrix H satisfies the following additional property: (v) Let M_H be the submodule of Z_d^k generated by the columns of H. For each 0 ≤ i ≤ r − 1, consider the submodule M_i := {v ∈ M_H : v_t = 0 for all 0 ≤ t < j_i}. Then M_i is generated by columns i to r − 1 of H, for every 0 ≤ i ≤ r − 1.
Let us now state the main result about the Howell normal form. Given A ∈ Z_d^{k×ℓ}, we construct A′ := [A 0] ∈ Z_d^{k×(ℓ+k)} by adding k zero columns to A. Then there exists an invertible matrix L ∈ Z_d^{(ℓ+k)×(ℓ+k)} such that H := A′L is in Howell normal form. As part of the calculation of the Howell normal form, some algorithms (e.g. [53]) include computation of a matrix G ∈ Z_d^{(k+ℓ)×(k+ℓ)} whose columns generate the kernel of H, Ker(H) = {x ∈ Z_d^{k+ℓ} : Hx = 0}; so HG = 0. Moreover, the columns of LG provide a generating set for the kernel of A′. Choosing an integer n so that k, ℓ = O(n), finding both G and H takes Õ(n^{θ+1}) time.
With the Howell normal form H and kernel G, we can test membership in the column space of A, or equivalently solve linear systems Ax = b, where we are given b ∈ Z_d^k and want to determine whether a solution x ∈ Z_d^{k+ℓ} exists. Note that by invertibility of L, this is the same as finding a y = L^{−1}x such that Hy = b. Also, if y is a solution then so is y + Gv for any v ∈ Z_d^{k+ℓ}. Solving the inhomogeneous equation Hy = b is made easy by properties (i)-(v). In principle, one can compute the Howell normal form of the augmented matrix [H b]: writing [H b]L_b = [H′ b′] for the resulting Howell normal form, with L_b the invertible matrix relating them, one has b′ = 0 if and only if Hy = b is solvable [53]. In practice, one does not need to run the Howell normal form algorithm on [H b], but can instead achieve the same result by solving Hy = b one entry at a time, using the fact that H is in reduced column echelon form and has the Howell property (v).
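When every row of H carries a pivot, the entry-at-a-time solve is particularly simple: each pivot row determines one coordinate of y via a scalar congruence. A toy sketch over Z_6 (the helper functions and example matrix are ours; the general case additionally uses property (v) to adjust earlier choices):

```python
from math import gcd

d = 6

def solve_scalar(a, b):
    """One solution y of a*y = b (mod d), or None if none exists."""
    g = gcd(a, d)
    if b % g:
        return None
    # a/g is invertible mod d/g
    return (b // g) * pow(a // g, -1, d // g) % (d // g)

def solve_echelon(H, b, pivots):
    """Solve H y = b (mod d) for H in reduced column echelon form,
    assuming every row is a pivot row; pivots = [(row, col), ...]."""
    y = [0] * len(H[0])
    for (j, i) in pivots:
        rhs = (b[j] - sum(H[j][t] * y[t] for t in range(len(y)))) % d
        yi = solve_scalar(H[j][i] % d, rhs)
        if yi is None:
            return None
        y[i] = yi
    return y

H = [[2, 0], [1, 3]]          # pivots 2 and 3, both divisors of 6
y = solve_echelon(H, [4, 5], [(0, 0), (1, 1)])
assert y is not None
assert all(sum(H[j][t] * y[t] for t in range(2)) % d == [4, 5][j] for j in range(2))
assert solve_echelon(H, [1, 0], [(0, 0), (1, 1)]) is None  # 2*y0 = 1 mod 6 unsolvable
```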

Maximum number of non-commuting pairs of Paulis
In this section, we give a precise count of the maximum number of non-commuting pairs that one can achieve with n qudits. This generalizes the well-known result in the qubit (d = 2) case, where this count equals the number of qubits [19], and prepares for our construction in Section 5 of Pauli sets satisfying given commutation relations using the minimum number of qudits.
It is easy to establish the lower bound nm on the maximum number of non-commuting pairs as illustrated by the following example.

Example 1. Write d = p_0^{α_0} ⋯ p_{m−1}^{α_{m−1}}, where p_0, …, p_{m−1} are the distinct primes dividing d, and define the sets S := {X^{d/p_i^{α_i}} : 0 ≤ i ≤ m − 1} and T := {Z^{d/p_i^{α_i}} : 0 ≤ i ≤ m − 1}.
Then S, T is a collection of m non-commuting CSS pairs on a single qudit. On n qudits, this example can be applied individually to each qudit to get a collection of non-commuting CSS pairs of size nm.
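A natural single-qudit construction takes s_i = X^{d/p_i^{α_i}} and t_i = Z^{d/p_i^{α_i}} for each prime power p_i^{α_i} exactly dividing d; for d = 6 this gives the pairs (X^3, Z^3) and (X^2, Z^2) seen earlier. A brute-force check for d = 60, where m = 3 (helper names are ours):

```python
def prime_power_parts(d):
    """[p**alpha for each distinct prime p dividing d]."""
    parts, p, n = [], 2, d
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            parts.append(q)
        p += 1
    if n > 1:
        parts.append(n)
    return parts

d = 60   # = 2^2 * 3 * 5, so m = 3
S = [(d // q, 0) for q in prime_power_parts(d)]   # X^(d/p^alpha)
T = [(0, d // q) for q in prime_power_parts(d)]   # Z^(d/p^alpha)

for i, (a, b) in enumerate(S):
    for j, (s, t) in enumerate(T):
        c = (s * b - t * a) % d   # commutator exponent of the pair
        assert (c != 0) == (i == j)
```

The cross pairs commute because p_i^{α_i} p_j^{α_j} divides d for i ≠ j, while (d/p_i^{α_i})^2 is never a multiple of d.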
We now show that nm is also an upper bound. The key result underlying the proof is the following number-theoretic obstruction. Lemma 3.1 (Obstruction lemma). Suppose ℓ = p^γ for some prime number p and integer γ ≥ 1. For some k ≥ 2, let A be a k × k matrix with entries in Z. Then the following conditions cannot hold simultaneously: (i) det(A) = 0; (ii) A contains exactly k entries that are not divisible by ℓ, such that each row and each column of A contains exactly one such entry.
Proof.For contradiction, assume that such a matrix A exists satisfying both conditions (i) and (ii).
Let us denote the columns of A as c_0, …, c_{k−1}. We claim that there exist integers x_0, …, x_{k−1}, not all zero, such that Σ_{i=0}^{k−1} x_i c_i = 0, which we prove at the end. Now pick any row j. This row contains an element A_{jq} which is not divisible by ℓ, while all other elements of this row are divisible by ℓ. From the condition Σ_{i=0}^{k−1} x_i c_i = 0, we thus have

x_q A_{jq} = − Σ_{i ≠ q} x_i A_{ji}.   (9)

The right hand side of Eq. (9) is divisible by ℓ, and A_{jq} is not divisible by ℓ = p^γ, so we deduce that x_q is divisible by p. By repeating this argument for each row, and using condition (ii), we deduce that x_i is divisible by p for each i = 0, …, k − 1. Defining x′_i = x_i/p for each i, we now obtain Σ_{i=0}^{k−1} x′_i c_i = 0. We can keep repeating the argument, giving the infinite sequence x_i, x_i/p, x_i/p^2, …, each term an integer, for every i. But this implies x_i = 0 for every i, giving a contradiction.
We now prove the claim made in the previous paragraph. We may regard A as a matrix over the set of rational numbers Q, and then by condition (i) we have det(A) = 0 over Q. Thus the columns c_0, …, c_{k−1} are linearly dependent as column vectors over the field Q, which implies there exist y_0, …, y_{k−1} ∈ Q, not all zero, such that Σ_{i=0}^{k−1} y_i c_i = 0. Multiplying through by the least common denominator of the y_i immediately proves the claim.
We can now prove the following theorem. Theorem 3.2 (Largest size non-commuting pairs). The largest size of a collection of non-commuting pairs on n qudits is nm.
Proof. Suppose by way of contradiction that we have a collection of k > nm non-commuting pairs S = {s_0, …, s_{k−1}} and T = {t_0, …, t_{k−1}}. Then for every j, the integer ⟨s_j, t_j⟩ is not divisible by d. By the pigeonhole principle, there is some prime p (from the factorization of d) and a set J ⊆ {0, 1, …, k − 1} of n + 1 indices so that for each j ∈ J, ⟨s_j, t_j⟩ is not divisible by p^α, where α is the largest integer such that p^α divides d. Without loss of generality, one may assume that J = {0, 1, …, n}. Consider the matrix M ∈ Z^{2(n+1)×2(n+1)} whose rows and columns are indexed by the Paulis (s_0, …, s_n, t_0, …, t_n) and whose entries are the corresponding commutator exponents. By construction, M has exactly 2(n + 1) entries not divisible by ℓ := p^α, one in each row and each column, namely those corresponding to the pairs (s_i, t_i), so condition (ii) of Lemma 3.1 holds.
However, we also claim $\det(M) = 0$; thus, Lemma 3.1 gives our contradiction. To prove that the determinant is zero, we take vectors $u_0, \dots, u_{2n+1} \in \mathbb{Z}_d^{2n}$ so that $s_i = P(u_i)$ and $t_i = P(u_{i+n+1})$ for every $i = 0, \dots, n$. Then $M_{ij} = u_i^T \Lambda u_j \in \mathbb{Z}$, where $\Lambda$ is defined in Eq. (5). We now lift $M$ and the vectors $u_i$ to the field of real numbers, i.e. we may treat $M \in \mathbb{R}^{2(n+1) \times 2(n+1)}$ and $u_i \in \mathbb{R}^{2n}$ for every $i$. Because there are $2(n+1)$ vectors $u_i$, $i = 0, \dots, 2n+1$, in a space of dimension only $2n$, there are real numbers $x_0, \dots, x_{2n+1}$, not all zero, such that $\sum_{i=0}^{2n+1} x_i u_i = 0$. From this it follows that the columns of $M$ (and the rows) are linearly dependent and $\det(M) = 0$ over the reals, and hence also over $\mathbb{Z}$.
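The dimension-counting step of this proof is easy to check numerically: any $2(n+1)$ vectors in $\mathbb{Z}_d^{2n}$ are linearly dependent over $\mathbb{R}$, so the matrix $M_{ij} = u_i^T \Lambda u_j$ is always singular. A minimal sketch in Python (the block sign convention for $\Lambda$ is an assumption, since Eq. (5) is not reproduced here):

```python
from itertools import permutations
import random

def exact_det(M):
    # Exact integer determinant via the Leibniz expansion (small matrices only).
    k = len(M)
    total = 0
    for perm in permutations(range(k)):
        # sign of the permutation from its cycle decomposition
        sign, seen = 1, [False] * k
        for i in range(k):
            if not seen[i]:
                j, length = i, 0
                while not seen[j]:
                    seen[j] = True
                    j = perm[j]
                    length += 1
                if length % 2 == 0:
                    sign = -sign
        prod = 1
        for i in range(k):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

n, d = 2, 6
random.seed(0)

def pairing(u, v):
    # u^T Lambda v with Lambda assumed to be [[0, I], [-I, 0]] on (x|z) vectors
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n))

# 2(n+1) = 6 vectors in Z_d^{2n} with 2n = 4: necessarily dependent over R.
us = [[random.randrange(d) for _ in range(2 * n)] for _ in range(2 * (n + 1))]
M = [[pairing(u, v) for v in us] for u in us]
print(exact_det(M))  # 0, since rank(M) <= 2n < 2(n+1)
```

The determinant vanishes for every choice of the vectors, exactly as the lifting argument in the proof predicts.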
Note the following consequence of the theorem above. Proof. If $G$ contained such a non-commuting pair $(s, t)$, then the ordered sets $S' := S \cup \{s\}$ and $T' := T \cup \{t\}$ would form a collection of non-commuting pairs of size $nm + 1$, which contradicts Theorem 3.2.

Maximum size of a non-commuting set
It is well known in the qubit case ($d = 2$) that the maximum size of an anticommuting set on $n$ qubits is $2n + 1$ (see [57, Lemma 8], [58, Appendix G], [59, Theorem 1]). In Section 4.1, we generalize this result to a single qudit of arbitrary dimension $d$, and in Section 4.2 we provide bounds for the multi-qudit case. The case of constructing Pauli sets satisfying the commutation relations of parafermion operators [21] is a special case of the results in this section, which we specifically solve later in Corollary 5.5.

Non-commuting sets on a single qudit
In this section, we evaluate the gcd of elements of $\mathbb{Z}$ (or of $\mathbb{Z}_d$ considered as elements of $\mathbb{Z}$) over $\mathbb{Z}$, and define it to be positive to ensure that it is unique. Also, whenever we write $a/b$ for $a, b \in \mathbb{Z}_d$ (or $\mathbb{Z}$), we will always mean division over the integers (in particular, $b$ must divide $a$ over the integers). Our goal in this section is to prove the following theorem (Theorem 4.1), which determines the maximum size $\Psi(d)$ of a non-commuting set on a single qudit. Recall from the paragraph below Definition 2.3 that it suffices to find the size of a largest non-commuting subset of $P_1$. The basic strategy of the proof is to reduce this problem to finding a maximum clique$^a$ in a graph. To achieve this, we start by noticing that if $\binom{a}{b}, \binom{s}{t} \in \mathbb{Z}_d^2$, then for the Paulis $P(\binom{a}{b})$ and $P(\binom{s}{t})$ on one qudit represented by these vectors, Eq. (6) simplifies to $sb - ta \bmod d$. As a consequence, finding the largest set of non-commuting Paulis on one qudit is equivalent to finding a maximum clique $C_m$ in the undirected graph whose vertices are the elements of $\mathbb{Z}_d^2$, with an edge between $\binom{a}{b}$ and $\binom{s}{t}$ if and only if $sb - ta \not\equiv 0 \pmod d$. This commutation graph (or its complement) has been studied before in relation to projective geometry, e.g. [61,62,63], and mutually unbiased bases [64]. We will show in several steps below that $|C_m| = \Psi(d)$. In the remainder of the proof of Theorem 4.1, we adopt the following notation: any product of the form $\prod_{p \mid r}$, where $r$ is a positive integer, is taken over all distinct primes $p$ that divide $r$. To count $|W|$, we let $(a, b) \in W$ and allow $a \in \mathbb{Z}_d$ to be arbitrary. Now $b \in \mathbb{Z}_d$ is restricted by the choice of $a$: namely, if $\gcd(a, d) > 1$, then $b$ cannot be $0$ or share any prime factors with $\gcd(a, d)$. Each prime factor $p$ of $\gcd(a, d)$ reduces the allowable set of $b$ by a factor of $(p-1)/p = (1 - 1/p)$. We thus have the following equality over the rationals: It turns out $H$ is multiplicative, which we show in Appendix B.2.
Therefore, it is easy to evaluate $H$ for prime powers: here $J_2$ is the second of Jordan's totient functions [65]. With $H(d)$ in hand, we now conclude the calculation: This completes the proof of Theorem 4.1.
a In an undirected graph G(V, E), a clique is a subset of vertices with the property that any two distinct vertices have an edge connecting them.
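The clique reformulation can be explored by brute force for very small $d$; the sketch below (which assumes the single-qudit pairing $sb - ta \bmod d$ from the commutator formula) recovers the known maximum sizes $3$ for $d = 2$ and $4$ for $d = 3$:

```python
from itertools import combinations

def max_noncommuting(d):
    # Vertices: (a, b) in Z_d^2; edge iff the commutator exponent s*b - t*a != 0 mod d.
    verts = [(a, b) for a in range(d) for b in range(d)]

    def edge(u, v):
        return (v[0] * u[1] - v[1] * u[0]) % d != 0

    best = 0
    for size in range(1, len(verts) + 1):
        found = False
        for sub in combinations(verts, size):
            # a clique = a set of pairwise non-commuting single-qudit Paulis
            if all(edge(p, q) for p, q in combinations(sub, 2)):
                best, found = size, True
                break
        if not found:   # no clique of this size, so none larger either
            break
    return best

print(max_noncommuting(2), max_noncommuting(3))  # 3 4
```

For $d = 2$ this reproduces the anticommuting triple $\{X, Y, Z\}$, and for $d = 3$ the size-4 qutrit set mentioned in the next subsection.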

Bounds on the size of non-commuting sets on n qudits
Suppose we have a non-commuting set $S = \{s_i : i = 0, \dots, |S| - 1\}$ on $n$ qudits and another non-commuting set $S'$ on $n'$ qudits. Then one can combine them into a non-commuting set of size $|S| + |S'| - 1$ on $n + n'$ qudits. We call this the Jordan-Wigner composition of the sets $S$ and $S'$, in analogy to the Jordan-Wigner encoding of fermions into qubits [26]. The Jordan-Wigner composition gives a simple lower bound on the size of non-commuting sets on $n$ qudits. Proof. The Jordan-Wigner composition of $t$ non-commuting sets $S_i$, $i = 0, \dots, t-1$, gives a non-commuting set of size $\sum_{i=0}^{t-1} |S_i| - (t - 1)$. With $n$ qudits, each supporting a maximum-size non-commuting set $S_i$ with size $|S_i| = \Psi(d)$ (Theorem 4.1), this gives a lower bound of $(\Psi(d) - 1)n + 1$ on the size of the maximum non-commuting set.
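The composition can be sketched on symplectic vectors: pad all but one element of $S$ with the identity, and extend the remaining element by each element of $S'$. Below, the maximum qutrit set and the pairing convention are illustrative assumptions:

```python
from itertools import combinations

d = 3
# A maximum non-commuting set on one qutrit, as (x, z) symplectic vectors.
S = [(1, 0), (0, 1), (1, 1), (1, 2)]

def pairing(u, v, n):
    # symplectic form mod d on length-2n vectors written as (x-part | z-part)
    return sum(v[i] * u[n + i] - v[n + i] * u[i] for i in range(n)) % d

def jordan_wigner(S1, S2):
    # Elements of S1 (except the last) act as identity on the second qudit;
    # the last element of S1 is extended by every element of S2.
    comp = [(x, 0, z, 0) for (x, z) in S1[:-1]]
    xs, zs = S1[-1]
    comp += [(xs, x2, zs, z2) for (x2, z2) in S2]
    return comp

C = jordan_wigner(S, S)
print(len(C))  # 4 + 4 - 1 = 7
# every pair of distinct elements still fails to commute
assert all(pairing(u, v, 2) != 0 for u, v in combinations(C, 2))
```

The resulting size-7 set on two qutrits matches the exhaustively found maximum quoted below.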
If larger non-commuting sets are found in special cases, the lower bound can be improved. For instance, for $d = 3$, maximum non-commuting sets on $n = 1$ and $n = 2$ qutrits have sizes 4 and 7, respectively (the case $n = 2$ was found via exhaustive computer search), matching the $dn + 1$ lower bound that Corollary 4.4 implies. But for $n = 3$, we have a computer-verified, size-13, maximum non-commuting set, where the Paulis are $P(c)$ for each column $c$ of the matrix. The Jordan-Wigner composition therefore implies a lower bound on non-commuting sets of qutrit Paulis of $12\lfloor n/3 \rfloor + 3(n \bmod 3) + 1$.
Similarly, for $d = 4$ and $n = 2$, a maximum non-commuting set has size 20. This implies the lower bound of $19\lfloor n/2 \rfloor + 5(n \bmod 2) + 1$ for the maximum size of non-commuting sets on $n$ qudits with $d = 4$.

Achievable commutation relations
We established in Section 3 that the maximum number of non-commuting pairs that can exist on $n$ qudits is $nm$. Thus, if $(f_0, \dots, f_{k-1}) \in (\mathbb{Z}_d \setminus \{0\})^k$ is an achievable non-commuting pair relation on $n$ qudits, we must have $1 \leq k \leq nm$. Our first goal in this section is to characterize exactly which tuples $(f_0, \dots, f_{k-1})$ are achievable non-commuting pair relations on $n$ qudits. Then, we will use this characterization to construct Pauli multisets with given commutation relations.

Achievable tuples
Any tuple $(f_0, \dots, f_{k-1}) \in (\mathbb{Z}_d \setminus \{0\})^k$ is an achievable non-commuting pair relation on $k$ qudits. On one qudit, $\langle Z^{f_i}, X \rangle_d = f_i$, and so we can use a separate qudit to achieve each of the $k$ non-commuting pair relations. See also Lemma C.1(iii) for a more formal argument.
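This one-qudit-per-pair construction is immediate to verify on symplectic vectors; a sketch (the sign convention of the symplectic form is an assumption):

```python
d = 30                     # composite local dimension, as in the running examples
f = [2, 5, 6, 11, 15]      # target non-commuting pair relation, entries in Z_d \ {0}
k = len(f)

def pairing(u, v, n):
    # symplectic form <u, v>_d on length-2n vectors (x-part | z-part)
    return sum(v[i] * u[n + i] - v[n + i] * u[i] for i in range(n)) % d

# Pair i: s_i = X on qudit i, t_i = Z^{f_i} on qudit i.
S = [[1 if j == i else 0 for j in range(k)] + [0] * k for i in range(k)]
T = [[0] * k + [f[i] if j == i else 0 for j in range(k)] for i in range(k)]

for i in range(k):
    for j in range(k):
        expected = f[i] % d if i == j else 0
        assert pairing(T[i], S[j], k) == expected  # <Z^{f_i}, X>_d = f_i on qudit i
print("one qudit per pair achieves f")
```

Of course, $k$ qudits is wasteful; the results below determine the true minimum number of qudits.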
Our first result is a lower bound on the minimum number of qudits needed to achieve a given non-commuting pair relation $(f_0, \dots, f_{k-1}) \in (\mathbb{Z}_d \setminus \{0\})^k$, improving on the lower bound $\lceil k/m \rceil$ implied by Theorem 3.2. Lemma 5.1. Let $S = \{s_0, s_1, \dots, s_{k-1}\}$ and $T = \{t_0, t_1, \dots, t_{k-1}\}$ be a collection of non-commuting pairs of size $k$ on $n$ qudits, and for every $i$, let $f_i := \langle s_i, t_i \rangle$. Define the multiset $F_j := \{f_i : f_i \not\equiv 0 \pmod{p_j^{\alpha_j}}\}$. Then $n \geq \max_j |F_j|$. Proof. If $k \leq n$, then there is nothing to prove. So we assume $k > n$, and let $j$ be arbitrary. For the sake of contradiction, suppose that $\ell := |F_j| > n$; without loss of generality, one may then further assume that $F_j = \{f_0, f_1, \dots, f_{\ell-1}\}$. Define $J := \{0, \dots, \ell - 1\}$, and then define the matrix $M \in \mathbb{Z}^{2\ell \times 2\ell}$ as in the proof of Theorem 3.2. Then, by exactly the same argument there, and using $\ell > n$, we conclude that (i) $\det(M) = 0$, and (ii) every row and column of $M$ has exactly one element not divisible by $p_j^{\alpha_j}$. But this is impossible by Lemma 3.1, which gives us the desired contradiction.

Remark. In Lemma 5.1, one can replace
and the conclusion still holds. This is because, for every $i$ and $j$, we have $\langle s_i, t_i \rangle \equiv 0$ (mod …). Remarkably, $n = \max_j |F_j|$ is also a sufficient number of qudits to achieve the non-commuting pair relation $(f_0, \dots, f_{k-1})$. We prove this below, but before that, we present an example of the construction.
Example 2. Suppose $d = 30 = 2 \times 3 \times 5$ (so $m = 3$ prime factors), $n = 3$, and $f = (2, 5, 6, 11, 15)$. Then $F_0 = \{5, 11, 15\}$ (the elements of $f$ not divisible by 2), $F_1 = \{2, 5, 11\}$ (elements not divisible by 3), and $F_2 = \{2, 6, 11\}$ (elements not divisible by 5). We fill a $|f| \times m$ matrix $Q$ whose elements are from $\mathbb{Z}_{n+1}$, subject to the constraints (1). Rows $0, \dots, 4$ of $Q$ are used to construct $p_0, \dots, p_4$, $X$-type Paulis, and $q_0, \dots, q_4$, $Z$-type Paulis, that form non-commuting pairs achieving $f$. To construct $p_i$, a non-zero entry $Q_{ij}$ says that the prime factor labeled $j$ should be excluded from the power of $X$ on qudit $Q_{ij}$. We do the same for $q_i$, but with powers of $Z$ and additional factors of $-\beta_i \in \mathbb{Z}_d$ in the exponents. Note that $\langle p_i, q_j \rangle_d = 0$ for $i \neq j$ regardless of the values of the $\beta_i$. The $\beta_i$ must be chosen so that $\langle p_i, q_i \rangle_d = f_i$. It was not a coincidence that we could find values of $\beta_i$ producing the desired non-commutation relations in this example; Lemma B.1 in the appendix gives the necessary statement in general. This also enables the following constructive proof that $n = \max_j |F_j|$ qudits are sufficient in general to achieve a non-commuting pair relation.

Theorem 5.2 (Minimum qudits achieving a non-commuting pair relation). Suppose that we are given $f = (f_0, \dots, f_{k-1}) \in (\mathbb{Z}_d \setminus \{0\})^k$, with the multisets $F_j$ defined as in Lemma 5.1. Then the minimum number of qudits for which $f$ is an achievable non-commuting pair relation is $\max_j |F_j|$. The non-commuting pairs generating $f$ can be chosen to be CSS. Proof. By Lemma 5.1, $n \geq \max_j |F_j|$ qudits are necessary to achieve $f$. To show that $\max_j |F_j|$ qudits are sufficient, consider a matrix $Q \in \mathbb{Z}_{n+1}^{k \times m}$, where $Q_{ij} \neq 0$ if and only if $f_i \in F_j$. Because $n = \max_j |F_j|$, by choosing non-zero entries from $\mathbb{Z}_{n+1} \setminus \{0\}$, we can arrange that within each column of $Q$ every non-zero entry is unique. We interpret these non-zero entries as qudit labels for qudits $1, 2, \dots, n$.
For each row $i = 0, \dots, k-1$, we now show how to construct a pair of Paulis $X(u_i)$ and $Z(v_i)$ that are supported only on the qudits indicated in that row and satisfy $\langle X(u_i), Z(v_i) \rangle_d = f_i$. Then the ordered sets $\{X(u_0), X(u_1), \dots, X(u_{k-1})\}$ and $\{Z(v_0), Z(v_1), \dots, Z(v_{k-1})\}$ are CSS non-commuting pairs generating $f$. Define the sets of indices in row $i$ where $Q$ takes value $h$: $S_{ih} := \{j : Q_{ij} = h\}$ and $\bar{S}_{ih} := \{j : Q_{ij} \neq h\}$. By construction, $p_j^{\alpha_j}$ divides $f_i$ for all $j \in S_{i0}$, and so $f_i = \gamma \prod_{j \in S_{i0}} p_j^{\alpha_j}$ for some integer $\gamma$.
where $\beta$ is some integer (depending on $i$) to be determined. Choose $\beta = \gamma\beta'$ for another integer $\beta'$. Apply Lemma B.1 with $d' = f_i$, where we note that for any $j \notin S_{i0}$ (i.e. exactly the set of indices for which $p_j^{\alpha_j}$ does not divide $f_i$), $p_j$ does not divide $d''$. To elaborate, $p_j$ divides all but one term in the sum in Eq. (19). Therefore, Lemma B.1 implies there is some suitable integer $\beta'$. Because within each column of $Q$ the entries are unique, $\bar{S}_{ih} \cup \bar{S}_{jh} = \{0, 1, \dots, m-1\}$ for $i \neq j$. Thus, $d$ divides $\langle X(u_i), Z(v_j) \rangle$ and $\langle X(u_i), Z(v_j) \rangle_d = 0$.
Achievable non-commuting pair relations $f$ of maximum size, i.e. $|f| = nm$, can be characterized more directly. We do so in Appendix C.

Qudits needed to achieve a matrix of commutation relations
If $R$ is a principal ideal ring, and $C \in R^{k \times k}$ is a matrix satisfying $C_{ii} = 0$ and $C_{ij} = -C_{ji}$ for all $i, j = 0, 1, \dots, k-1$, then such a matrix is called an alternating matrix over $R$. Suppose we are given an alternating matrix $C \in \mathbb{Z}_d^{k \times k}$ and wish to find $P \in \mathbb{Z}_d^{k \times 2n}$ such that $P \Lambda P^T = C$. Here, the rows of $P$ can be interpreted as $n$-qudit Paulis possessing the commutation relations specified by $C$. The goal of this section is to answer the following question: what is the minimum number of qudits $n$ for which such a $P$ can be found?
The following lemma, proved constructively in Appendix A, is useful in answering this question.

Lemma 5.3 (Alternating Smith Normal Form). Given an alternating matrix $A \in \mathbb{Z}_d^{k \times k}$, there exist matrices $B, L \in \mathbb{Z}_d^{k \times k}$, where $B$ is alternating with at most one non-zero entry per row and column and $L$ is invertible, such that $A = LBL^T$. We may further arrange $B$ so that it is non-zero only in the top-left $2r \times 2r$ block, which has the form $\bigoplus_{i=1}^{r} \left(\begin{smallmatrix} 0 & \beta_i \\ -\beta_i & 0 \end{smallmatrix}\right)$, where $2r = \Theta(M_A)$, $M_A$ is the $\mathbb{Z}_d$-submodule generated by the columns of $A$, and each $\beta_i \in \mathbb{Z}_d$ is non-zero, satisfying $\beta_i \mid \beta_{i+1}$ for all $i < r$. Also, for all $i = 1, \dots, r$, $\beta_i$ is uniquely determined up to multiplication by a unit by the formula $\beta_i = d_{2i-1}/d_{2i-2} \bmod d$, where $d_j$ is the greatest common divisor of all $j \times j$ minors of $A$ (and $d_0 := 1$).
Remark. Note that in the formulas for $\beta_i$ in the above lemma, both the minors and the greatest common divisor are first evaluated over the integers, and then the division is also performed over the integers; finally, the modulo-$d$ operation gives back an element of $\mathbb{Z}_d$. Similar to the remark following Lemma A.2, one should note that the smallest integer $j$ for which $d_j/d_{j-1} \equiv 0 \pmod d$ is $2r + 1$, and thus an odd integer (the fact that this integer is odd for any $d$ is itself quite interesting). In fact, if $k$ is odd, then such an odd integer $j$ must exist, as $d_k = 0$. We can also easily deduce that $d_{j'} \neq 0$ for all $j' < j$. Furthermore, it follows from Lemma A.2 that the same holds for all $j' \geq j$ such that $d_{j'-1} \neq 0$. Apply the lemma to $C$, finding invertible $L$ and alternating $B$ such that $C = LBL^T$. Then define $Q \in \mathbb{Z}_d^{k \times 2n}$ realizing $B$ (i.e. $B = Q\Lambda Q^T$), and let $r = \Theta(M_C)/2$, with $M_C$ denoting the submodule generated by the columns of $C$. Now, since $B$ is alternating with at most one non-zero entry per row and column, the Paulis represented by the first $2r$ rows of $Q$ are simply non-commuting pairs (the last $k - 2r$ rows can be chosen to be all zeros, representing identity Paulis). In Theorem 5.2, we concluded that the necessary and sufficient number of qudits needed to achieve the set of non-commuting relations $(\beta_1, \dots, \beta_r)$ is $\max_j |F_j|$, where $F_j := \{\beta_i : \beta_i \not\equiv 0 \pmod{p_j^{\alpha_j}},\ i = 1, \dots, r\}$. Now there exists some $p_j$ so that $\beta_r \not\equiv 0 \pmod{p_j^{\alpha_j}}$, because otherwise we would have $\beta_r = 0$. Since $\beta_i \mid \beta_{i+1}$ for all $i < r$, if $\beta_r \not\equiv 0 \pmod{p_j^{\alpha_j}}$, then $\beta_i \not\equiv 0 \pmod{p_j^{\alpha_j}}$ for all $i$. Thus, $\max_j |F_j| = r$ qudits are necessary and sufficient to construct $Q$, and also $P = LQ$.
The above establishes the following theorem: Theorem 5.4 (Minimum qudits achieving a matrix of commutation relations). Given an alternating matrix $C \in \mathbb{Z}_d^{k \times k}$, the minimum number of qudits $n$ for which there exists $P \in \mathbb{Z}_d^{k \times 2n}$ with $P\Lambda P^T = C$ is $r = \Theta(M_C)/2$, where $M_C$ is the submodule generated by the columns of $C$. (ii) For every even $j$, the determinant (evaluated over the integers) of the top-left $j \times j$ block of the matrix $C$ is $1$, which is easily verified by bringing $C$ to upper-triangular form using row (or column) operations. Then, by the chain-of-divisibilities condition mentioned in the remark following Lemma A.2, we conclude that for $t = 1$ we have $d_j = 1$ for all $j$ if $k$ is even, while if $k$ is odd, then $d_k = 0$ and $d_j = 1$ for all $j < k$. Now for arbitrary $t \in \mathbb{Z}_d \setminus \{0\}$, we simply note that $d_j$ equals $t^j$ times the value of $d_j$ for the case $t = 1$. Combining these facts, and using the formulas in Lemma 5.3, we obtain the following: From these observations, we immediately obtain the following corollary as a direct consequence of Theorem 5.4: Corollary 5.5. Let $t \in \mathbb{Z}_d \setminus \{0\}$. Then the largest size of a non-commuting set $S$ on $n$ qudits, such that $\langle p, q \rangle_d = \pm t$ for every distinct $p, q \in S$, is $2n + 1$.
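The uniqueness formula of Lemma 5.3 can be sanity-checked on a matrix that is already in alternating Smith normal form, where the gcds $d_j$ of $j \times j$ minors should return the $\beta_i$ as successive quotients. A sketch (assuming the quotient form $\beta_i = d_{2i-1}/d_{2i-2}$; this is an illustration of the formula, not an ASNF algorithm):

```python
from itertools import combinations
from math import gcd

def exact_det(M):
    # integer determinant by cofactor expansion (small matrices only)
    k = len(M)
    if k == 1:
        return M[0][0]
    total = 0
    for j in range(k):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * exact_det(minor)
    return total

def minor_gcd(A, j):
    # d_j: gcd of all j x j minors, evaluated over the integers
    k, g = len(A), 0
    for rows in combinations(range(k), j):
        for cols in combinations(range(k), j):
            sub = [[A[r][c] for c in cols] for r in rows]
            g = gcd(g, abs(exact_det(sub)))
    return g

# Alternating matrix already in the normal form, with beta_1 = 2 dividing beta_2 = 4.
b1, b2 = 2, 4
A = [[0,   b1, 0,   0],
     [-b1, 0,  0,   0],
     [0,   0,  0,   b2],
     [0,   0, -b2,  0]]
dj = [1] + [minor_gcd(A, j) for j in range(1, 5)]   # d_0 := 1
print(dj[1] // dj[0], dj[3] // dj[2])  # recovers beta_1, beta_2: 2 4
```

Here $d_1 = \beta_1$, $d_2 = \beta_1^2$, $d_3 = \beta_1^2\beta_2$, and $d_4 = (\beta_1\beta_2)^2$, so the odd-indexed quotients recover the $\beta_i$ as claimed.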

Some group theoretic results
In this section, we depart from the previous sections, where we studied Paulis modulo phases, and take up the study of the Heisenberg-Weyl Pauli group $P_n$ for a $d$-dimensional qudit without ignoring the phases. We begin by establishing the notion of equivalent generating sets in Theorem 6.3. Using this, we give in Section 6.1 a characterization of a near-minimal generating set of any subgroup of $P_n$, along with an algorithm to compute such a generating set. We also provide an (inefficient) algorithm to compute a minimal generating set of any subgroup of $P_n$ from a near-minimal generating set. Next, in Section 6.2, we provide a way to compute a Gram-Schmidt generating set of any subgroup of $P_n$. This generalizes the well-known stabilizer-destabilizer decomposition [43] of the qubit Pauli group to the qudit case. Finally, in Section 6.3 we develop some results to compute the size of any subgroup of $P_n$. This generalizes similar results in [42], where only the special case of stabilizer subgroups was studied. Of particular note in this subsection is the square-free theorem (Theorem 6.15), which says that maximal sets of non-commuting pairs generate the qudit Pauli group when $d$ is square-free.
Let us first introduce some notation that will help the discussion. All products in this section will be ordered, unless mentioned otherwise. Recall from Section 2.1 that any element $p \in P_n$ has the form $\omega^j P(v)$ for some $v \in \mathbb{Z}_d^{2n}$ and $j \in \mathbb{Z}_d$, where $\omega = e^{2\pi i/d}$, and for a given $p$, the corresponding values of $j$ and $v$ are uniquely determined. Thus we can equivalently represent $p$ by the tuple $(j, v)$, and the map $p \mapsto (j, v)$ sets up a bijection $P_n \to \mathbb{Z}_d \times \mathbb{Z}_d^{2n}$. For any $p \in P_n$, we define $p^0 := I$. Representing a Pauli as an element of $\mathbb{Z}_d \times \mathbb{Z}_d^{2n}$, we define the projection maps $\pi_1$ and $\pi_2$ onto the first and second factors respectively, i.e. $\pi_1((j, v)) = j$ and $\pi_2((j, v)) = v$ for a Pauli represented by the tuple $(j, v)$. Similarly, for an ordered multiset $S := \{q_0, q_1, \dots, q_{k-1}\}$, we use the notation $\pi_1(S) := \{\pi_1(q_0), \pi_1(q_1), \dots, \pi_1(q_{k-1})\}$ and $\pi_2(S) := \{\pi_2(q_0), \pi_2(q_1), \dots, \pi_2(q_{k-1})\}$. We may unambiguously associate $\pi_2(S)$ with a $2n \times k$ matrix with elements in $\mathbb{Z}_d$, whose $j$th column is $\pi_2(q_j)$; we will also denote this matrix by $\pi_2(S)$ when there is no chance for confusion. Now suppose we have two Paulis $p, q \in P_n$ represented by $(j, v)$ and $(k, w)$ respectively. Then it is an easy exercise to check that $pq$ is represented by $(\ell, v + w)$, where $\ell = j + k + v^T \left(\begin{smallmatrix} 0 & 0 \\ I & 0 \end{smallmatrix}\right) w$, evaluated modulo $d$, and where $I$ here is an $n \times n$ identity matrix. From this, one can also easily show that for $t \geq 1$, $p^t$ is represented by $(\ell, tv)$, where $\ell = tj + \frac{t(t-1)}{2} v^T \left(\begin{smallmatrix} 0 & 0 \\ I & 0 \end{smallmatrix}\right) v$, evaluated modulo $d$; thus computing both $pq$ and $p^t$ takes $O(n)$ operations (for bounded $t$). The following well-known lemma is easy to establish (see also [62,64]): Lemma 6.1. For any $p \in P_n$, $p^d = I$ if $d$ is odd, and $p^d = \pm I$ if $d$ is even. Thus in the case of odd $d$, the maximum possible order of any Pauli is $d$, while in the case of even $d$, the maximum possible order of a Pauli is $2d$ (for example, on 1 qudit, the Pauli $XZ$ has order $2d$ when $d$ is even).
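The tuple representation and the product and power formulas above translate directly into code; the sketch below checks the closed-form power formula against repeated multiplication (variable names are illustrative):

```python
import random

n, d = 2, 6
random.seed(1)

def lower_form(v, w):
    # v^T [[0,0],[I,0]] w = sum_i v_z[i] * w_x[i], with v = (x-part | z-part)
    return sum(v[n + i] * w[i] for i in range(n))

def mul(p, q):
    # p = (j, v), q = (k, w)  ->  (j + k + v^T [[0,0],[I,0]] w, v + w) mod d
    (j, v), (k, w) = p, q
    phase = (j + k + lower_form(v, w)) % d
    return (phase, tuple((a + b) % d for a, b in zip(v, w)))

def power(p, t):
    # closed form: (t*j + t(t-1)/2 * v^T [[0,0],[I,0]] v, t*v) mod d
    j, v = p
    phase = (t * j + (t * (t - 1) // 2) * lower_form(v, v)) % d
    return (phase, tuple((t * a) % d for a in v))

p = (random.randrange(d), tuple(random.randrange(d) for _ in range(2 * n)))
acc = (0, (0,) * (2 * n))      # identity Pauli
for t in range(1, 10):
    acc = mul(acc, p)
    assert acc == power(p, t)  # closed form matches repeated multiplication
print("power formula verified")
```

Both `mul` and `power` cost $O(n)$ arithmetic operations, matching the operation counts quoted above.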
Next we would like to find a way to transform a generating set of a subgroup of $P_n$ into another generating set of the same group with the same number of generators. Before we present the main theorem on this, let us consider the case of one generator. Suppose $p \in P_n$ generates a group $G$. For some unit $s \in \mathbb{Z}_d$, let us define $q := p^s$. We can then show that $q$ also generates $G$.
To see this, let $t \in \mathbb{Z}_d$ be the unit such that $st = 1$ in $\mathbb{Z}_d$; i.e., treating $s, t$ as elements of $\mathbb{Z}$, we have $st = kd + 1$ for some non-negative integer $k$. Then $q^t = (p^s)^t = p^{st} = p^{kd+1} = p^{kd}p$. Now if $d$ is odd, then $p^{kd} = I$, and if $d$ is even, then $p^{kd} = \pm I$ by Lemma 6.1; thus $q^t = \pm p$. If $q^t = p$, then $q$ generates $G$. The case $q^t = -p$ is more interesting. This is precisely the case when $p^d = -I$ and $d$ is even, and thus $q^{t(d+1)} = q^{td}q^t = p$, which again shows that $q$ generates $G$. Thus we have proved that $\langle p \rangle = \langle q \rangle = G$. Theorem 6.3 generalizes this observation to multiple generators. We will also need Lemma 6.2, whose proof is easy and is left to the reader.
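This one-generator argument can be checked exhaustively for a small even $d$; a sketch using the tuple representation (conventions as above are assumed):

```python
n, d = 1, 6  # one qudit, even local dimension

def lower_form(v, w):
    # v^T [[0,0],[I,0]] w for vectors written as (x-part | z-part)
    return sum(v[n + i] * w[i] for i in range(n))

def mul(p, q):
    # group law on (phase, vector) tuples
    (j, v), (k, w) = p, q
    return ((j + k + lower_form(v, w)) % d,
            tuple((a + b) % d for a, b in zip(v, w)))

def generated(p):
    # cyclic group generated by p: iterate until we revisit an element
    elems, cur = set(), (0, (0,) * (2 * n))
    while cur not in elems:
        elems.add(cur)
        cur = mul(cur, p)
    return elems

p = (0, (1, 1))                      # the Pauli XZ; has order 2d for even d
assert len(generated(p)) == 2 * d

s = 5                                # a unit of Z_6
q = (0, (0,) * (2 * n))
for _ in range(s):                   # q = p^s by repeated multiplication
    q = mul(q, p)
assert generated(q) == generated(p)  # <p^s> = <p> for any unit s
print("unit powers generate the same group")
```

Note that here $p^d = -I$, so the more interesting case $q^t = -p$ of the argument above is exercised.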
Notation: Recall that $[p, q]$ is the group commutator of $p, q \in P_n$. Let $S \subseteq P_n$ be a set or multiset. We denote by $I_S := \{p \in \langle S \rangle : \pi_2(p) = 0\}$ the set of all elements of $\langle S \rangle$ that are proportional to $I$. We also denote $J_S := \{[p, p'] : p, p' \in S\} \cup \{p^d : p \in S\}$. Then $I_S$ is a subgroup, and moreover we have $\langle J_S \rangle \subseteq I_S \subseteq \langle S \rangle$. Lemma 6.2. Suppose $S := \{q_0, q_1, \dots, q_{k-1}\} \subseteq P_n$ is a multiset. Then (iv) Let $T := \{q'_0, q'_1, \dots, q'_{k-1}\} \subseteq P_n$ be a different multiset such that $q'_j = q_j \omega^{\beta_j}$ for some $\beta_j \in \mathbb{Z}_d$ for each $j$. Then $\langle J_T \rangle = \langle J_S \rangle$. Theorem 6.3 (Equivalent generating sets). Suppose $S := \{q_0, q_1, \dots, q_{k-1}\}$ is an ordered multiset of elements of $P_n$. Let $A \in \mathbb{Z}_d^{k \times k}$ be an invertible matrix, and consider the ordered multiset $T := \{q'_0, q'_1, \dots, q'_{k-1}\} \subseteq P_n$, where we define $q'_i := \prod_{j=0}^{k-1} q_j^{A_{ji}}$ (the product taken in increasing order of $j$). Then $\langle T \rangle = \langle S \rangle$. Proof. This follows a similar line as the one-generator example above; keeping track of the phases is facilitated by Lemma 6.2. A full proof is given in Appendix B.4.
Our next goal is to provide a complete characterization of the subgroup of phases $I_S$, given a generating set $S \subseteq P_n$. For instance, for any $p, q \in S$, both $[p, q]$ and $p^d$ are proportional to the identity and are thus members of $I_S$. There may be additional products of generators that are proportional to the identity as well, and these can be read off from the kernel of $\pi_2(S)$. That these are all the elements of $I_S$ is the content of the next lemma.
Recall that given a matrix $A \in \mathbb{Z}_d^{k \times \ell}$, the kernel of $A$ is the submodule of $\mathbb{Z}_d^{\ell}$ defined as $\ker(A) := \{v \in \mathbb{Z}_d^{\ell} : Av = 0\}$. Lemma 6.4. Suppose $S := \{q_0, q_1, \dots, q_{k-1}\} \subseteq P_n$ is an ordered multiset. Consider the matrix $\pi_2(S) \in \mathbb{Z}_d^{2n \times k}$, and let $K \in \mathbb{Z}_d^{k \times \ell}$ be such that the columns of $K$ are a generating set for $\ker(\pi_2(S))$. Then we have the following: The group $\mathcal{K} = \langle \omega I \rangle$ is isomorphic to the additive group on $\mathbb{Z}_d$. Subgroups of $\mathbb{Z}_d$ have the property that they can always be generated by one element (in other words, $\mathbb{Z}_d$ is a principal ideal ring). Thus, the following lemma says the same is true of subgroups of $\mathcal{K}$. Lemma 6.5. Let $S := \{\omega^{\mu_j} I : j = 0, 1, \dots, t,\ \mu_j \in \mathbb{Z}_d\} \subset P_n$, and let $\mathcal{I}$ be the ideal generated by $\{\mu_0, \dots, \mu_t\}$. Then (i) $\langle S \rangle = \{\omega^{\mu} I : \mu \in \mathcal{I}\}$, and (ii) $\langle S \rangle = \langle \omega^{\mu} I \rangle$, where $\mu = \gcd(\mu_0, \dots, \mu_t, d)$ is evaluated over the integers.
Algorithm 1 Return a generator for $I_S$ given $S \subseteq P_n$
1: procedure IdentityGenerator($S := \{q_0, q_1, \dots, q_{k-1}\}$, $d$)
2:   $\mu \leftarrow 0$
3:   if $d$ is even then  ▷ Generate phases of $I_S$ from $d$th powers of generators
4:     for $j = 0, \dots, k-1$ do
5:       $t \leftarrow \pi_1(q_j^d)$
6:       if $t \neq 0$ then
7:         $\mu \leftarrow d/2$; exit for loop
8:   if $\mu = 1$ then
9:     return $\omega I$
10:  for $j = 0, \dots, k-1$ do  ▷ Generate phases from commutators
11:    for $\ell = j+1, \dots, k-1$ do
12:      $t \leftarrow \pi_1([q_j, q_\ell])$
13:      $\mu \leftarrow \gcd(t, \mu)$
14:      if $\mu = 1$ then
15:        return $\omega I$
16:  ▷ Generate phases from the kernel of $\pi_2(S)$
17:  $K \leftarrow$ a matrix whose columns generate $\ker(\pi_2(S))$
18:  $s \leftarrow$ the number of columns of $K$
19:  for $j = 0, \dots, s-1$ do
20:    $q \leftarrow \prod_{\ell=0}^{k-1} q_\ell^{K_{\ell j}}$
21:    $t \leftarrow \pi_1(q)$
22:    $\mu \leftarrow \gcd(t, \mu)$
23:    if $\mu = 1$ then
24:      return $\omega I$
25:  end for
26: return $\omega^{\mu} I$

Lemma 6.4 and Lemma 6.5 lead to a simple algorithm (Algorithm 1) to find a generator for $I_S$ given a multiset $S \subseteq P_n$. The basic idea is to get a generating set for $I_S$ and then use Lemma 6.5(ii). In the algorithm, $\gcd(a, b)$ refers to the greatest common divisor evaluated over the integers, for integers $0 \leq a, b \leq d$, which takes $O(M(d)\log(\log(d)))$ operations, where $M(d)$ is the number of operations needed to multiply two integers no greater than $d$ (so $M(d) = O(\log^2 d)$ for elementary-school arithmetic, for instance) [53]. In order to facilitate early termination, instead of evaluating all the generators and then taking the gcd of the whole list using Lemma 6.5(ii), we use the property $\gcd(a, b, c) = \gcd(a, \gcd(b, c))$ recursively to update the gcd every time we obtain a new generator. If at any stage the gcd becomes 1, we can terminate the algorithm (Lines 8-9, Lines 14-15, Lines 23-24). In Lines 3-7, we use the fact from Lemma 6.1 that for odd $d$ any $p \in P_n$ satisfies $p^d = I$, while for even $d$ it satisfies $p^d = \pm I$. Lines 6-7 exploit this fact, and terminate the check of the $d$th powers of the remaining generators if a generator $q_j \in S$ is detected such that $q_j^d = -I$. In Lines 11-15, notice that we exploit the fact that $\pi_1([q_j, q_\ell]) + \pi_1([q_\ell, q_j]) \equiv 0 \pmod d$, or equivalently $[q_j, q_\ell] = [q_\ell, q_j]^{-1}$; thus only the commutators $[q_j, q_\ell]$ for $j < \ell$ need to be considered. In Line 17, the computation of the kernel matrix $K$ can be carried out using the techniques in [53, Chapter 5]. In terms of computational cost, the computation of $q_j^d$ and $[q_j, q_\ell]$ in Line 5 and Line 12 respectively involves $O(n)$ operations,
while the computation of $q$ in Line 20 involves $O(kn)$ operations.
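For small subgroups, the output of Algorithm 1 can be cross-checked by brute force: enumerate $\langle S \rangle$ by closure, collect the elements proportional to the identity, and confirm that their phases are exactly the multiples of a single gcd. A sketch (conventions assumed as above):

```python
from math import gcd

n, d = 1, 4

def lower_form(v, w):
    return sum(v[n + i] * w[i] for i in range(n))

def mul(p, q):
    (j, v), (k, w) = p, q
    return ((j + k + lower_form(v, w)) % d,
            tuple((a + b) % d for a, b in zip(v, w)))

def closure(gens):
    # breadth-first closure of the generated subgroup
    elems = {(0, (0,) * (2 * n))}
    frontier = list(elems)
    while frontier:
        new = []
        for x in frontier:
            for g in gens:
                y = mul(x, g)
                if y not in elems:
                    elems.add(y)
                    new.append(y)
        frontier = new
    return elems

S = [(0, (1, 0)), (0, (0, 2))]   # X and Z^2 on one qudit, d = 4
G = closure(S)
I_S = sorted(j for (j, v) in G if all(c == 0 for c in v))
mu = 0
for j in I_S:
    mu = gcd(mu, j)
print(mu, I_S)
# the phase subgroup is exactly the cyclic group of multiples of mu
assert I_S == sorted({(mu * t) % d for t in range(d)})
```

Here the only non-trivial phase generator comes from the commutator $[X, Z^2] = \omega^2 I$, so the brute-force gcd equals what the early-terminating gcd updates in Algorithm 1 would produce.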

Minimal and near-minimal generating sets
We are now in a position to answer the following question: given a multiset $S \subseteq P_n$, can we find a non-empty generating set of $\langle S \rangle$ of the smallest size? Such a set is called a minimal generating set of $\langle S \rangle$. (The "non-empty" condition is only relevant for the case $\langle S \rangle = \{I\}$.) This subsection is dedicated to providing a nearly complete solution to this problem. We begin by stating a result from commutative algebra that will be needed below, whose proof can be found in [66] (we thank Jeremy Rickard for the outline of the proof). Now, for the rest of this subsection, suppose $S := \{s_0, s_1, \dots, s_{k-1}\} \subseteq P_n$ is an ordered multiset. The case when $\pi_2(S)$ has zero invariant factors is special: in this case, we have $\langle S \rangle = I_S$, and a minimal generating set of $I_S$ can then be obtained using Lemma 6.5(ii). Thus, through the remainder of this section we assume that $\pi_2(S)$ has at least one invariant factor. The following lemma, which gives lower and upper bounds on the size of a minimal generating set of $\langle S \rangle$, is easy to establish. Proof. This follows from Lemma B.4 in Appendix B.4. The crux is similar to the canonical generating set in [42], but finds a canonical generating set for an arbitrary group rather than just a qudit stabilizer group, and does not change the Pauli basis using Clifford operators.
Based on this result, we make the following definition. Definition 6.1. Given an ordered multiset $S := \{s_0, s_1, \dots, s_{k-1}\} \subseteq P_n$, such that $r$ is the number of invariant factors of $\pi_2(S)$, we call a subset $T \subseteq \langle S \rangle$ a near-minimal generating set of $\langle S \rangle$ if $T$ is of the form $T := T' \cup \{p\}$ and satisfies (i) $\langle T \rangle = \langle S \rangle$, (ii) $\langle p \rangle = I_S$, and (iii) $|T'| = r$. Note that if $r = 0$, then $T'$ is empty; so this definition also works for that case.
Lemma B.4 gives us a way to compute a near-minimal generating set of $\langle S \rangle$. At this point, one may ask whether a near-minimal generating set of $\langle S \rangle$ is also a minimal generating set of $\langle S \rangle$. While we do not completely resolve this question here, we prove some partial results in the remainder of this subsection. Let us first show that there are indeed cases where a near-minimal generating set is a minimal generating set. An easy example is the case when the number of invariant factors of $\pi_2(S)$ is zero: in this case, a near-minimal generating set of $\langle S \rangle$ has size one, and hence it is a minimal generating set. We give another example below, in which a computation shows that the number of invariant factors of $\pi_2(S)$ is one. Thus a near-minimal generating set of $\langle S \rangle$ has size two. Suppose for contradiction that there exists a minimal generating set $T$ of $\langle S \rangle$ with $|T| = 1$. Since the maximum possible order of any element of $P_n$ is $2d$ (Lemma 6.1), this implies that $|\langle T \rangle| \leq 2d < d^2$. So $T$ cannot generate $\langle S \rangle$. Thus all near-minimal generating sets of $\langle S \rangle$ are also minimal generating sets of $\langle S \rangle$ in this example.
Let us now give an example where a near-minimal generating set is not a minimal generating set, i.e. there exist generating sets of $\langle S \rangle$ of exactly size $r \geq 1$, where $r$ is the number of invariant factors of $\pi_2(S)$. One such example is when $r = k$ in Lemma 6.8 (then clearly $S$ is itself minimal). Another one is given below in Example 4.
Example 4. Assume that $\langle S \rangle$ is a stabilizer group and $\langle S \rangle \neq \{I\}$. In this case, for any generating set $S$ of $\langle S \rangle$, the size of a minimal generating set of $\langle S \rangle$ is exactly the number of invariant factors $r$ of $\pi_2(S)$. This is because a near-minimal generating set $T' \cup \{p\}$ of $G$ must satisfy $p = I$, and thus $\langle T' \cup \{p\} \rangle = \langle T' \rangle$.
How far are we from computing a minimal generating set of $\langle S \rangle$, given that we have a near-minimal generating set of $\langle S \rangle$? The next result gives a structure theorem for finding minimal generating sets from near-minimal ones. Theorem 6.9. Given $S \subseteq P_n$, suppose that $T := T' \cup \{p\}$ is a near-minimal generating set of $\langle S \rangle$ with $\langle p \rangle = I_S$. Let the number of invariant factors of $\pi_2(S)$ be $r \geq 1$, and suppose $T' = \{q_0, q_1, \dots, q_{r-1}\}$. Then the following conditions are equivalent.
(i) $T$ is not a minimal generating set of $\langle S \rangle$. (ii) There exist $\gamma_0, \dots, \gamma_{r-1} \in \mathbb{Z}_d$ such that the set $T'' := \{p^{\gamma_i} q_i : i = 0, \dots, r-1\}$ satisfies $\langle T'' \rangle = \langle S \rangle$. (iii) There exist $\gamma_0, \dots, \gamma_{r-1} \in \mathbb{Z}_d$ such that $p \in \langle T'' \rangle$, where $T'' := \{p^{\gamma_i} q_i : i = 0, \dots, r-1\}$.
Proof. Denote by $M$ the submodule of $\mathbb{Z}_d^{2n}$ generated by the columns of $\pi_2(S)$. We first prove that (i) implies (ii). Suppose that $U := \{u_0, u_1, \dots, u_{r-1}\}$ is a minimal generating set of $\langle S \rangle$. By Lemma B.4(ii),(v), we know that the columns of both $\pi_2(T')$ and $\pi_2(U)$ generate $M$. Thus by Lemma 6.6 we can conclude that there exists an invertible matrix $C \in \mathbb{Z}_d^{r \times r}$ such that $\pi_2(T') = \pi_2(U)C$. Now define a new set $T'' := \{t_0, t_1, \dots, t_{r-1}\}$ such that $t_i = \prod_{j=0}^{r-1} u_j^{C_{ji}}$ for every $i$. Theorem 6.3 then implies that $\langle T'' \rangle = \langle U \rangle = \langle S \rangle$. It also follows from the equality $\pi_2(T') = \pi_2(U)C$ that $\pi_2(t_i) = \pi_2(q_i)$ for every $i$. Thus each $t_i$ is equivalent to $q_i$ up to some phase factor, and since $\langle p \rangle = I_S$, we immediately conclude that $t_i = p^{\gamma_i} q_i$ for some $\gamma_i \in \mathbb{Z}_d$, for every $i$. Now assume (ii) is true. As $\langle T'' \rangle = \langle S \rangle$, this implies that $p \in \langle T'' \rangle$, proving (iii).
Finally, assume that (iii) is true; we want to prove (i). Clearly the definition of $T''$ in (iii) implies that $\langle T'' \rangle \subseteq \langle S \rangle$, because $p, q_i \in T$, and hence $p^{\gamma_i} q_i \in \langle S \rangle$ for every $i$. To prove the reverse containment, note that $p \in \langle T'' \rangle$ implies that $q_i \in \langle T'' \rangle$ for every $i$ (since $p^{\gamma_i} q_i \in T''$ by definition). This shows that $T \subseteq \langle T'' \rangle$ and thus $\langle S \rangle \subseteq \langle T'' \rangle$. Thus $T''$ is a minimal generating set of $\langle S \rangle$ (since it has size $r$), and this proves (i).

Algorithm 2 Return a minimal generating set of $\langle S \rangle$ from a near-minimal one
1: procedure MinimalGenerator($T' := \{q_0, \dots, q_{r-1}\}$, $p$, $d$)
2:   for each $(\gamma_0, \gamma_1, \dots, \gamma_{r-1}) \in \mathbb{Z}_d^r$ do
3:     $T'' \leftarrow \{p^{\gamma_0} q_0, \dots, p^{\gamma_{r-1}} q_{r-1}\}$
4:     $\bar{p} \leftarrow$ IdentityGenerator($T''$, $d$)
5:     if $p \in \langle \bar{p} \rangle$ then return $T''$
6:   return $T' \cup \{p\}$

Condition (iii) of Theorem 6.9 can be used to obtain a simple (but inefficient) algorithm to test whether a near-minimal generating set of $\langle S \rangle$ is also a minimal generating set, and to then output a minimal generating set of $\langle S \rangle$. This is given in Algorithm 2. We assume that $\pi_2(S)$ has $r \geq 1$ invariant factors, so that a computed near-minimal generating set (using Lemma B.4) of $\langle S \rangle$ has size $r + 1$. Suppose here that $r < |S|$, so that we are in the setting of Lemma B.4(v). Let $T = T' \cup \{p\}$ be one such near-minimal generating set with $\langle p \rangle = I_S$, and let $T' = \{q_0, q_1, \dots, q_{r-1}\}$. The algorithm proceeds by looping over each $r$-tuple $(\gamma_0, \gamma_1, \dots, \gamma_{r-1}) \in \mathbb{Z}_d^r$ and constructing the set $T'' \subseteq \langle S \rangle$ of size $r$. If $T''$ is a minimal generating set of $\langle S \rangle$, then by Theorem 6.9 we must have $p \in I_{T''}$. To check this condition, in Lines 3-5 we use Algorithm 1 to compute $\bar{p} := \omega^{\beta} I$ with the property that $\beta \in \mathbb{Z}_d$ is the smallest possible integer satisfying $\langle \bar{p} \rangle = I_{T''}$. Then, if $p = \omega^{\delta} I$ for some $\delta \in \mathbb{Z}_d$, it follows that $p \in I_{T''}$ if and only if $\beta$ divides $\delta$. If this condition check succeeds for any $(\gamma_0, \gamma_1, \dots, \gamma_{r-1}) \in \mathbb{Z}_d^r$, then we have found a minimal generating set $T''$ of $\langle S \rangle$ of size $r$; otherwise we conclude that $T = T' \cup \{p\}$ is a minimal generating set of $\langle S \rangle$ of size $r + 1$.
Another thing to note about Algorithm 2 is that the computational work in Line 4 can be significantly reduced by precomputing and storing a few quantities. Note that due to Lemma 6.2(iv), the results of Lines 2-17 (of Algorithm 1) in the execution of IDENTITY GENERATOR do not change irrespective of $T''$ (or equivalently irrespective of the choice of the $r$-tuple $(\gamma_0, \gamma_1, \dots, \gamma_{r-1}) \in \mathbb{Z}_d^r$ in Line 2 of Algorithm 2). Thus one can compute and store $K$ and the $\mu$ value up to this point (Line 17), with the choice $\gamma_0 = \gamma_1 = \cdots = \gamma_{r-1} = 0$ (i.e. $T'' = T'$); let us call this value $\mu_0$. Subsequently, every time IDENTITY GENERATOR gets called in Line 4 of Algorithm 2, we can initialize Algorithm 1 at Line 19, using the precomputed $K$ and setting $\mu = \mu_0$. In fact, if $\mu_0 = 1$, then there is nothing to compute: we know that IDENTITY GENERATOR will return $\bar{p} = \omega I$, which also means that the membership check in Line 5 of Algorithm 2 will succeed. For the case $\mu_0 \neq 1$, there are also ways to speed up the execution of Lines 19-26 of Algorithm 1. The main bottleneck is Line 20, which has a complexity of $O(rn)$ operations. But this can be reduced to $O(r)$ using another precomputation step: note that we need to compute $q \leftarrow \prod_{\ell=0}^{r-1} (p^{\gamma_\ell} q_\ell)^{K_{\ell j}}$ in Line 20, which can be simplified to $p^{\sum_{\ell=0}^{r-1} \gamma_\ell K_{\ell j}} \prod_{\ell=0}^{r-1} q_\ell^{K_{\ell j}}$ (as $p$ is proportional to the identity), and thus the quantity $\prod_{\ell=0}^{r-1} q_\ell^{K_{\ell j}}$ can be precomputed for each $j = 0, 1, \dots, s-1$ (here $s$ is the number of columns of $K$). Putting everything together, and ignoring the precomputation step, we see that Algorithm 2 has a run-time complexity of $O(s d^r (r + M(d)\log\log(d)))$, where $M(d)$ is the cost of integer multiplication of non-negative integers less than $d$. The computational complexity of the precomputation step is upper bounded by the complexity of Algorithm 1. The exponential factor of $d^r$ in the complexity of Algorithm 2 is undesirable, and coming up with a more efficient algorithm is left for future work.
We finish this subsection by stating a special case of Theorem 6.9, for prime $d$, in which the theorem simplifies. Lemma 6.10. Given $S \subseteq P_n$, suppose that $T := T' \cup \{p\}$ is a near-minimal generating set of $\langle S \rangle$ with $\langle p \rangle = I_S$, and let $d$ be prime. Then the following conditions are equivalent.
(i) T is a minimal generating set of ⟨S⟩.
(ii) ⟨T ′ ⟩ is a stabilizer subgroup of P n , and I S ̸ = {I}.

A Gram-Schmidt generating set of a subgroup
The symplectic Gram-Schmidt procedure (described in [25] for qubits and easily extended to prime $d$) takes a generating set $S$ for a Pauli subgroup $G = \langle S\rangle$ and returns another generating set $S' = S_1 \cup S_2 \cup U$, where $S_1, S_2$ is a collection of non-commuting pairs (in the qubit case, anticommuting pairs), $U$ is a subset of the center $Z(G)$ of $G$,^b and $G = \langle S'\rangle$. Note, in particular, that for such a generating set, $S_1$, $S_2$, and $U$ must be disjoint. Here we describe a procedure achieving the same result, a Gram-Schmidt generating set, for qudit Pauli groups.
b Recall that, for a group G, its center Z(G) is the subgroup of elements in G that commute with everything in G.
with the ordering of the product irrelevant to our result. Because the commutation relations of $S$ are given by $A = Q\Lambda Q^T$, the commutation relations of $S'$ are given by $LAL^T = B$. Moreover, the invertibility of $L$ gives $\langle S\rangle = \langle S'\rangle$ via Theorem 6.3. Thus, this procedure gives the required Gram-Schmidt generating set.

Now we prove that there is no Gram-Schmidt generating set containing fewer non-commuting pairs. Suppose for contradiction that there exist sets $T, H \subseteq G$ such that $\langle T, H\rangle = G$, $|T| < 2r$, and $H \subseteq Z(G)$. Construct a multiset $E = T \cup J \cup H$, where $J$ is a multiset of $2r - |T|$ phaseless identity Paulis. Let $P, P' \in \mathbb{Z}_d^{(2r + |H| + |U|) \times 2n}$ be matrices such that the rows of $P$ (resp. $P'$) represent the Paulis in the multiset $E \cup U$ (resp. $S' \cup H$), modulo the phase factors. We first note that since $\langle E, U\rangle = \langle S', H\rangle = G$, the submodules generated by the columns of $P^T$ and $P'^T$ are equal. Thus, by Lemma 6.6 there exists an invertible matrix $V$ such that $P = VP'$. The number of non-zero entries in the ASNF of $B' = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix} = P'\Lambda P'^T$ is $2r$, and by the uniqueness part of the ASNF in Lemma 5.3, this should equal the number of non-zero entries in the ASNF of $C := P\Lambda P^T = VB'V^T$. However, by the assumptions on the sets $T$ and $H$, the only rows of $P$ with non-zero commutators are those representing $T$, so $C$ has the block form $\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}$ with $D \in \mathbb{Z}_d^{|T| \times |T|}$, which in turn implies that $C$ cannot have more than $|T| < 2r$ non-zero entries in its ASNF, obtaining our contradiction.
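For prime $d$, where $\mathbb{Z}_d$ is a field and every non-zero commutator is invertible, the pairing step reduces to the textbook symplectic Gram-Schmidt procedure mentioned above. The following is a minimal Python sketch of that prime-$d$ case; the function names and the representation (each Pauli given only by its symplectic vector in $\mathbb{Z}_p^{2n}$, phases ignored) are our own illustration, not the paper's algorithm.

```python
# Minimal sketch of symplectic Gram-Schmidt over Z_p, p prime.
# Paulis are represented by symplectic vectors of length 2n (phases ignored).

def sp(u, v, p):
    """Symplectic form <u,v> = sum_i (u_i v_{n+i} - u_{n+i} v_i) mod p."""
    n = len(u) // 2
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n)) % p

def symplectic_gram_schmidt(vectors, p):
    """Split generators into non-commuting pairs and central leftovers."""
    rest = [list(v) for v in vectors]
    pairs, central = [], []
    while rest:
        v = rest.pop(0)
        j = next((t for t, w in enumerate(rest) if sp(v, w, p) != 0), None)
        if j is None:
            central.append(v)           # commutes with everything remaining
            continue
        w = rest.pop(j)
        c = pow(sp(v, w, p), -1, p)     # invertible since p is prime
        # make every remaining generator commute with both v and w
        rest = [[(u[i] - sp(u, w, p) * c * v[i] + sp(u, v, p) * c * w[i]) % p
                 for i in range(len(u))] for u in rest]
        pairs.append((v, w))
    return pairs, central
```

On a single qutrit ($p = 3$) with generators $X = (1,0)$, $Z = (0,1)$, and $XZ = (1,1)$, this pairs $X$ with $Z$ and reduces $XZ$ to the zero vector, i.e., a central (phase) leftover.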
With Lemma 6.11, we can make a Gram-Schmidt generating set $S_1 \cup S_2 \cup U$ with minimally sized $S_1$ and $S_2$. However, there is no such minimality guarantee on $U$. If $d$ is prime, the only elements of $Z(G)$ that can be generated by $\langle S_1, S_2\rangle$ are exactly those in $K = \{\omega^j I : j \in \mathbb{Z}_d\}$. If $d$ is not prime, it is possible for non-trivial elements of $Z(G)$ to be generated by $\langle S_1, S_2\rangle$, including elements of $U$. The ways this can happen are limited, however.

Lemma 6.12. Suppose $S = \{s_0, \dots, s_{k-1}\}$, $T = \{t_0, \dots, t_{k-1}\}$ is a collection of non-commuting pairs, $f_i = \langle s_i, t_i\rangle$ for every $i$, and let $H = \langle S, T\rangle$. Then $Z(H) = \langle \omega^{f_i} I,\ s_i^{a_i},\ t_i^{a_i} : i = 0, \dots, k-1,\ a_i \in \mathbb{Z} \text{ s.t. } a_i f_i \equiv 0 \pmod{d}\rangle$.

Proof. Let $A = \langle \omega^{f_i} I,\ s_i^{a_i},\ t_i^{a_i} : i = 0, \dots, k-1,\ a_i \in \mathbb{Z} \text{ s.t. } a_i f_i \equiv 0 \pmod{d}\rangle$ and note that $J_H \subseteq A$. We see that $\omega^{f_i} I = [s_i, t_i]$, $s_i^{a_i}$, and $t_i^{a_i}$ are all in $Z(H)$, because they are products of generators of $H$ and they commute with all generators of $H$. Thus, $A \subseteq Z(H)$.
To show containment in the other direction, let $p \in Z(H)$. Then $p$ can be written in terms of the generating set of $H$ as $p = \omega^c I \prod_{i=0}^{k-1} s_i^{a_i} t_i^{b_i}$ for some $a_i, b_i \in \mathbb{Z}_d$ and $\omega^c I \in J_H \subseteq A$. Evaluate the commutator: $\langle p, t_j\rangle = \langle s_j^{a_j}, t_j\rangle = a_j \langle s_j, t_j\rangle = a_j f_j$. Since $p \in Z(H)$, the value of this commutator modulo $d$ must be $0$, implying $a_j f_j \equiv 0 \pmod d$. This argument is independent of $j$, and an analogous argument works for commutators with $s_j$, implying $b_j f_j \equiv 0 \pmod d$. Therefore, $p \in A$, completing the proof.
Let $U = \{u_0, \dots, u_{k-1}\}$. One can use the Howell normal form and the approach outlined in Section 2.2.4 to check membership of $\pi_2(u_0)$ in $Z(\langle S_1, S_2\rangle)$ and, if it is a member, check that the phase can be corrected by an element of $I_{\langle S_1, S_2\rangle}$. If that is also the case, then $u_0$ is redundant and can be removed from $U$. Otherwise, continue by checking whether $u_1$ is in $Z(\langle S_1, S_2, u_0\rangle)$, and so forth.

Sizes of subgroups of P n
In this section, we investigate the sizes of subgroups of $\mathcal{P}_n$ given their generating sets. First, let us assume nothing about the generating set. Using the Smith normal form, we can prove the following.

Lemma 6.13 (Pauli subgroup size). Given an ordered multiset $S := \{q_0, q_1, \dots, q_{k-1}\} \subseteq \mathcal{P}_n$, suppose that the invariant factors of $\pi_2(S)$ are $d_0, d_1, \dots, d_{r-1}$. Then $|\langle S\rangle| = |I_S| \prod_{i=0}^{r-1} |d_i \mathbb{Z}_d|$.

Proof. Firstly, note that since invariant factors are unique up to units in $\mathbb{Z}_d$, the sizes of the ideals $|d_i \mathbb{Z}_d|$ do not depend on this choice; so the expression for $|\langle S\rangle|$ is well-defined. Thus, let $\pi_2(S) = QDR$ be a Smith normal form decomposition, with $Q$ and $R$ invertible and $D$ diagonal carrying the invariant factors. It follows from the structure theorem of finitely generated modules over a principal ideal ring [47, Chapter 15] that the module generated by the columns of $\pi_2(S)$ has size $\prod_{i=0}^{r-1} |d_i \mathbb{Z}_d|$. But it is possible to prove the same in an elementary fashion. Since $Q$ is invertible, any non-empty subset of the columns of $Q$ generates a free submodule. Thus $Q_0, Q_1, \dots, Q_{r-1}$ form a basis, and it follows that $|\langle S\rangle| = |I_S| \prod_{i=0}^{r-1} |d_i \mathbb{Z}_d|$.

In the special case that $S$ is a qudit stabilizer group, i.e., an abelian subgroup of the $n$-qudit Pauli group, there is a codespace consisting of qudit states that are $+1$ eigenvectors of all the elements of $S$. This codespace has dimension $d^n / |\langle S\rangle|$ [42], which we can efficiently calculate using Lemma 6.13.
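As a concrete illustration of the size computation, the sketch below finds the invariant factors of an integer matrix representing $\pi_2(S)$ by a Smith-style reduction and multiplies the ideal sizes $|d_i \mathbb{Z}_d| = d/\gcd(d_i, d)$; multiplying further by $|I_S|$ would give $|\langle S\rangle|$. All function names are ours, and the reduction is a straightforward textbook implementation rather than the paper's algorithm.

```python
from math import gcd

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), g >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def invariant_factors(A):
    """Invariant factors of an integer matrix A via Smith-style reduction."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    t, out = 0, []
    while t < min(m, n):
        piv = next(((r, c) for r in range(t, m) for c in range(t, n) if A[r][c]), None)
        if piv is None:
            break
        r0, c0 = piv                          # move a non-zero pivot to (t, t)
        A[t], A[r0] = A[r0], A[t]
        for row in A:
            row[t], row[c0] = row[c0], row[t]
        while True:                           # clear column t and row t
            done = True
            for r in range(t + 1, m):
                if A[r][t]:
                    g, x, y = ext_gcd(A[t][t], A[r][t])
                    za, zb = A[t][t] // g, A[r][t] // g
                    for c in range(t, n):
                        at, ar = A[t][c], A[r][c]
                        A[t][c], A[r][c] = x * at + y * ar, -zb * at + za * ar
                    done = False
            for c in range(t + 1, n):
                if A[t][c]:
                    g, x, y = ext_gcd(A[t][t], A[t][c])
                    za, zb = A[t][t] // g, A[t][c] // g
                    for r in range(t, m):
                        at, ac = A[r][t], A[r][c]
                        A[r][t], A[r][c] = x * at + y * ac, -zb * at + za * ac
                    done = False
            if done:
                break
        out.append(abs(A[t][t]))
        t += 1
    for i in range(len(out)):                 # enforce d_i | d_{i+1}
        for j in range(i + 1, len(out)):
            g = gcd(out[i], out[j])
            out[i], out[j] = g, out[i] * out[j] // g
    return out

def size_mod_phases(pi2_cols, d):
    """|pi_2(<S>)| = prod_i |d_i Z_d|, with |d_i Z_d| = d / gcd(d_i, d)."""
    size = 1
    for di in invariant_factors(pi2_cols):
        size *= d // gcd(di, d)
    return size
```

For example, with $d = 4$ and $S = \{Z^2\}$ on one qudit, $\pi_2(S)$ is the single column $(0, 2)^T$, the invariant factor is $2$, and the size modulo phases is $2$, matching $\langle Z^2\rangle = \{I, Z^2\}$.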
Next, consider subgroups of $\mathcal{P}_n$ that are generated by non-commuting pairs of qudit Paulis. Such subgroups enjoy some surprising properties. For example, if $S = \{s_0, s_1, \dots, s_{k-1}\}$ and $T = \{t_0, t_1, \dots, t_{k-1}\}$ is a collection of non-commuting pairs, then $S \cup T$ is an independent set of generators, in the sense that removing any generator from $S \cup T$ leads to a proper subgroup of $\langle S, T\rangle$. To see this, note that if $s_0 \in \langle S \setminus \{s_0\}, T\rangle$, then, since every generator of that subgroup commutes with $t_0$, it would imply that $\langle s_0, t_0\rangle_d = 0$, which is a contradiction. We could argue the same for any other generator of $S \cup T$.
A particularly interesting result is Theorem 6.15, where it is shown that a maximum collection of non-commuting pairs generates the entire qudit Pauli group $\mathcal{P}_n$ when $d$ is square-free. The following lemma proves useful.

Lemma 6.14. Let $p, q \in \mathcal{P}_n$ and $G \subseteq \mathcal{P}_n$, where every element in $G$ commutes with both $p$ and $q$, and $\langle p, q\rangle_d = c$ is a unit.

Proof. The second equality differs from the first in that it includes arbitrary phases, which one can create from products of $p$ and $q$: $\{(pqp^\dagger q^\dagger)^j : j = 0, 1, \dots, d-1\} = \{\omega^{jc} I : j = 0, 1, \dots, d-1\}$. To prove the first equality, note $|\langle p\rangle / I_{\{p\}}| = |\langle q\rangle / I_{\{q\}}| = d$. If $|\langle p\rangle / I_{\{p\}}|$ were not $d$, then $p^a \in K$ for some $0 < a < d$, and thus $0 = \langle p^a, q\rangle_d \equiv ac \pmod d$, contradicting the fact that $c$ is a unit. Next, we see that for any $j \not\equiv 0 \pmod d$, $p^j$ is not in $\langle q, G\rangle$, even up to a phase. If it were, then $p^j$ would commute with $q$, but $\langle p^j, q\rangle_d = jc \bmod d$, which is not zero because $c$ is a unit and $j \not\equiv 0 \pmod d$. Likewise, $q^j$ is not in $\langle G\rangle$ up to a phase for any $j \not\equiv 0 \pmod d$.

Proof. A collection of $nm$ non-commuting pairs is a maximum size such collection, characterized by Theorem C.2. This implies that, for the appropriate ordering of the set $S \cup T$, the commutator matrix $C$ is block-diagonal with $2 \times 2$ blocks $\begin{pmatrix} 0 & f_{ij} \\ -f_{ij} & 0 \end{pmatrix}$, and $\gcd(f_{ij}, d) = \prod_{h \ne j} p_h$ for all $i, j$.
Put $C$ into alternating Smith normal form $C = LBL^T$ using Lemma 5.3. Due to the simple form of $C$, it is easy to evaluate $d_{2i}$, the greatest common divisor of all $2i \times 2i$ matrix minors of $C$. Therefore, for appropriate invertible $L$, we have $B = \bigoplus_{i=1}^{n} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. The Smith normal form implies the existence of a collection of non-commuting pairs $S', T'$, with $\langle s_i', t_i'\rangle_d = 1$, and $G \subseteq \mathcal{P}_n$ that commutes with every element of $S'$ and $T'$. Moreover $\langle S', T', G\rangle = \langle S, T\rangle$. More specifically, $S'$, $T'$, and $G$ can be found from $S$ and $T$ by taking appropriate products specified by $L$.
By Lemma 6.14, $|\langle S', T'\rangle| = d^{2n+1}$. This is the size of $\mathcal{P}_n$. Therefore, $G = \{I\}$ and $\langle S, T\rangle = \mathcal{P}_n$.

A collection of non-commuting pairs $S, T$ that generates the full Heisenberg-Weyl Pauli group $\mathcal{P}_n$ is a convenient generating set $B = S \cup T = \{s_0, t_0, \dots, s_{k-1}, t_{k-1}\}$ for the entire group. If we have $p \in \mathcal{P}_n$, it has some decomposition in terms of the generators in $B$, which can be rearranged to $p = \omega^c I \prod_{i=0}^{k-1} s_i^{a_i} t_i^{b_i}$, where $\omega^c I \in J_B$ and $a_i, b_i \in \mathbb{Z}_d$. We can write $\omega^c I$ in terms of a product of the generators by using a modified form of Algorithm 1 that records which product of generators gives $\omega I$. Also, using the fact that $S, T$ are non-commuting pairs, we have $\langle p, t_i\rangle = a_i \langle s_i, t_i\rangle$ for every $i = 0, 1, \dots, k-1$.
In particular, $a_i = \langle p, t_i\rangle / \langle s_i, t_i\rangle \bmod d$ must be an integer. Likewise, we have $b_i = \langle s_i, p\rangle / \langle s_i, t_i\rangle \bmod d$ for every $i$. This makes the decomposition of $p$ into generators from $B$ rather simple, just amounting to the calculation of $\langle s_i, p\rangle$ and $\langle p, t_i\rangle$, and the inclusion of an overall phase.
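Reading off these coefficients is straightforward in the symplectic representation. The sketch below is our own illustration (the names are not from the paper), and it assumes each $f_i = \langle s_i, t_i\rangle$ is a unit mod $d$, as in the full-group case above:

```python
# Coefficients of p in a non-commuting-pair basis, via symplectic products.
# Vectors are over Z_d, length 2n; phases are tracked separately.

def sp(u, v, d):
    """Symplectic form <u,v> = sum_i (u_i v_{n+i} - u_{n+i} v_i) mod d."""
    n = len(u) // 2
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n)) % d

def pair_coefficients(p, S, T, d):
    """a_i = <p,t_i>/<s_i,t_i>, b_i = <s_i,p>/<s_i,t_i>, all mod d."""
    a, b = [], []
    for s_i, t_i in zip(S, T):
        f_inv = pow(sp(s_i, t_i, d), -1, d)   # assumes f_i is a unit mod d
        a.append(sp(p, t_i, d) * f_inv % d)
        b.append(sp(s_i, p, d) * f_inv % d)
    return a, b
```

For $d = 5$, $s_0 = X = (1,0)$, $t_0 = Z = (0,1)$, and $p = X^3 Z^2 = (3,2)$, this returns $a_0 = 3$ and $b_0 = 2$, as expected.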
To conclude this section, we note that for more general, non-square-free $d$, we can get lower bounds on the size of the group generated by non-commuting pairs. We state this as a corollary of Lemma B.3, which we state and prove in Appendix B.4.

Corollary 6.16. Suppose $S = \{s_0, \dots, s_{k-1}\}$, $T = \{t_0, \dots, t_{k-1}\}$ is a collection of non-commuting pairs. For every $i$, let $f_i = \langle s_i, t_i\rangle_d$, and denote the order of

Discussion
We have applied the theory of modules over commutative rings to study properties of the qudit Pauli group. One problem we have left open is the maximum size of a non-commuting set on $n > 1$ qudits. Though we have provided some lower bounds on the size, we lack a nontrivial upper bound. A second problem left open is the efficient construction of minimal generating sets of qudit Pauli subgroups in all cases. The missing step in getting from a near-minimal generating set to a minimal one is determining whether appropriate phases can be added to each generator, as in Theorem 6.9.
Further work on qudit group theory might involve applying techniques similar to those developed here to selecting an element of the qudit Clifford group uniformly at random. If this line of work parallels similar developments in the qubit case [67, 68], interesting structure of the qudit Clifford group could be discovered.

A.1 Bilinear alternating forms
For a commutative ring $R$ with multiplicative identity, consider an $R$-module $M$. Let $g : M \times M \to R$ be a bilinear form. This means that for all $x, y, z \in M$ and $a \in R$, we have (i) $g(x+y, z) = g(x,z) + g(y,z)$, (ii) $g(x, y+z) = g(x,y) + g(x,z)$, and (iii) $g(a \cdot x, y) = g(x, a \cdot y) = a\,g(x,y)$. We say that $g$ is alternating if $g(x,x) = 0$ for all $x \in M$, which implies $g(x,y) + g(y,x) = 0$ for all $x, y \in M$. A key fact about alternating bilinear forms is the following theorem.

Theorem A.1 ([46, Exercise 17]). Let $M$ be a finitely generated free module of rank $k$ over a commutative ring $R$ with multiplicative identity, and let $g : M \times M \to R$ be a bilinear alternating form. If $R$ is a PIR, then there exists a basis $\{e_i : i = 0, 1, \dots, k-1\}$ of $M$ and a non-negative integer $r$ with $2r \le k$, such that (i) $g(e_{2i}, e_{2i+1}) = -g(e_{2i+1}, e_{2i}) = a_i \ne 0$ for $i = 0, 1, \dots, r-1$, where $a_0, a_1, \dots, a_{r-1} \in R$, while $g(e_i, e_j) = 0$ for all other pairs of indices $i, j$. Moreover, the ideals $\{b a_i : b \in R\}$ are uniquely determined for every $i = 0, 1, \dots, r-1$, irrespective of the choice of the basis.
Now suppose that $R$ is a principal ideal ring, and $A \in R^{k \times k}$ is an alternating matrix over $R$ (see the beginning of Section 5.2 for the definition of an alternating matrix). Then the module $R^k$ is a finitely generated free $R$-module of rank $k$, as mentioned previously. The matrix $A$ defines a bilinear alternating form $g : R^k \times R^k \to R$ by $g(x, y) = x^T A y$. We can then apply Theorem A.1 to this setting to extract a basis $\{e_i : i = 0, 1, \dots, k-1\}$ of $R^k$, a non-negative integer $r \le \lfloor k/2 \rfloor$, and $a_0, a_1, \dots, a_{r-1} \in R$ as in the theorem, such that $B \in R^{k \times k}$ defined by the equation $LBL^T = A$ is an alternating matrix, where the columns of $(L^{-1})^T \in R^{k \times k}$ are $e_0, e_1, \dots, e_{k-1}$. Furthermore, by applying [47, Corollary 5.16] we can conclude that $L$ is an invertible matrix over $R$, and then by Lemmas 2.1 and 2.2 we also know that $\Theta(M_B) = \Theta(M_A) = 2r$, where $M_A$ and $M_B$ are the submodules of $R^k$ generated by the columns of $A$ and $B$, respectively. This decomposition $A = LBL^T$ is known as the alternating Smith normal form (ASNF); see for example [24, Theorem 18], where it is derived assuming $R$ is a PID. In this paper, we are interested in the specific cases where $R$ is either $\mathbb{Z}$ or $\mathbb{Z}_d$. In these cases, one even has an explicit formula for the non-zero entries of $B$ in the ASNF, in terms of the minors of the matrix $A$. In the next subsection, Lemma A.2 summarizes all these facts when $R = \mathbb{Z}$, and provides a complete proof that has the added advantage of specifying an explicit algorithm to compute the ASNF. We also discuss how Lemma A.2 easily generalizes to the case $R = \mathbb{Z}_d$, leading to Lemma 5.3, which we already encountered before.

A.2 An algorithm for the ASNF
For the following lemma, for a, b ∈ Z we define the greatest common divisor of a and b to be the largest positive integer that divides both a and b over Z, and denote it gcd(a, b).We also define gcd(0, 0) = 0, and gcd(a) = a for any a ∈ Z.
Lemma A.2. Suppose $A \in \mathbb{Z}^{k \times k}$ is an alternating matrix. Then there are matrices $L, B \in \mathbb{Z}^{k \times k}$, where $B$ is alternating and has at most one non-zero entry per row and column, and $L$ is invertible, such that $A = LBL^T$. Moreover, we may further arrange $B$ so that it is non-zero only in the top-left $2r \times 2r$ block, which has the form $\bigoplus_{i=1}^{r} \begin{pmatrix} 0 & \beta_i \\ -\beta_i & 0 \end{pmatrix}$ for $r = \Theta(M_A)/2$, where $M_A$ is the $\mathbb{Z}$-submodule generated by the columns of $A$, and each $\beta_i \in \mathbb{Z}$ is non-zero, satisfying $\beta_i \mid \beta_{i+1}$ for all $i < r$. Moreover, for all $i = 1, 2, \dots, r$, we have $\beta_i d_{2i-1} = d_{2i}$, and for all $i > 2r$ we have $d_i = 0$, where $d_j$ is the greatest common divisor of all $j \times j$ minors of $A$ (with $d_0 := 1$), which implies that the $\beta_i$ are unique up to choice of $\pm$ sign.

Remark. In the context of the lemma above, we have a chain of divisibilities $d_0 \mid d_1 \mid d_2 \mid \cdots \mid d_k$. This is because for any $j \ge 0$, any $(j+1) \times (j+1)$ minor of $A$ is divisible by $d_j$, and thus $d_j$ must also divide $d_{j+1}$. This means that if there is any $j' \le k$ such that $d_{j'} = 0$, then it automatically implies that $d_j = 0$ for all $j > j'$. Suppose such a $j'$ exists, and moreover assume that $d_j \ne 0$ for all $j < j'$ (for example, if $k$ is odd, then $d_k = \det(A) = 0$ by the properties of alternating matrices). In other words, suppose $j'$ is the smallest integer such that all $j' \times j'$ minors of $A$ are zero. Then we can see from the above lemma that $j' = 2r + 1$, and hence must be odd. Thus we also conclude that $r$ is determined by $j'$ via $2r = j' - 1$.

This lemma appears in more or less general forms elsewhere. As alluded to in the previous section, it is a consequence of Theorem A.1 from [46]. Part of this lemma is also Theorem 18 in [24], applied to the ring of integers. However, we go further and include the formulas for the $\beta_i$. Finally, the analogous lemma for matrices over $\mathbb{F}_2$ appears as Lemma B.1 in [22]. After the proof, we discuss two variations on the theme: Lemma 5.3, which replaces the ring $\mathbb{Z}$ with $\mathbb{Z}_d$, and a generalization to matrices over any principal ideal ring (of which $\mathbb{Z}_d$ is just one).
Proof of Lemma A.2. The construction of $B$ and $L$ is iterative. We let $B_0 = A$. We call an entry of a matrix "pivotal" if it is the only non-zero entry in both its row and its column; rows and columns containing such entries are also called pivotal. Suppose $B_{i-1}$ is an antisymmetric matrix with zero diagonal, each of whose first $i-1$ rows and columns is either pivotal or all zero. Because it is antisymmetric with zero diagonal, the total number of pivotal rows (counting also those outside the first $i-1$ rows, if any) is even and equals the number of pivotal columns. See Fig. 1 for a depiction of the structure of $B_{i-1}$.
Starting with $B_{i-1}$, we will construct a matrix $B_i = L_i B_{i-1} L_i^T$ with the same properties for one additional row and column, where $L_i$ is invertible. The sets of pivotal rows and columns of $B_i$ will contain the sets of pivotal rows and columns of $B_{i-1}$. For notational simplicity we denote $B' := B_{i-1}$. If row $i$ of $B'$ is already pivotal or all zero, then we can just set $B_i = B'$ and $L_i$ to the identity matrix. Otherwise, there exists a smallest $j$ such that $B'_{ji} = -B'_{ij} \ne 0$. Note that $j > i$; otherwise, row $i$ would already be pivotal.
We now alternate two subroutines until a termination condition is met. The first subroutine zeroes out the entries of column $i$ other than $B'_{ji}$, and the second zeroes out the entries of column $j$ other than $B'_{ij}$. Updates are made to $B'$ throughout each subroutine, but it always remains antisymmetric with zero diagonal. Thus, the subroutines also simultaneously zero out entries in row $i$ and row $j$, respectively.
The fundamental task in both subroutines is the same: we have two non-zero entries $a$ and $b$ in the same column, say in rows $r_a$ and $r_b$, and wish to take linear combinations of their rows so that $a$ is replaced by $\gcd(a,b) > 0$ and $b$ is replaced by $0$. By Bezout's identity, there exist integers $x, y$ such that $ax + by = \gcd(a,b)$ (and these can be found by the extended version of Euclid's GCD algorithm). Let $z = a/\gcd(a,b)$ and $w = b/\gcd(a,b)$. Then we have Eq. (31): $\begin{pmatrix} x & y \\ -w & z \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \gcd(a,b) \\ 0 \end{pmatrix}$, where the $2 \times 2$ matrix has determinant $xz + yw = 1$. Let $K \in \mathbb{Z}^{k \times k}$ be the identity matrix except for the entries $K_{r_a r_a} = x$, $K_{r_a r_b} = y$, $K_{r_b r_a} = -w$, and $K_{r_b r_b} = z$. The first subroutine sequentially zeros out all entries $B'_{j'i} \ne 0$ for $j' \ne j$, leaving a non-zero entry at just position $(j,i)$ in column $i$. The second subroutine sequentially zeros out all entries $B'_{i'j}$ for $i' \ne i$, leaving a non-zero entry at just position $(i,j)$ in column $j$. Note that the second subroutine may cause entries $B'_{j'i}$ to become non-zero again, undoing some of the work of the first subroutine, and likewise the first subroutine may cause entries $B'_{i'j}$ to become non-zero. For this reason, we must alternate applications of the subroutines until rows $i$ and $j$ are both pivotal. This termination is guaranteed: the absolute values of the entries at $(i,j)$ and $(j,i)$ are non-increasing, because $\gcd(a,b)$ divides $a$. Moreover, if $\gcd(a,b) = |a|$ (and this holds for all entries $b$ we are trying to zero out in the current pass), then we may choose $y = 0$ in Bezout's identity, and the row containing $a$ remains unchanged (in absolute value). That is, the work of the previously applied subroutine is not undone, the rows $i$ and $j$ have been made pivotal, and we have constructed $B_i$ and $L_i$ (the latter a product of the appropriate sequence of matrices $K$ above).
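The $2 \times 2$ core of the matrix $K$ can be checked directly. The helper below is our own illustration: it builds the transformation of Eq. (31) from the extended Euclidean algorithm and verifies that it is unimodular.

```python
# The 2x2 Bezout transformation of Eq. (31): sends (a, b) to (gcd(a, b), 0)
# and has determinant 1, hence is invertible over the integers.

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), g >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def bezout_rows(a, b):
    """2x2 unimodular K with K @ (a, b)^T = (gcd(a, b), 0)^T."""
    g, x, y = ext_gcd(a, b)
    z, w = a // g, b // g
    K = [[x, y], [-w, z]]
    assert x * z + y * w == 1          # det K = (ax + by)/g = 1
    return K

K = bezout_rows(12, 18)
new_a = K[0][0] * 12 + K[0][1] * 18    # becomes gcd(12, 18) = 6
new_b = K[1][0] * 12 + K[1][1] * 18    # becomes 0
```

Note that when $a \mid b$, the extended Euclidean algorithm as written returns $y = 0$, so the row carrying $a$ is left unchanged up to sign, matching the termination argument in the proof.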
After $k$ iterations, we have obtained invertible matrices $L_1, \dots, L_k$ such that $B_k = (L_k \cdots L_1) A (L_k \cdots L_1)^T$, where $B_k$ is antisymmetric with zero diagonal and has at most one non-zero entry per row and column. There is a permutation matrix $P$ such that $B_{k+1} = P B_k P^T = \bigoplus_{i=1}^{r} \begin{pmatrix} 0 & \gamma_i \\ -\gamma_i & 0 \end{pmatrix}$ (padded by zeros) for some integer $r$ and integers $\gamma_i$. We are still lacking the divisibility condition on the entries of $B$, which we rectify next.

Consider the matrix of integers $\begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}$, and suppose we have Bezout's identity $ax + by = \gcd(a,b)$. We claim (Eq. (35)) that there exists an invertible matrix $H$ such that conjugation by $H$ transforms this matrix into $\begin{pmatrix} 0 & \gcd(a,b) & 0 & 0 \\ -\gcd(a,b) & 0 & 0 & 0 \\ 0 & 0 & 0 & ab/\gcd(a,b) \\ 0 & 0 & -ab/\gcd(a,b) & 0 \end{pmatrix}$. This is accomplished by the following sequence of row and column operations. The first step adds row 3 to row 1 (and likewise on columns). The second uses Eq. (31) to zero out the entries at $(1,4)$ and $(4,1)$. The third step zeros the entries at $(2,3)$ and $(3,2)$ by subtracting $by/\gcd(a,b)$ times row 1 from row 3 (and likewise on columns).

Apply this procedure iteratively to pairs of $2 \times 2$ blocks of $B_{k+1}$. In iteration $i$, apply it sequentially $r - i$ times to block $i$ paired with each of the blocks $j = i+1, \dots, r$. After iteration $i$, the non-zero entry of the $i$th block divides all entries in blocks $i+1, \dots, r$. After iteration $r-1$, we have an invertible $H$ so that $B := H B_{k+1} H^T$ satisfies the divisibility condition, and we set $L := (L_k \cdots L_1)^{-1} P^{-1} H^{-1}$, so that $A = LBL^T$. Note that by the combination of Lemmas 2.1 and 2.2, we deduce that $r = \Theta(M_A)/2$.

To show the formulas relating the $\beta_i$ to the minor gcds $d_j$, we note that invertible row and column operations do not change the greatest common divisor of matrix minors (see Corollary 4.8 in [47] or the proof of Proposition 8.1 in [69]). Thus, since the formulas for the $\beta_i$ are easily verified using the matrix minors of $B = \bigoplus_{i=1}^{r} \begin{pmatrix} 0 & \beta_i \\ -\beta_i & 0 \end{pmatrix}$, they hold also for $A$. We summarize the construction in this proof in Algorithm 3.
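The construction in this proof can also be rendered as a short program. The sketch below is our own Python rendering, not the paper's Algorithm 3: it performs the alternating congruence reduction with extended-GCD row-and-column combinations, gathering each non-commuting pair into an adjacent $2 \times 2$ block. The final permutation to the top-left and the divisibility pass of Eq. (35) are omitted for brevity, so the block entries are the $\gamma_i$ of the proof rather than the final $\beta_i$.

```python
# Congruence reduction of an alternating integer matrix A:
# returns (M, B) with B = M A M^T, M unimodular, and B alternating with
# at most one non-zero entry per row and column.

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), g >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def _combine(B, M, p, t, a, b):
    """Unimodular row op (and matching column op on B) replacing the entry a
    in row p by gcd(a, b) and the entry b in row t by 0."""
    g, x, y = ext_gcd(a, b)
    za, zb = a // g, b // g
    k = len(B)
    for c in range(k):                     # rows of B and of M
        bp, bt = B[p][c], B[t][c]
        B[p][c], B[t][c] = x * bp + y * bt, -zb * bp + za * bt
        mp, mt = M[p][c], M[t][c]
        M[p][c], M[t][c] = x * mp + y * mt, -zb * mp + za * mt
    for r in range(k):                     # matching columns of B only
        bp, bt = B[r][p], B[r][t]
        B[r][p], B[r][t] = x * bp + y * bt, -zb * bp + za * bt

def _swap(B, M, p, t):
    """Congruence by a permutation swapping indices p and t."""
    B[p], B[t] = B[t], B[p]
    M[p], M[t] = M[t], M[p]
    for row in B:
        row[p], row[t] = row[t], row[p]

def asnf_reduce(A):
    k = len(A)
    B = [row[:] for row in A]
    M = [[int(r == c) for c in range(k)] for r in range(k)]
    i = 0
    while i < k - 1:
        j = next((t for t in range(i + 1, k) if B[t][i] != 0), None)
        if j is None:
            i += 1                         # row/column i is all zero
            continue
        changed = True
        while changed:                     # alternate the two subroutines
            changed = False
            for pc, pr in ((i, j), (j, i)):
                for t in range(k):
                    if t not in (i, j) and B[t][pc] != 0:
                        _combine(B, M, pr, t, B[pr][pc], B[t][pc])
                        changed = True
        if j != i + 1:
            _swap(B, M, j, i + 1)          # gather the pair into a 2x2 block
        i += 2
    return M, B
```

Because every operation is a congruence $B \mapsto EBE^T$ with $\det E = \pm 1$, the output automatically stays antisymmetric with zero diagonal and satisfies $B = MAM^T$.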
To start the commentary on Lemma A.2, let us apply it more specifically to the ring $\mathbb{Z}_d$ that is relevant for $d$-dimensional qudits. Suppose we have an alternating matrix $\bar{A} \in \mathbb{Z}_d^{k \times k}$. Treating $\bar{A}$ as an element of $\mathbb{Z}^{k \times k}$, we use Lemma A.2 to find matrices $\bar{L}, \bar{B} \in \mathbb{Z}^{k \times k}$ so that $\bar{A} = \bar{L}\bar{B}\bar{L}^T$. We note that invertibility of $\bar{L}$ implies that $\det(\bar{L}) = \pm 1$ over the integers, since $\pm 1$ are the only units in $\mathbb{Z}$. We may reduce this expression modulo $d$ to obtain the equality $A = LBL^T$, where $A$, $B$, and $L$ are obtained by reducing $\bar{A}$, $\bar{B}$, and $\bar{L}$ modulo $d$, respectively. Note that $\bar{A} = A$, and $L$ is an invertible matrix in $\mathbb{Z}_d^{k \times k}$, since $\det(L) = \pm 1$ is a unit in $\mathbb{Z}_d$. (Algorithm 3, which implements the construction using ExtendedGCD row and column operations, a permutation $P$ gathering the non-zero $2 \times 2$ blocks, and the transformation of Eq. (35) enforcing divisibility, is not reproduced here.)

A greatest common divisor $\gcd(a,b)$ for $a, b \in \mathbb{Z}_d$ is any element of $\mathbb{Z}_d$ that divides $a$ and $b$ and has the property that if $c \in \mathbb{Z}_d$ divides both $a$ and $b$, then $c$ divides $\gcd(a,b)$. The group of units $U_d$ is the multiplicative group consisting of the elements of $\mathbb{Z}_d$ that have multiplicative inverses in $\mathbb{Z}_d$; these are the integers less than $d$ that are relatively prime to $d$. Multiplication by units does not affect divisibility, i.e., if $a \mid b$, then $ua \mid b$ and $a \mid ub$ for all $u \in U_d$. Therefore, the value of $\gcd(a,b)$ is not unique, and if $g$ and $h$ are both greatest common divisors of $a$ and $b$, then there is a unit $u \in U_d$ such that $g = uh$, a fact that holds over commutative principal ideal rings [70, 71]. These observations lead to the ASNF over the ring $\mathbb{Z}_d$, as stated in Lemma 5.3 in the main text.
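The non-uniqueness of gcds in $\mathbb{Z}_d$ can be checked by brute force. The snippet below is our own illustration: for $d = 12$ it lists all elements qualifying as $\gcd(8, 4)$ and exhibits the unit relating them.

```python
# Brute-force gcds in Z_d, illustrating uniqueness only up to units.

def divides(a, b, d):
    """a | b in Z_d: there is some x with a*x = b (mod d)."""
    return any(a * x % d == b % d for x in range(d))

def units(d):
    from math import gcd
    return [u for u in range(1, d) if gcd(u, d) == 1]

def gcds(a, b, d):
    """All g in Z_d dividing a and b such that every common divisor divides g."""
    common = [g for g in range(d) if divides(g, a, d) and divides(g, b, d)]
    return [g for g in common if all(divides(c, g, d) for c in common)]
```

In $\mathbb{Z}_{12}$, both $4$ and $8$ qualify as $\gcd(8, 4)$, and indeed $8 = 5 \cdot 4 \bmod 12$ with $5 \in U_{12}$ a unit.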
Finally, we note that Lemma A.2 can be generalized to matrices over commutative principal ideal rings. A commutative principal ideal ring $R$ is defined by two properties:

1. Noetherian property (e.g., Chapter 10 of [46]): all ideals are finitely generated, or, equivalently, any ascending chain of ideals $I_0 \subseteq I_1 \subseteq I_2 \subseteq \dots$ terminates, with $I_n = I_m$ for all $m \ge n$.
2. Hermite property [70]: for any two elements $a, b \in R$, there exist $c \in R$ and an invertible matrix $T \in R^{2 \times 2}$ such that $(a\ b)T = (c\ 0)$.
These properties generalize the key steps in the proof of Lemma A.2. The Hermite property replaces Eq. (31), which is used to zero out matrix entries. Note that it also implies $(a\ b) = (c\ 0)T^{-1}$, and therefore $c \mid a$ and $c \mid b$. The Noetherian property is used to argue that repeated application of the zeroing subroutines eventually terminates with a new pivotal row and column: because $c \mid a$, the ideal generated by $c$ contains the ideal generated by $a$, and since the Noetherian property forces the resulting ascending chain of ideals to terminate, the algorithm must also terminate. These modifications to the proof of Lemma A.2 lead to a more general theorem.
Theorem A.3. Let $R$ be a commutative principal ideal ring, and suppose $A \in R^{k \times k}$ is an alternating matrix. Then there are matrices $L, B \in R^{k \times k}$, where $B$ is alternating and has at most one non-zero entry per row and column, and $L$ is invertible, such that $A = LBL^T$. We may further arrange $B$ so that it is non-zero only in the top-left $2r \times 2r$ block, which has the form $\bigoplus_{i=1}^{r} \begin{pmatrix} 0 & \beta_i \\ -\beta_i & 0 \end{pmatrix}$ for the integer $r = \Theta(M_A)/2$, where $M_A$ is the $R$-submodule generated by the columns of $A$, and each $\beta_i \in R$ is non-zero, satisfying $\beta_i \mid \beta_{i+1}$ for all $i < r$. Also, $\beta_i$ is uniquely determined up to multiplication by a unit, and satisfies the formula $\beta_i d_{2i-1} = d_{2i}$ (or, alternatively, the formula $\beta_i^2 = d_{2i}/d_{2i-2}$), where $d_j$ is a greatest common divisor in $R$ of all $j \times j$ minors of $A$ (and $d_0 := 1$).

B Technical lemmas and proofs

B.1 Proofs for Section 2
In this section, we give proofs for some of the preliminary lemmas involving modules over commutative rings.

Proof of Lemma 2.1. Let $D \in R^{k \times t'}$, with $t' \le t$, be a matrix whose columns form a minimal generating set of $M_{\hat{C}}$. Then each $\hat{c}_i$ can be expressed as a linear combination of the columns of $D$; so we can write $\hat{C} = DE$ for some $E \in R^{t' \times t}$. Letting $\bar{A} := AD \in R^{k \times t'}$, we thus have $C = \bar{A}E$. This shows that $\Theta(M_C) \le t' = \Theta(M_{\hat{C}})$. Again by the invertibility of $A$, we also have $\hat{C} = A^{-1}C$, and repeating the argument gives $\Theta(M_{\hat{C}}) \le \Theta(M_C)$.

Lemma 2.2. Let $C \in R^{k \times t}$ be a matrix such that each row and column has at most one non-zero element, and suppose one of those non-zero elements is divisible by all the others. Then the minimal number of generators of the $R$-module $M_C$, generated by the columns of $C$, is equal to the number of non-zero elements of the matrix $C$.
Proof. The lemma is clearly true if $C = 0$; so assume that $C$ has at least one non-zero element. Without loss of generality, we may assume that $C$ is a diagonal matrix such that $C_{00}, C_{11}, \dots, C_{r-1,r-1}$ all divide $C_{rr} \ne 0$ (which implies $C_{ii} \ne 0$ for all $i = 0, 1, \dots, r$), where $r+1$ is the number of non-zero elements of $C$. If $C$ is not of this form, then one can permute the rows and columns of $C$ to bring it to this form, and Lemma 2.1 ensures that the minimal number of generators remains the same. Notice that if $x \in M_C$, then for $i = 0, 1, \dots, r$ we have $x_i = a_i C_{ii}$ for some $a_i \in R$, and $x_i = 0$ for all $i > r$.
Define the ideals $N_i := \{x \in R : xC_{ii} = 0\}$ for each $i = 0, 1, \dots, r$, and note that by the divisibility condition we have $N_0, N_1, \dots, N_{r-1} \subseteq N_r \ne R$. Let $N$ be a maximal ideal of $R$ containing $N_r$. Then $R/N$ is a field [46, Chapter 2]; let $f : R \to R/N$ be the corresponding quotient map (a ring homomorphism) taking an element $y \in R$ to its coset in $R/N$. Treating $(R/N)^{r+1}$ as an $R$-module, we now define an $R$-module homomorphism $\kappa : M_C \to (R/N)^{r+1}$ as follows: if $x \in M_C$, then $(\kappa(x))_i = f(a_i)$ for every $i = 0, 1, \dots, r$, where $x_i = a_i C_{ii}$ (one can check that $\kappa$ is well-defined, i.e., it does not depend on the choice of the $a_i$ as $N_i \subseteq N$, and that it is a module homomorphism). The map $\kappa$ is also clearly surjective. Thus if $M_C$ had a generating set $S$ of size $|S| < r+1$, then surjectivity of $\kappa$ would imply that $\{\kappa(x) : x \in S\}$ is a generating set for $(R/N)^{r+1}$ as an $R$-module, and hence also as an $R/N$-module. But $(R/N)^{r+1}$ is a vector space of dimension $r+1$ over the field $R/N$, and thus cannot have fewer than $r+1$ generators.

B.2 Proofs for Section 4
Recall that in Section 4, we studied a graph $G$ with a vertex for each single-qudit Pauli in $\mathcal{P}_1$ and edges between pairs that do not commute. A subset of vertices $W$ contains those corresponding to Paulis $X^a Z^b$ with $\gcd(a, b, d) = 1$.
Lemma 4.2. The graph $G = (V, E)$ has the following properties.

Proof. To prove (i), we show that if $C$ is a clique containing the vertex $v = (a, b)$, then replacing $v$ by $v' = (a/g, b/g)$, where $g = \gcd(a, b, d)$, also yields a clique. Suppose for contradiction that there is some $(s, t) \in C$ that is connected by an edge to $v$ but not to $v'$. Then we have $\det\begin{pmatrix} a/g & b/g \\ s & t \end{pmatrix} \equiv 0 \pmod d$, which implies $\det\begin{pmatrix} a & b \\ s & t \end{pmatrix} \equiv 0 \pmod d$, contradicting that $(s, t)$ is connected to $v$.
To prove (ii), let $v_i := (a_i, b_i)$ for $i = 0, 1, 2$. Note that by Bezout's identity, there are integers $x, y, z$ such that

Proof. By Bezout's identity, there are integers $x_0, x_1$ such that $x_0 d_0 + x_1 d_1 = 1$. The Chinese remainder theorem implies there is an isomorphism between $\mathbb{Z}_{d_0 d_1}$ and $\mathbb{Z}_{d_0} \times \mathbb{Z}_{d_1}$. Explicitly, it says that $a \equiv a_0 \pmod{d_0}$ and $a \equiv a_1 \pmod{d_1}$ if and only if $a \equiv a_0 x_1 d_1 + a_1 x_0 d_0 \pmod{d_0 d_1}$. This means that $\gcd(a, d_0 d_1) = \gcd(a_0 x_1 d_1 + a_1 x_0 d_0, d_0 d_1)$. We also have $\gcd(x, d_0 d_1) = \gcd(x, d_0)\gcd(x, d_1)$ for any integer $x$, as $d_0, d_1$ are relatively prime.
Putting these facts together, we complete the proof.

Proof. If $J$ is empty, then $d'$ is divisible by $d$, so the statement is trivial. Thus assume $J \ne \emptyset$, so $d'$ is not divisible by $d$. It is clear that $\{x \bmod d : x \in d'd''\mathbb{Z}\} \subseteq \{x \bmod d : x \in d'\mathbb{Z}\}$. Let $\ell$ be the smallest positive integer such that $\ell d' \equiv 0 \pmod d$. Then $\ell$ must be of the form $\ell = \prod_{j \in J} p_j^{\beta_j}$, where $1 \le \beta_j \le \alpha_j$ for each $j$. Note that $\ell$ is an upper bound on the size of the set $\{x \bmod d : x \in d'\mathbb{Z}\}$. Now consider the multiset of elements $K := \{t d' d'' \bmod d : t = 0, \dots, \ell - 1\}$. We will show that all elements of $K$ are distinct, which will prove the lemma, as it implies that $\ell$ is a lower bound on the size of the set $\{x \bmod d : x \in d'd''\mathbb{Z}\}$. Suppose this is not the case, and there exist distinct $t, t' \in \{0, \dots, \ell-1\}$, with $t > t'$, such that $t d' d'' \equiv t' d' d'' \pmod d$, or equivalently $(t - t')d'd'' \equiv 0 \pmod d$. By the assumptions on $J$ and $d''$, we then conclude that $(t - t')d' \equiv 0 \pmod d$. Since $0 < t - t' \le \ell - 1$, this implies that $\ell$ is not the smallest positive integer satisfying $\ell d' \equiv 0 \pmod d$, which is a contradiction.
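Two number-theoretic facts used above, the explicit CRT reconstruction and the multiplicativity of gcd over coprime moduli, can be checked numerically; the moduli below are our own example values.

```python
# Numeric check, for coprime d0, d1, of the CRT reconstruction
# a = a0*x1*d1 + a1*x0*d0 (mod d0*d1) and of
# gcd(x, d0*d1) = gcd(x, d0) * gcd(x, d1).

from math import gcd

d0, d1 = 4, 9                       # coprime moduli (example values)
x0, x1 = 7, -3                      # Bezout: 7*4 - 3*9 = 1
assert x0 * d0 + x1 * d1 == 1

for a0 in range(d0):
    for a1 in range(d1):
        a = (a0 * x1 * d1 + a1 * x0 * d0) % (d0 * d1)
        assert a % d0 == a0 and a % d1 == a1   # CRT reconstruction

for x in range(d0 * d1):
    assert gcd(x, d0 * d1) == gcd(x, d0) * gcd(x, d1)
```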

B.4 Proofs for Section 6
We start by giving a full proof of the following theorem on transforming generating sets of Pauli groups.
Theorem 6.3 (Equivalent generating sets). Suppose $S := \{q_0, q_1, \dots, q_{k-1}\}$ is an ordered multiset of elements of $\mathcal{P}_n$. Let $A \in \mathbb{Z}_d^{k \times k}$ be an invertible matrix, and consider the ordered multiset $T := \{q'_0, q'_1, \dots, q'_{k-1}\}$, where $q'_i := \prod_{j=0}^{k-1} q_j^{A_{ij}}$. Then $\langle T\rangle = \langle S\rangle$.

Proof. It is clear that $\langle T\rangle \subseteq \langle S\rangle$, so we only need to show that $\langle S\rangle \subseteq \langle T\rangle$. Since $A$ is invertible, there exists a matrix $B \in \mathbb{Z}_d^{k \times k}$ such that $BA = I$. Now define the multiset $U := \{r_0, r_1, \dots, r_{k-1}\} \subseteq \mathcal{P}_n$, where $r_\ell := \prod_{j=0}^{k-1} (q'_j)^{B_{\ell j}}$ for each $\ell = 0, 1, \dots, k-1$. Then clearly we have $\langle U\rangle \subseteq \langle T\rangle$, and we will show that $\langle S\rangle \subseteq \langle U\rangle$. Corresponding to the sets $S$ and $U$, we also define the sets $J_S$ and $J_U$, according to the notation above.

Now fix any $\ell$, and note that $r_\ell = \prod_{j=0}^{k-1}(q'_j)^{B_{\ell j}} = \prod_{j=0}^{k-1}\prod_{m=0}^{k-1} q_m^{A_{jm}B_{\ell j}}$, and this product can be rearranged using group commutators as $r_\ell = \lambda_\ell \prod_{m=0}^{k-1} q_m^{\sum_j B_{\ell j}A_{jm}}$. Notice that the fact $BA = I$ over the ring $\mathbb{Z}_d$ implies that $\sum_j B_{\ell j}A_{jm} \equiv \delta_{\ell m} \pmod d$. Thus, using Eq. (42) and the definition of $J_S$, we can conclude that $r_\ell = \lambda_\ell q_\ell$ for some $\lambda_\ell \in \langle J_S\rangle$, for every $\ell$. We can now use Lemma 6.2(i),(iv) to deduce that $\langle J_S\rangle = \langle J_U\rangle \subseteq \langle U\rangle$. It follows that for every $\ell$ we have $\lambda_\ell^{-1} \in \langle U\rangle$, and this implies $q_\ell \in \langle U\rangle$. We thus conclude that $\langle S\rangle \subseteq \langle U\rangle$, completing the proof.
For part (ii), we only need to prove that $I_S \subseteq \langle N_S, J_S\rangle$, as the reverse containment is obvious by part (i) and the definition of $J_S$. Take any $q \in I_S$. Then one can write $q = \prod_{j=0}^{\ell} h_j$, where each $h_j \in S$. Using group commutators we can rearrange this product to obtain $q = \lambda \prod_{j=0}^{k-1} q_j^{r_j}$, for some $\lambda \in \langle J_S\rangle$ and non-negative integers $r_j$. Using Lemma 6.1, we can further simplify this expression by reducing the powers $r_j$ modulo $d$, to obtain $q = \lambda\lambda' \prod_{j=0}^{k-1} q_j^{s_j}$, for some $\lambda' \in \langle J_S\rangle$ and $s_j \in \mathbb{Z}_d$. Define $s := (s_0, s_1, \dots, s_{k-1}) \in \mathbb{Z}_d^k$. Now we know that $\pi_2(q) = 0$. This implies that $\sum_{j=0}^{k-1} s_j \pi_2(q_j) = 0$, evaluated over $\mathbb{Z}_d$, or equivalently $s \in \ker(\pi_2(S))$. We can then conclude that $\prod_{j=0}^{k-1} q_j^{s_j} \in N_S$, and thus $q \in \langle N_S, J_S\rangle$.

We now prove part (iii). Each column of $K_S$ is an element of $\ker(\pi_2(S))$. Thus it is clear that $K_S \subseteq N_S$, and so $\langle K_S, J_S\rangle \subseteq \langle N_S, J_S\rangle$. We will now show that $N_S \subseteq \langle K_S, J_S\rangle$, which will prove $\langle N_S, J_S\rangle = \langle K_S, J_S\rangle$, and then by (ii) we will obtain $\langle K_S, J_S\rangle = I_S$. For this, take any $v \in \ker(\pi_2(S))$. Since the columns of $K$ are a generating set for $\ker(\pi_2(S))$, we have $v = Kw$ for some $w \in \mathbb{Z}_d^{\ell}$. Then using the group commutators one obtains $\prod_{j=0}^{k-1} q_j^{v_j} = \lambda\lambda'\lambda'' \prod_{i}\big(\prod_{j=0}^{k-1} q_j^{K_{ji}}\big)^{w_i}$ for $\lambda, \lambda', \lambda'' \in \langle J_S\rangle$. Since $\prod_{j=0}^{k-1} q_j^{K_{ji}} \in \langle K_S\rangle$ for every $i$, we deduce that $\prod_{j=0}^{k-1} q_j^{v_j} \in \langle K_S, J_S\rangle$, and since $v$ was arbitrary, we conclude that $N_S \subseteq \langle K_S, J_S\rangle$.
Next, we prove the corollary of Lemma 6.6. There is also a generalization of Lemma 6.14 to the case when the commutator $\langle p, q\rangle_d = c$ appearing in the lemma is not a unit in $\mathbb{Z}_d$. In this case, instead of exact equality, we get lower bounds.

Proof. We first prove (i). It is clear that $a_1 \le a$: if not, since $p^a$ is equivalent to $I$ up to phase factors, we have $\langle p^a, q\rangle_d = 0$, which contradicts (c). Moreover, if $a_1$ does not divide $a$, i.e., $a = t a_1 + s$ for some integers $t \ge 0$ and $0 < s < a_1$, then we also obtain a contradiction to (c), as $0 = \langle p^a, q\rangle_d = \langle p^s, q\rangle_d$. Similarly, we can argue that $b_1$ divides $b$. Also, it is easy to see that $a$ divides $d$: if this were not true, then $d = ta + s$ for some integers $t \ge 0$ and $0 < s < a$, which implies $p^d = (p^a)^t p^s$; since both $p^d$ and $p^a$ are equivalent to $I$ up to phase factors, we conclude that the same is true for $p^s$, which contradicts (b). Similarly, we may also conclude that $b$ divides $d$.
Next, suppose that the order of $c$ in $\mathbb{Z}_d$ is $\gamma$ with $0 < \gamma < a_1$. Then $c\gamma \bmod d = 0$, which implies $\langle p^\gamma, q\rangle_d = c\gamma \bmod d = 0$, contradicting (c). Thus the order of $c$ in $\mathbb{Z}_d$ is at least $a_1$. On the other hand, $c a_1 \bmod d = \langle p^{a_1}, q\rangle_d = 0$, implying the order of $c$ is at most $a_1$. From this we conclude that the order of $c$ is $a_1$. By repeating the same argument, using the fact that $b_1$ is the smallest positive integer such that $\langle p, q^{b_1}\rangle_d = 0$, we also get that the order of $c$ equals $b_1$, and thus $a_1 = b_1$.

Now we prove (ii). All quotient groups appearing in the following argument will be treated as subgroups of $\mathcal{P}_n / K$. Define the sets $P = \{p^j : j = 0, 1, \dots, a-1\}$ and $Q = \{q^j : j = 0, 1, \dots, b-1\}$. By (b) we know that the elements in the set $P$ (or $Q$) are all distinct, even up to phase factors. This implies that the quotient groups $\langle p\rangle / I_{\{p\}}$ and $\langle q\rangle / I_{\{q\}}$ are in one-to-one correspondence with the sets $P$ and $Q$, respectively, where the identification is made by taking equivalence up to phase factors. Next, we make two claims, which we prove at the end.

Claim A: Let $H \subseteq \mathcal{P}_n$ be a subgroup such that $p$ commutes with every element of $H$. Define $H_Q := \{q^j \in Q : \omega^t q^j \in H \text{ for some } t \in \mathbb{Z}_d\}$. Then $H_Q \subseteq \{q^j : 0 \le j \le b-1,\ j \equiv 0 \pmod{a_1}\}$.

Claim B: Let $H \subseteq \mathcal{P}_n$ be a subgroup such that $q$ commutes with every element of $H$. Define $H_P := \{p^j \in P : \omega^t p^j \in H \text{ for some } t \in \mathbb{Z}_d\}$. Then $H_P \subseteq \{p^j : 0 \le j \le a-1,\ j \equiv 0 \pmod{a_1}\}$.
Consider the group ⟨G⟩, each of whose elements commutes with q (resp. p) by (a). Then it follows from Claim B and Claim A respectively that |⟨G⟩/I_G ∩ ⟨p⟩/I_{p}| ≤ a/a_1 and |⟨G⟩/I_G ∩ ⟨q⟩/I_{q}| ≤ b/a_1. This now implies |⟨G, p⟩/I_{G∪{p}}| ≥ (a/(a/a_1)) |⟨G⟩/I_G| = a_1 |⟨G⟩/I_G|, and likewise |⟨G, q⟩/I_{G∪{q}}| ≥ a_1 |⟨G⟩/I_G|. For the bound on |⟨G, p, q⟩/I_{G∪{p,q}}|, consider the subgroup generated by G′ := G ∪ {p}. Then every element of ⟨G′⟩ commutes with p. Again by Claim A, we have |⟨G′⟩/I_{G′} ∩ ⟨q⟩/I_{q}| ≤ b/a_1, and it then follows that |⟨G, p, q⟩/I_{G∪{p,q}}| ≥ a_1 |⟨G′⟩/I_{G′}| ≥ a_1^2 |⟨G⟩/I_G|, where we used Eq. (45) in the last inequality.
We now prove Claim A; the proof of Claim B is exactly similar. For contradiction, assume that there exists q^j ∈ H_Q such that j ≢ 0 (mod a_1). Since q^j is equivalent to some element of H up to phase factors, and p commutes with every element of H, we have 0 = ⟨p, q^j⟩_d = cj mod d. But this implies that j must be a multiple of a_1, as a_1 is the order of c (specifically, the order being a_1 implies that cγ mod d ≠ 0 for every integer 0 < γ < a_1). Thus we have a contradiction, and the claim is proved.

Proof. For part (i), suppose that v ∈ π_2(⟨S⟩). This means that there exists p ∈ ⟨S⟩ such that π_2(p) = v. Now one can write p = ∏_{j=0}^{ℓ} h_j, where each h_j ∈ S, which upon rearrangement of the order of the product using group commutators, and simplification using s_j^d = ±I for each j (by Lemma 6.1), gives p = λ ∏_j s_j^{α_j} for some phase factor λ and exponents α_j ∈ Z_d. Hence v = π_2(p) is a Z_d-linear combination of the columns of π_2(S), i.e. v ∈ M; the converse follows similarly, since any such linear combination is realized by a corresponding product of elements of S.

For part (ii), let the ordered multiset T := {q_0, q_1, ..., q_{ℓ−1}} ⊆ ⟨S⟩ be a generating set of ⟨S⟩, and let M′ be the module generated by the columns of π_2(T) ∈ Z_d^{2n×ℓ}. Take v ∈ M. Then by part (i), v ∈ π_2(⟨S⟩). Since ⟨T⟩ = ⟨S⟩ by assumption, we again get v ∈ M′ by part (i). Thus M ⊆ M′, and running the argument backwards gives M′ ⊆ M. Note that by Lemma 2.4, the minimal number of generators of M is Θ(M) = r. Part (ii) implies that the columns of π_2(T) generate the submodule M, and so by Corollary 6.7, π_2(T) has r invariant factors. Also, if |{v ∈ T : π_2(v) ≠ 0}| < r, then this leads to a contradiction, as it implies that the minimal number of generators of M is less than r.
Part (iv) follows because S is a generating set for ⟨S⟩ of size k, and part (iii) implies that any generating set of ⟨S⟩ must have size at least k.
For part (v), if r = k, we can take T′ = S and any p ∈ I_S such that ⟨p⟩ = I_S. So assume r < k, and denote Q̃ := Q ( D 0 0 0 ) = ( Q̂ 0 ), where Q̂ ∈ Z_d^{2n×r} has no zero columns and all columns distinct (by the assumed invertibility of Q and the uniqueness of the Smith normal form). We define the ordered multiset U := {u_0, u_1, ..., u_{k−1}} ⊆ ⟨S⟩, with u_i := ∏_{j=0}^{k−1} s_j^{P_{ji}}. The equation π_2(S)P = Q ( D 0 0 0 ) then implies that π_2(u_i) equals the i-th column of Q̃ for every i, from which we deduce that for every i = r, r+1, ..., k−1, u_i = ω^{μ_i} I ∈ I_S for some μ_i ∈ Z_d. Now Theorem 6.3 implies that ⟨S⟩ = ⟨U⟩, as P is invertible. Also, Lemma 6.5(ii) implies that there exists μ ∈ Z_d such that ⟨ω^μ I⟩ = I_S. Combining these facts, we deduce that ⟨S⟩ = ⟨u_0, u_1, ..., u_{r−1}, ω^μ I⟩. Thus we may choose T′ = {u_0, u_1, ..., u_{r−1}} and p = ω^μ I. We now prove properties (a)-(c) for any such generating set. Note that the columns of π_2(T) and π_2(T′) both generate the same submodule.
Then property (a) follows by part (ii), while property (b) follows by part (iii) and Corollary 6.7. Property (c) holds because if π_2(T′) had two equal columns, or a zero column, then the number of invariant factors of π_2(T′) (which equals the minimal number of generators of the submodule generated by the columns of π_2(T′)) would be less than r, contradicting property (b).
Finally, we present the proof of Lemma 6.10.

Lemma 6.10. Given S ⊆ P_n, suppose that T := T′ ∪ {p} is a near-minimal generating set of ⟨S⟩ with ⟨p⟩ = I_S. Let d be prime. Then the following conditions are equivalent.
(i) T is a minimal generating set of ⟨S⟩.
(ii) ⟨T ′ ⟩ is a stabilizer subgroup of P n , and I S ̸ = {I}.
Proof. Let the number of invariant factors of π_2(S) be r ≥ 1, and suppose T′ = {q_0, q_1, ..., q_{r−1}}. We first note a few facts about T′. Since d is prime, Z_d is a field. By Lemma B.4(v), we know that π_2(T′) has r ≥ 1 invariant factors, and since |T′| = r, this implies that the kernel of the matrix π_2(T′) is trivial (as Z_d is a field). Now let T′′ := {p^{γ_0} q_0, p^{γ_1} q_1, ..., p^{γ_{r−1}} q_{r−1}}, for some (γ_0, γ_1, ..., γ_{r−1}) ∈ Z_d^r. Then by Lemma 6.2(iv), we know that ⟨J_{T′}⟩ = ⟨J_{T′′}⟩. Also, since π_2(T′) = π_2(T′′), we conclude that the kernel of π_2(T′′) is trivial. Combining these observations, and using Lemma 6.4(iii), we deduce that I_{T′} = I_{T′′}. Let us call this Fact (a). The other fact we need is that, since d is prime, ⟨ω^j I⟩ = {ω^ℓ I : ℓ ∈ Z_d} for every j ∈ Z_d \ {0}. Let us call this Fact (b).

We now return to the proof. First we prove that (i) implies (ii). So assume that T is a minimal generating set of ⟨S⟩. For contradiction, assume that I_S = {I}, which then implies p = I. Hence T cannot be a minimal generating set, as ⟨T′⟩ = ⟨T′, p⟩. Thus I_S ≠ {I}. Next, again for contradiction, assume that ⟨T′⟩ is not a stabilizer group. This means that there exists ω^j I ∈ ⟨T′⟩ for some j ∈ Z_d \ {0}, and then Fact (b) implies that p ∈ ⟨T′⟩. Thus again we conclude that T is not a minimal generating set of ⟨S⟩.
Next we prove that (ii) implies (i). So assume that ⟨T′⟩ is a stabilizer group and I_S ≠ {I}, which imply that I_{T′} = {I} and p ≠ I, respectively. For contradiction, assume that T is not a minimal generating set of ⟨S⟩. Then by Theorem 6.9(iii), we can conclude that there exist integers γ_0, γ_1, ..., γ_{r−1} ∈ Z_d such that p ∈ ⟨T′′⟩, where T′′ := {p^{γ_0} q_0, p^{γ_1} q_1, ..., p^{γ_{r−1}} q_{r−1}}. Now by Fact (a) we also have I_{T′′} = I_{T′} = {I}. Hence we can conclude that p = I, giving a contradiction. This proves the lemma.

C Maximum collections of non-commuting pairs
While subsection 5.1 gave both necessary and sufficient conditions for a k-tuple (f 0 , . . ., f k−1 ) ∈ (Z d \ {0}) k to be an achievable non-commuting pair relation on n qudits, it is possible to give a more direct characterization of which k-tuples are achievable non-commuting pair relations for the case k = nm, i.e. when we have the maximum number of non-commuting pairs on n qudits.
To simplify the presentation, let us introduce the following notation.
Given f_i ∈ Z_d \ {0} as in the above notation, it is useful to write down explicitly for which values of β_i ∈ Z_d \ {0} we have β_i f_i ≢ 0 (mod d). This is easy to calculate. We start by writing f_i = h gcd(f_i, d), the gcd evaluated over the integers, so that gcd(h, d) = 1. Now consider the set δ_i := {t d/gcd(f_i, d) : t ∈ Z, 1 ≤ t < gcd(f_i, d)}. Then we claim that β_i f_i ≢ 0 (mod d) if and only if β_i ∉ δ_i. The forward direction is easy (we prove the contrapositive): if β_i ∈ δ_i, then we have β_i f_i = htd for some integer t, and so β_i f_i ≡ 0 (mod d). For the converse direction, assume for contradiction that β_i f_i ≡ 0 (mod d) and β_i ∈ (Z_d \ {0}) \ δ_i. This means that d divides h gcd(f_i, d) β_i, and since gcd(h, d) = 1 this implies that β_i is a multiple of d/gcd(f_i, d), which is a contradiction.
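This characterization is easy to verify by brute force for small d; a quick sketch (function names are ours):

```python
from math import gcd

def delta(f: int, d: int) -> set[int]:
    # delta_i = { t*d/gcd(f_i, d) : 1 <= t < gcd(f_i, d) }, i.e. the nonzero
    # multiples of d/gcd(f, d) in Z_d
    g = gcd(f, d)
    return {t * (d // g) for t in range(1, g)}

def matches_claim(f: int, d: int) -> bool:
    # beta*f is nonzero mod d exactly when beta lies outside delta_i
    D = delta(f, d)
    return all(((b * f) % d != 0) == (b not in D) for b in range(1, d))
```

For example, with d = 12 and f = 4 one finds delta(4, 12) = {3, 6, 9}, and matches_claim(f, 12) holds for every f ∈ Z_12 \ {0}.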
With the notation above, we state a helper lemma.

Lemma C.1. If f := (f_0, ..., f_{k−1}) is an achievable non-commuting pair relation on n qudits, then (i) f_σ is an achievable non-commuting pair relation on n qudits, for every σ ∈ Sym(k).

There exists an invertible matrix A ∈ Z_d^{(ℓ+k)×(ℓ+k)} such that AL = H, where H ∈ Z_d^{k×(k+ℓ)} is a matrix in Howell normal form. Moreover, the matrix H is uniquely determined.

|C_m| = Ψ(d), thus proving Theorem 4.1. The first step is to define a special subset of vertices W ⊆ V, where (a, b) ∈ W if and only if gcd(a, b, d) = 1. We prove the following lemma in Appendix B.2.

Lemma 4.2. The graph G = (V, E) has the following properties.
((a, b), (s, t)) ∉ E if and only if there exists u ∈ Z with gcd(u, d) = 1 such that (a, b) = (us, ut), the right-hand side evaluated modulo d. Lemma 4.2(i) implies there is a maximum clique that is a subset of W. Part (ii) implies transitivity of an equivalence relation u ∼ v for u, v ∈ W, where u and v are said to be equivalent if (u, v) ∉ E. Part (iii) says the equivalence classes of W are all the same size, exactly the size of the group of units of Z_d, which is ϕ(d) = d ∏_{j=0}^{m−1} (1 − 1/p_j), Euler's totient function. Combining these facts, the size of a maximum clique of G is |C_m| = |W|/ϕ(d). Thus the remaining step is to count |W|.
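The counting argument above can be checked directly for small d; a brute-force sketch (our own code, with W, ϕ, and the clique size exactly as defined in the text):

```python
from math import gcd

def phi(d: int) -> int:
    """Euler's totient: size of the group of units of Z_d."""
    return sum(1 for u in range(1, d + 1) if gcd(u, d) == 1)

def W_size(d: int) -> int:
    """|W|, where (a, b) is in W iff gcd(a, b, d) = 1."""
    return sum(1 for a in range(d) for b in range(d) if gcd(gcd(a, b), d) == 1)

def max_clique_size(d: int) -> int:
    """|C_m| = |W| / phi(d), per the argument in the text."""
    w = W_size(d)
    assert w % phi(d) == 0  # the equivalence classes all have size phi(d)
    return w // phi(d)
```

For a prime d this gives |C_m| = (d^2 − 1)/(d − 1) = d + 1; for instance max_clique_size(5) returns 6.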

Theorem 5.4 (Minimum qudits achieving commutation relations). If C ∈ Z_d^{k×k} is an alternating matrix, and Q ∈ Z_d^{k×2n} satisfies QΛQ^T = C, then n ≥ Θ(M_C)/2, where M_C is the submodule generated by the columns of C. Moreover, there exists a matrix P ∈ Z_d^{k×Θ(M_C)} such that C = PΛP^T. The rows of P specify k Paulis on Θ(M_C)/2 qudits possessing the commutation relations specified by C.

For some particular forms of alternating matrix C ∈ Z_d^{k×k}, one can compute Θ(M_C) analytically. We mentioned one such case already in Lemma 2.2. Another case occurs when all entries in the upper-triangular part of C, except the diagonal, are equal. That is, suppose C is alternating with C_ij = t ∈ Z_d \ {0} for all j > i. Then applying Lemma 5.3 again gives matrices L, B ∈ Z_d^{k×k}, with L invertible and B alternating and of the form given by the lemma. Let us calculate the quantities d_0, d_1, ..., d_k as defined in Lemma 5.3. First suppose t = 1. Then we have: (i) d_0 = 1 by definition.

Lemma 6.6. Let A, B ∈ Z_d^{k×ℓ}. Then the following conditions are equivalent. (i) The submodules of Z_d^k generated by the columns of A and B are the same. (ii) There exists an invertible matrix C ∈ Z_d^{ℓ×ℓ} such that A = BC.

Also note the following simple corollary of this lemma, which we prove for completeness in Appendix B.4.

Corollary 6.7. Let A ∈ Z_d^{k×p} and B ∈ Z_d^{k×q}. If the columns of A and B generate the same submodule of Z_d^k, then the numbers of invariant factors of A and B are equal.

Lemma 6.8 (The size of a smallest generating set). For any ordered multiset S ⊆ P_n, if π_2(S) ∈ Z_d^{2n×k} has r invariant factors in its Smith normal form, then the smallest generating set of ⟨S⟩ has either r or r + 1 elements.

Example 3. Suppose S = {X, ωI} ⊆ P_n, for any d > 2 and any number of qudits n. Then one checks easily that ⟨S⟩ = {ω^j X^a : j, a ∈ Z_d}. Thus |⟨S⟩| = d^2. Computing the Smith normal form of π_2(S)

Lemma 6.11 (Gram-Schmidt generators). Let S = {s_0, s_1, ..., s_{k−1}} ⊆ P_n, and let A_ij = ⟨s_i, s_j⟩_d define a matrix A ∈ Z_d^{k×k}. The group G := ⟨S⟩ has a Gram-Schmidt generating set S_1 ∪ S_2 ∪ U where |S_1| = |S_2| = Θ(M_A)/2, M_A is the submodule of Z_d^k generated by the columns of A, and moreover there does not exist a Gram-Schmidt generating set with smaller S_1 or S_2.

Proof. Constructing a Gram-Schmidt generating set makes use of the alternating Smith normal form from Lemma 5.3. Denote by Q ∈ Z_d^{k×2n} the matrix whose rows correspond to the Paulis in S, ignoring the phase factors. Then A = QΛQ^T ∈ Z_d^{k×k}. Now Lemma 5.3 implies the existence of L, B ∈ Z_d^{k×k} such that A = LBL^T, where L is invertible and B is of the form specified by the lemma, whose top-left block has the form ⊕_{i=1}^{r} ( 0 β_i ; −β_i 0 ) for non-zero β_1, β_2, ..., β_r, all other entries being zero. Here r = Θ(M_A)/2.

Let π_2(S) have a Smith normal form D ∈ Z_d^{2n×k}, a diagonal matrix satisfying π_2(S)P = QD, where P ∈ Z_d^{k×k} and Q ∈ Z_d^{2n×2n} are invertible, and the first r (and only) non-zero diagonal entries of D are d_0, d_1, ..., d_{r−1}. Let the first r columns of Q be Q_0, Q_1, ..., Q_{r−1}. Then the submodule of Z_d^{2n} generated by the columns of π_2(S) equals that generated by the set {d_i Q_i : i = 0, 1, ..., r−1}; call this submodule M. It is then clear that |⟨S⟩| = |I_S| |⟨S⟩/I_S| = |I_S| |M|.

Theorem 6.15 (The square-free theorem). Suppose d = p_0 p_1 ⋯ p_{m−1} is square-free. If S, T is a collection of nm non-commuting pairs, then ⟨S, T⟩ = P_n.

Figure 1: A schematic of the matrix B_{i−1} at the start of iteration i of the Smith normal form construction. Non-shaded regions are guaranteed to be zero except for at most 2(i − 1) existing pivotal entries. Dark yellow squares indicate the entries that iteration i is making into pivotal entries. Yellow-shaded regions contain entries that must be zeroed out to complete iteration i.
where α is the row index of a and β the row index of b. Then B′ ← KB′K^T has replaced the entry a with gcd(a, b) and b with 0, while also preserving the antisymmetry of B′ (since we likewise took linear combinations of columns by acting with K^T on the right).
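The elementary step that sends a to gcd(a, b) and b to 0 is the standard extended-Euclid row operation; the following is a minimal self-contained sketch of the 2×2 unimodular block one conjugates by (a sketch of the standard construction for non-negative a, b, not the paper's exact K):

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with x*a + y*b = g = gcd(a, b), for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def gcd_block(a: int, b: int):
    """2x2 integer matrix K with det(K) = 1 sending (a, b)^T to (gcd(a, b), 0)^T."""
    g, x, y = extended_gcd(a, b)
    # first row combines a, b into their gcd; second row annihilates b
    return [[x, y], [-b // g, a // g]]
```

For example, gcd_block(12, 18) applied to (12, 18)^T yields (6, 0)^T; embedding K into rows α, β and conjugating B′ by it (with K^T on the columns) performs exactly the update described above.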

The matrix L is invertible in Z_d^{k×k} since det(L) = det(L̄) mod d = ±1. The matrix B is alternating in Z_d^{k×k} since B̄ was alternating in Z^{k×k}. Moreover, any zero entry in B̄ remains zero in B, while some non-zero entries of B̄ may become zero in B, due to the modulo d operation. The divisibility and uniqueness parts of Lemma A.2 are also modified in accordance with the following observations on divisibility in Z_d.

Algorithm 3: Given an alternating matrix A ∈ Z^{k×k}, find L, B ∈ Z^{k×k} such that A = LBL^T and B, L have the properties stated in Lemma A.2.
1: procedure Alternating Smith Normal Form(A)

Lemma 2.1. Suppose C ∈ R^{k×t}, and A ∈ R^{k×k}, B ∈ R^{t×t} are invertible matrices. Let C̄ := ACB. The minimal numbers of generators of the submodules M_C and M_C̄ of R^k are equal.

Proof. Let Ĉ := CB, let its columns be ĉ_0, ĉ_1, ..., ĉ_{t−1}, and let M_Ĉ be the submodule of R^k generated by the columns of Ĉ. For every i, each column ĉ_i ∈ M_C is a linear combination of the columns of C, from which we conclude that M_Ĉ ⊆ M_C. By the invertibility of B, we also have C = ĈB^{−1}, and by the same argument we conclude that M_C ⊆ M_Ĉ. Thus M_C = M_Ĉ, and so we have Θ(M_C) = Θ(M_Ĉ). It remains to show that Θ(M_C̄) = Θ(M_Ĉ), where C̄ = AĈ.

(iii) If (a, b) and (s, t) are in W, then ((a, b), (s, t)) ∉ E if and only if there exists u ∈ Z with gcd(u, d) = 1 such that (a, b) = (us, ut), the right-hand side evaluated modulo d.

Due to the last column of A being a linear combination of the first two columns and (d, d, d)^T, we have det(A) ≡ 0 (mod d). Also, for c_i = x a_i + y b_i + z d, the cofactor expansion gives det(A) = c_0 det ( a_1 b_1 ; a_2 b_2 ) − c_1 det ( a_0 b_0 ; a_2 b_2 ) + c_2 det ( a_0 b_0 ; a_1 b_1 ) ≡ − c_1 det ( a_0 b_0 ; a_2 b_2 ) (mod d), using that the other two 2×2 minors vanish modulo d (as (v_0, v_1), (v_1, v_2) ∉ E). Thus we have det ( a_0 b_0 ; a_2 b_2 ) ≡ 0 (mod d), proving (v_0, v_2) ∉ E.

The reverse direction of part (iii) is a simple calculation. The forward direction is more involved. Apply the Smith form (Theorem 2.3) to obtain A := ( a b ; s t ) = S^{−1} B T for invertible matrices S, T ∈ Z_d^{2×2} and diagonal B ∈ Z_d^{2×2} with B_00 = gcd(a, b, s, t) mod d = gcd(a, b, s, t) and B_11 = (det(A)/B_00) mod d. The formulas for B_00 and B_11 follow from [72, Theorem 2.4]; even though Z_d is not a unique factorization domain (as required to apply the theorem), one may simply compute the Smith normal form of A over Z, which is a unique factorization domain, and then reduce the resulting matrices modulo d, leading to the formulas for B_00 and B_11. Since ((a, b), (s, t)) ∉ E, we know det(A) ≡ 0 (mod d), and because we also have gcd(a, b, d) = 1, we get B_11 = 0. Rearrange to obtain SA = BT, and note that the last row of BT is (0, 0). This implies S_10 (a, b) + S_11 (s, t) = (0, 0),
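The B_00, B_11 formulas can be checked by computing the Smith form over Z and reducing mod d, exactly as the text prescribes; a small sketch (our own helper):

```python
from math import gcd

def smith_2x2_mod_d(a: int, b: int, s: int, t: int, d: int):
    """Diagonal of the Smith normal form of [[a, b], [s, t]], computed over Z
    and then reduced modulo d: B_00 = gcd(a, b, s, t), B_11 = det/B_00.
    (Over Z the invariant factors are defined up to sign, i.e. up to a unit.)"""
    g = gcd(gcd(a, b), gcd(s, t))
    if g == 0:                       # the zero matrix
        return 0, 0
    det = a * t - b * s              # g always divides det for a 2x2 matrix
    return g % d, (det // g) % d
```

For instance, the matrix ( 1 2 ; 3 6 ) has det ≡ 0 and gcd of entries 1, so its Smith form mod any d is diag(1, 0), matching the B_11 = 0 case used in the proof.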

Corollary 6.7. Let A ∈ Z_d^{k×p} and B ∈ Z_d^{k×q}. If the columns of A and B generate the same submodule of Z_d^k, then the numbers of invariant factors of A and B are equal.

Proof. Without loss of generality, assume that p ≤ q. Define Ā := ( A 0 ) ∈ Z_d^{k×q}, where we have added q − p zero columns to A. Then the columns of Ā and B still generate the same submodule of Z_d^k. Thus by Lemma 6.6 we may conclude that there exists an invertible matrix C ∈ Z_d^{q×q} such that Ā = BC. Next, suppose that A has the Smith normal form AP = QD, where P, Q are invertible matrices and D is a diagonal matrix (all matrices over the ring Z_d). Let D have r non-zero elements. Then we note that Ā ( P 0 0 I ) = Q ( D 0 ), or equivalently BC ( P 0 0 I ) = Q ( D 0 ), and since C ( P 0 0 I ) is invertible, we can conclude by the uniqueness part of the Smith normal form (Theorem 2.3) that ( D 0 ) is the Smith normal form of B, and hence B also has r invariant factors.
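Corollary 6.7 can be sanity-checked numerically using the classical characterization of the Smith form over Z via gcds of k×k minors, then reducing modulo d, as in the B_00, B_11 discussion above (a brute-force sketch for small matrices; helper names are ours):

```python
from math import gcd
from itertools import combinations

def det(M):
    """Determinant of a small integer matrix by cofactor expansion."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def invariant_factors(A):
    """Invariant factors of A over Z: f_k = g_k / g_{k-1}, where g_k is the
    gcd of all k x k minors (g_0 = 1)."""
    m, n = len(A), len(A[0])
    gs = [1]
    for k in range(1, min(m, n) + 1):
        g = 0
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                g = gcd(g, det([[A[r][c] for c in cols] for r in rows]))
        if g == 0:          # all larger minors vanish; remaining factors are 0
            break
        gs.append(g)
    return [gs[i] // gs[i - 1] for i in range(1, len(gs))]

def num_invariant_factors_mod_d(A, d):
    """Number of invariant factors that remain non-zero after reduction mod d."""
    return sum(1 for f in invariant_factors(A) if f % d != 0)
```

Padding with zero columns, as in the proof, leaves the count unchanged: the matrices [[2, 4], [6, 8]] and [[2, 4, 0], [6, 8, 0]] have the same number of invariant factors mod 10.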
and assume the following: (a) Each element in G commutes with both p and q. (b) a, b are the smallest positive integers such that p^a and q^b are equivalent to I up to phase. (c) a_1, b_1 are the smallest positive integers such that ⟨p^{a_1}, q⟩_d = ⟨p, q^{b_1}⟩_d = 0. Then the following statements are true: (i) a_1 = b_1; a_1 divides both a and b; and both a and b divide d, where division is over the integers. Moreover, the order of c in Z_d is a_1. (ii) We have the lower bounds |⟨G, p⟩/I_{G∪{p}}| ≥ a_1 |⟨G⟩/I_G|, |⟨G, q⟩/I_{G∪{q}}| ≥ a_1 |⟨G⟩/I_G|, and |⟨G, p, q⟩/I_{G∪{p,q}}| ≥ a_1^2 |⟨G⟩/I_G|.

Lemma 6.8 follows from parts (iii)-(v) of the next lemma. Several parts of this stronger lemma are used in subsequent proofs in Section 6.1.

Lemma B.4. Let π_2(S) ∈ Z_d^{2n×k} have the Smith normal form ( D 0 0 0 ), so that π_2(S)P = Q ( D 0 0 0 ), for D ∈ Z_d^{r×r} a diagonal matrix with all diagonal entries non-zero, r ≥ 1, and invertible matrices P ∈ Z_d^{k×k}, Q ∈ Z_d^{2n×2n}. Let M denote the submodule of Z_d^{2n} generated by the columns of π_2(S). Then we have the following:
(i) v ∈ π_2(⟨S⟩) if and only if v ∈ M, where π_2(⟨S⟩) can be regarded as a multiset of size |⟨S⟩|.
(ii) If T ⊆ ⟨S⟩ is a generating set of ⟨S⟩, then the submodule of Z_d^{2n} generated by the columns of π_2(T) equals M.
(iii) If T ⊆ ⟨S⟩ is a generating set of ⟨S⟩, then T contains a subset T′, with |T′| ≥ r, such that u ∈ T′ implies π_2(u) ≠ 0. Moreover, the number of invariant factors of π_2(T) is r.
(iv) If r = k, then S is a generating set of ⟨S⟩ of the smallest size.
(v) If r ≤ k, there exists a generating set T = T′ ∪ {p} of ⟨S⟩ such that ⟨p⟩ = I_S and |T′| = r. Moreover, any such generating set has the following properties: (a) the columns of π_2(T′) generate the submodule M, (b) the matrix π_2(T′) has r invariant factors, and (c) for distinct elements q, r ∈ T′, π_2(q) and π_2(r) are distinct and non-zero.

Suppose C ∈ R^{k×t}, and A ∈ R^{k×k}, B ∈ R^{t×t} are invertible matrices; recall that a square matrix with entries in R is invertible if and only if the determinant of the matrix is a unit in R [47, Corollary 2.21]. Let C̄ := ACB ∈ R^{k×t}, let its columns be c̄_0, c̄_1, ..., c̄_{t−1}, and let M_C̄ be the submodule of R^k generated by the columns of C̄. The important property of the minimal number of generators that we will need is the following lemma, which is well-known (although we also provide a proof in Appendix B.1):

Lemma 2.1. The minimal numbers of generators of the submodules M_C and M_C̄ of R^k are equal.
evaluated over Z_d. This equation means gcd(S_10, d) divides S_11 s and S_11 t over the integers. Because gcd(s, t, d) = 1, it must then be that gcd(S_10, d) also divides S_11 over the integers. Therefore, gcd(S_10, d) divides det(S) over the integers, and hence also over Z_d. Since S is invertible, det(S) (modulo d) is a unit of the ring Z_d, so its only divisors are other units. This means gcd(S_10, d) = 1. Likewise, we can argue gcd(S_11, d) = 1. So S_10, S_11, and S_11 (S_10)^{−1} are all units in Z_d. Multiply Eq. (37) by S_10^{−1} and rearrange to finish the proof.

The other lemma we prove here involves the arithmetic function H(d) = |W|/d. We argued in Section 4 that it is equal to