Cost-optimal single qubit gate synthesis in the Clifford hierarchy

For universal quantum computation, a major challenge to overcome for practical implementation is the large amount of resources required for fault-tolerant quantum information processing. An important aspect is implementing arbitrary unitary operators built from logical gates within the quantum error correction code. A synthesis algorithm can be used to approximate any unitary gate up to arbitrary precision by assembling sequences of logical gates chosen from a small set of universal gates, which are fault-tolerantly performable while encoded in a quantum error-correction code. However, current procedures do not yet support individual assignment of base gate cost values and many do not support extended sets of universal base gates. We study cost-optimal sequences synthesised from sets of base gates which include Clifford gates and $Z$-rotation gates from higher orders of the Clifford hierarchy, which can be performed fault-tolerantly on error-correction codes using magic state distillation protocols. The individual costs assigned are the average numbers of raw (i.e. physical level) magic states required to implement the gates fault-tolerantly. By including the $Z$-rotation gates from the fourth order of the Clifford hierarchy as base gates in addition to the canonical Clifford+$T$ gates, we find that the average cost decreases by up to $30\%$. The gate synthesis algorithm introduced in this work, based on Dijkstra's algorithm, generates cost-optimal sequences for single-qubit target gates and supports arbitrary universal sets of single-qubit base gates with individually assigned cost values. In addition, we develop an analytic model to estimate the proportion of sets of $Z$-rotation gates from higher orders of the Clifford hierarchy among gates within sequences approximating random target gates, which can be used to estimate each order's effectiveness for the purpose of gate synthesis.

For universal quantum computation, a major challenge to overcome for practical implementation is the large amount of resources required for fault-tolerant quantum information processing. An important aspect is implementing arbitrary unitary operators built from logical gates within the quantum error correction code. A synthesis algorithm can be used to approximate any unitary gate up to arbitrary precision by assembling sequences of logical gates chosen from a small set of universal gates that are fault-tolerantly performable while encoded in a quantum error-correction code. However, current procedures do not yet support individual assignment of base gate costs and many do not support extended sets of universal base gates. We analysed cost-optimal sequences using an exhaustive search based on Dijkstra's pathfinding algorithm for the canonical Clifford+T set of base gates and compared them to when additionally including Z-rotations from higher orders of the Clifford hierarchy. Two approaches of assigning base gate costs were used. First, costs were reduced to T-counts by recursively applying a Z-rotation catalyst circuit. Second, costs were assigned as the average numbers of raw (i.e. physical level) magic states required to directly distil and implement the gates fault-tolerantly. We found that the average sequence cost decreases by up to 54 ± 3% when using the Z-rotation catalyst circuit approach and by up to 33 ± 2% when using the magic state distillation approach. In addition, we investigated observed limitations of certain assignments of base gate costs by developing an analytic model to estimate the proportion of sets of Z-rotation gates from higher orders of the Clifford hierarchy that are found within sequences approximating random target gates.

Introduction
Quantum computing has the potential to solve many real-world problems by using significantly fewer physical resources and computation time than the best known classical algorithms. The quantum algorithms for these problems are implemented using deep quantum circuits. Thus, to reliably implement these circuits, qubits within the devices require long coherence times and high precision control. Current systems consist of physical qubits that are too noisy for large scale computation. Error-correction schemes provide the ability to overcome this hurdle by entangling clusters of physical qubits in such a way that they collectively encode the information into more robust logical qubits. In principle, when physical qubits have error-rates below the error threshold of the error-correction scheme, logical qubits within the code can be made arbitrarily robust using increasing numbers of qubits. A particular error-correction scheme with a relatively high physical error threshold of approximately 1% is the surface code, which is implemented over a nearest-neighbour two-dimensional physical layout, making it one of the most realistically implementable schemes [1][2][3][4]. In this work, we analyse the resource costs for gate synthesis, which is used to fault-tolerantly implement arbitrary unitary gates in error-correction codes.
The surface code, among other high-threshold codes, is limited to a small set of Clifford gates over logical qubits that can be performed with relative ease. A procedure called magic state distillation can be used to perform a wider range of non-Clifford gates fault-tolerantly, such as the $T := R_z(\pi/4)$ gate (up to global phase), which cannot be produced using only Clifford gates [5,6]. Initially, raw magic states are surgically injected into the code and, with the aid of state distillation procedures, a number of raw magic states are consumed to produce a smaller number of more robust magic states. In principle, the procedures can be recursively applied to obtain states with arbitrarily low noise, although this requires large amounts of physical resources. These purified magic states can then be consumed to fault-tolerantly perform corresponding gates using quantum teleportation circuits. Distillation procedures only exist for a subset of gates, so in order to implement arbitrary unitary gates, the Solovay-Kitaev (SK) theorem can be used. The SK theorem states that a universal set of n-qubit gates generates a group dense in $SU(2^n)$ (Special Unitary), and the set fills $SU(2^n)$ relatively quickly. Hence single-qubit base gates that form a universal set can be multiplied in sequence to approximate any single-qubit gate to arbitrary precision [7,8].
A frequently used set of single-qubit universal base gates for fault-tolerant quantum computation is the Clifford+T set, where the Clifford gates are relatively cheap to apply while the T gate requires a considerable amount of resources due to the magic state distillation procedure. This set of gates and how they can be used to synthesise arbitrary single-qubit gates is a well-studied topic within the quantum compilation literature. Gate synthesis algorithms, besides brute-force [9], began with the Solovay-Kitaev algorithm [8,10]. It initially searches for a base sequence that roughly approximates a target gate and then uses a recursive strategy to append other base sequences in such a way that the new sequence approximates a gate that is closer to the target gate, with the distance reducing efficiently with the number of iterations. It is compatible with arbitrary single-qubit universal gate sets, provided that they include each gate's adjoint. The SK algorithm has room for optimisation with respect to lengths of resulting gate sequences since the recursive process generates strings of disjoint subsequences which are only individually optimised, rather than optimising over the entire sequence. In 2008, Matsumoto and Amano [11] developed a normal form for sequences of Clifford+T gates that produces unique elements in $SU(2)$. Shortly after, Bocharov and Svore [12] introduced their canonical form, which extends the normal form by instead producing unique elements in $PSU(2)$ (Projective Special Unitary), which more concisely describes the space of all physical single-qubit gates by ignoring global phase. This normal form can be used to enumerate length-optimal sequences of Clifford+T base gates which produce distinct gates, considerably reducing the size of the sequence configuration space for search algorithms (although still growing exponentially with respect to sequence length).
More recently, there has been significant progress on developing direct synthesis methods which are not based on search. For target single-qubit unitary gates that can be exactly produced by Clifford+T base gate sequences, a method was developed that optimally and efficiently finds these exact sequences directly [13]. This was later used as a subroutine in algorithms for optimal synthesis of arbitrary single-qubit Z-rotations [14,15]. Direct Clifford+T base gate synthesis methods for Z-rotations have since been generalised to Clifford+cyclotomic (Z-rotation by π/n) sets of base gates [16] and sets derived from totally definite quaternion algebras [17]. For arbitrary single-qubit rotations (not necessarily Z-rotations), a number of other approaches have been developed, such as a randomised algorithm that uses the distribution of primes [18], asymptotically optimal synthesis using ancilla qubits [19], and probabilistic quantum circuits with fallback [20].
It is common within the quantum compilation literature for synthesis algorithms to optimise sequences based on minimising the total number of gates that require magic state injection. This measure is well-suited to the Clifford+T set of base gates which are standard for gate synthesis algorithms, since the T gate and its adjoint are the only gates with a significantly higher cost than the Clifford gates. However, procedures exist for performing alternative gates to the T gate that vary in implementation cost. Examples of such gates are found within the Clifford hierarchy, which is an infinite discrete set of gates that are universal and can be performed on certain error-correcting codes fault-tolerantly [21]. The resource cost of implementation typically varies between orders of the hierarchy. Thus, to accurately cost-optimise sequences from such sets of gates, the cost of each individual base gate should be considered. We investigate two different approaches for implementing Z-rotation gates from the Clifford hierarchy and calculating their resource costs. The first approach is based on a circuit that uses a catalyst Z-rotation state to implement two copies of its corresponding Z-rotation gate using a small number of T gates while retaining the initial Z-rotation state [22,23]. This circuit can enable the average resource costs of implementing Z-rotation gates from the Clifford hierarchy to be expressed as T-counts. Using this approach, costs could be calculated either by assuming that output gates are applied directly to target qubits or by assuming that all output gates are first applied to $|+\rangle$ states to form intermediate magic states, which can then be consumed to implement the corresponding gates onto target qubits at any time.
As an alternative to the Z-rotation catalyst circuit approach of gate implementation, the second approach is to use the average number of raw magic states required to directly distil and implement subsets of gates belonging to the Clifford hierarchy in surface codes. The distillation costs have already been calculated by Campbell and O'Gorman [24] for various levels of precision; the accumulated costs of distilling and then implementing the gates are found within their supplementary materials. Although other factors relating to physical resources are important to consider, such as qubit count, circuit depth, magic state distillation methods, and details of the error-correction implementation, the number of raw magic states can serve as a rough approximation to the cost of implementing fault-tolerant logical gates on surface codes.
We introduce an algorithm, based on Dijkstra's shortest path algorithm, that generates a database of all cost-optimal sequences below a chosen maximum sequence cost where each sequence produces distinct gates in $PSU(2)$. The algorithm supports arbitrary universal sets of single-qubit base gates with individually assigned cost values. The database can then be searched to find a sequence approximating a specified target gate. We use this algorithm to compare the cost of cost-optimal gate synthesis between the canonical Clifford+T base gate set and various sets of base gates consisting of Clifford gates and Z-rotations from higher orders of the Clifford hierarchy. Each set of logical base gates is compared by calculating how the average gate sequence cost for approximating random target gates scales with respect to reaching target gate synthesis logical error rates. When including Z-rotation base gates from higher orders of the Clifford hierarchy with T-counts assigned using the Z-rotation catalyst approach, we find that the average cost-optimal sequence T-counts can potentially be reduced by over 50% when output gates are directly applied to target qubits and by over 30% when intermediate magic states are used. When using the alternative approach of assigning costs from direct magic state distillation, we find that by including Z-rotation logical base gates from the fourth order of the Clifford hierarchy, the average cost-optimal sequence costs can be reduced by 30%. These cost reductions indicate that a significant amount of resources could be saved by adapting current synthesis algorithms to include higher orders of the Clifford hierarchy and to optimise sequences with respect to individual gate costs.
In the cases when costs are assigned using the Z-rotation catalyst method via intermediate magic states or when assigned using direct magic state distillation, we observe that there is only a small improvement to the average costs of synthesis when Z-rotations of orders higher than four of the Clifford hierarchy are included as base gates. We investigate this behaviour by developing a model to estimate the proportion of Z-rotation base gates from specified orders of the Clifford hierarchy within sequences approximating random target gates, without needing to generate the database of sequences. The proportions calculated in this manner closely fit results obtained using the sequence generation algorithm to approximate uniformly distributed random target gates. The parameters of the calculation include the maximum sequence cost and separate logical base gate costs for each order of the Clifford hierarchy, which can readily be extended to specify costs for individual logical base gates.

Base Gates From The Clifford Hierarchy
Figure 1: A Z-rotation catalyst circuit [22,23]. The rotations $R_z(2\pi k/2^n)$ are elements of $T_n$ (as shown in Eq. 2), where k is an odd integer and n is a natural number. The circuit utilises a $|T_n\rangle$ state, a $|T\rangle$ state, three $T_3$ gates and a $T_{n-1}$ gate to perform two $T_n$ gates on two separate qubits while retaining the original $|T_n\rangle$ state. The output $T_n$ gates can either be applied directly to target qubits, or the $|\psi_0\rangle$ and $|\psi_1\rangle$ states can first be set to $|+\rangle$ states, so that the application of the $T_n$ gates prepares two $|T_n\rangle$ states which can then be used to implement $T_n$ gates at any time and on any target qubit using teleportation circuits. However, this consumes on average an additional half a $T_{n-1}$ gate for the implementation of each $T_n$ gate. The two sets of grouped gates (outlined by dashed lines) correspond to logical-AND computation and uncomputation circuits, which require a total T-count of only four to implement [23]. The circuit can be recursively applied until the $R_z(2\pi k/2^{n-1})$ gate position reduces down to a $T_3$ gate, which has a cost of 1. All costs are calculated by assuming that all target gates at each recursive level of the circuit are used at some point (i.e. that no output gates are wasted).

The Clifford hierarchy is an infinite discrete set of gates that are universal for the purposes of quantum computation and can be fault-tolerantly performed on certain error-correcting codes. Each order of the hierarchy is defined as

$$C_l := \{ U \mid U \mathcal{P} U^\dagger \subseteq C_{l-1} \}, \qquad (1)$$

noting that $C_1 = \mathcal{P}$ is the set of Pauli gates, $C_2$ is the set of Clifford gates and $C_3$ includes, among others, the Pauli basis rotations by π/4 such as the T gate. Higher order gates typically correspond to finer angle rotations.
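As a concrete instance of the recursive structure, a gate of order l maps every Pauli gate to a gate of order l − 1 under conjugation. The following worked example checks this for the T gate, written here as $T = \mathrm{diag}(1, e^{i\pi/4})$; this matrix convention is an assumption made for illustration and is not fixed by the text.

```latex
\begin{align}
T X T^\dagger &= \tfrac{1}{\sqrt{2}}\,(X + Y), &
T Y T^\dagger &= \tfrac{1}{\sqrt{2}}\,(Y - X), &
T Z T^\dagger &= Z .
\end{align}
```

Each right-hand side is either a Pauli gate or a rotation by π about an axis in the XY plane, and all of these map Pauli gates to Pauli gates under conjugation. They are therefore Clifford gates, which places the T gate in the third order of the hierarchy.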
In this work, we compare sets of single-qubit universal logical base gates consisting of Clifford gates and Z-rotation gates from higher orders of the Clifford hierarchy. Although only higher order Z-rotations are included, they can be readily converted to other gates in the same order of the Clifford hierarchy by multiplying gates from lower orders. In particular, by multiplying Clifford gates, other gates of the same order are generated for the same cost. For example, $Z \cdot R_z(\pi/4) = R_z(5\pi/4)$ and $H \cdot R_z(\pi/4) \cdot H = R_x(\pi/4)$ up to global phase, where H is the Hadamard gate and Z is the Pauli-Z gate. These sets of logical base gates are compared with respect to the optimal resource costs resulting from gate synthesis for random target gates. Each set of Z-rotation gates from order $3 \le l \le 7$ of the Clifford hierarchy, denoted $T_l$, can be written as

$$T_l := \{ R_z(2\pi k/2^l) \mid k \in \{1, 3, \ldots, 2^l - 1\} \}, \qquad (2)$$

so that $k \in \{1, 3, 5, 7\}$ for $T_3$ and $k \in \{1, 3, \ldots, 13, 15\}$ for $T_4$. The five sets of logical base gates used in our analysis are then constructed as

$$\text{Set 1} := C_2 \cup T_3, \quad \text{Set 2} := \text{Set 1} \cup T_4, \quad \text{Set 3} := \text{Set 2} \cup T_5, \quad \text{Set 4} := \text{Set 3} \cup T_6, \quad \text{and} \quad \text{Set 5} := \text{Set 4} \cup T_7. \qquad (3)$$

Calculating precise resource costs of implementing each gate fault-tolerantly is an extensive task that would need to consider a variety of factors such as qubit count, circuit depth, magic state distillation methods and details of the error-correction implementation. As an approximation for the cost of these logical gates, we investigate two approaches of assigning costs to individual $T_l$ gates, where gates from $C_1$ and $C_2$ are assumed to be free since they can be implemented in a relatively straightforward way. The first approach associates the costs with the T-count, which is used as the standard metric for measuring the costs of gate sequences within the gate synthesis literature. This can be done by using the Z-rotation catalyst circuit shown in Fig. 1, which was introduced in [23] and presented in more detail in [22]. The circuit is similar to a synthillation parity-check circuit described in [25]. It utilises a $|T_l\rangle$ state and a small number of T gates to perform two $T_l$ gates on two different qubits while retaining the original $|T_l\rangle$ state. Costs can be calculated by recursively applying this circuit, assuming that all output gates at each recursive level are used (i.e. that no output gates are wasted). We calculate the costs using the Z-rotation catalyst approach in two ways. The first assumes that output $T_l$ gates are directly applied to target qubits. The recurrence relation for the T-counts using this method can be obtained as

$$\mathrm{Cost}[T_l] = \frac{4 + \mathrm{Cost}[T_{l-1}]}{2}, \qquad (4)$$

where $\mathrm{Cost}[T_3] = 1$. Solving this results in the average number of T gates required to implement a $T_l$ gate being expressed as

$$\mathrm{Cost}[T_l] = 4 - 3 \cdot 2^{3-l}, \qquad (5)$$

which is enumerated in Table 1a for $3 \le l \le 7$. The second method of calculating the T-count using the Z-rotation catalyst approach applies the $T_l$ gates to $|+\rangle$ states, creating corresponding intermediate $|T_l\rangle$ states, which are then consumed to implement the gates via teleportation circuits. The recurrence relation for these costs can be obtained as

$$\mathrm{Cost}[T_l] = \frac{4 + \mathrm{Cost}[T_{l-1}]}{2} + \frac{\mathrm{Cost}[T_{l-1}]}{2}, \qquad (6)$$

where $\mathrm{Cost}[T_3] = 1$, resulting in the expression

$$\mathrm{Cost}[T_l] = 2l - 5, \qquad (7)$$

which is enumerated in Table 1b for $3 \le l \le 7$. This second method is more expensive since the teleportation circuit that consumes the $|T_l\rangle$ state to implement the $T_l$ gate requires a $T_{l-1}$ correction gate to be applied 50% of the time. However, this method is more flexible in implementation since the outputted $|T_l\rangle$ states can be used at any time to implement $T_l$ gates onto any target qubits, enabling more options when instruction scheduling.

Table 1: The average number of T gates required to implement a single-qubit Z-rotation gate from order l of the Clifford hierarchy, $T_l$, using the Z-rotation catalyst approach. Parts (a) and (b) both list the average T-count per base gate.

Table 2: The average raw magic state count required for distillation and implementation of corresponding logical base gates, obtained from the supplementary materials of [24]. Each column contains the cost of distilling and implementing a logical Z-rotation gate from order l of the Clifford hierarchy, $T_l$, to below a gate error rate µ calculated using the diamond norm. The raw magic state physical level error is assumed to be 0.1%.
A realistic employment of the Z-rotation catalyst approach would likely benefit from a combination of both direct application of $T_l$ gates and application via their intermediate $|T_l\rangle$ states.

For the second approach of assigning resource costs, we use the average number of raw magic states to implement fault-tolerant $T_l$ gates from direct magic state distillation procedures. Resource costs have already been calculated for Y-rotation gates $R_y(2\pi/2^l)$ from the Clifford hierarchy by searching for optimal combinations of various distillation protocols with respect to target gate synthesis error rates [24]. For integer multiples $R_y(2\pi k/2^l)$, the distillation protocols can be performed identically, hence they can be assigned the same cost. To follow convention, the Y-rotation gates are converted to Z-rotation gates with the same cost using the relation $R_z(\theta) = H S^\dagger R_y(\theta) S H$, since H and $S := R_z(\pi/2)$ have zero cost due to being elements of $C_2$. These resource costs vary between orders of the Clifford hierarchy and are shown in Table 2.
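The T-counts of the Z-rotation catalyst approach can be tabulated from the circuit description alone: one run costs four T gates plus one gate of the next lower order and yields two output gates, and the intermediate-state variant pays an extra half lower-order gate per output gate. The following is a minimal Python sketch under those stated costs. The function name `catalyst_t_count` and its flag are hypothetical, and the sketch assumes that the same costing method is used at every level of the recursion.

```python
from fractions import Fraction


def catalyst_t_count(order, via_magic_states=False):
    """Average T-count of one order-`order` Z-rotation gate (order >= 3).

    Assumes the base case Cost[T_3] = 1 and that no output gate is wasted.
    """
    cost = Fraction(1)  # Cost[T_3] = 1
    for _ in range(4, order + 1):
        # One circuit run: 4 T gates plus one lower-order gate, two outputs.
        per_gate = (4 + cost) / 2
        if via_magic_states:
            # Teleportation needs a lower-order correction gate half the time.
            per_gate += cost / 2
        cost = per_gate
    return cost


if __name__ == "__main__":
    for l in range(3, 8):
        direct = catalyst_t_count(l)
        teleported = catalyst_t_count(l, via_magic_states=True)
        print(l, float(direct), float(teleported))
```

Exact fractions are used so that the halving at each recursive level does not accumulate rounding error. The direct-application cost approaches 4 as the order grows, whereas the intermediate-state cost grows linearly with the order.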

Sequence Generation Algorithm
In this section, a sequence generation algorithm, based on Dijkstra's algorithm, is developed that generates a database of all cost-optimal single-qubit gate sequences below some maximum cost using arbitrary sets of universal base gates which have individually assigned cost values. We use this algorithm to help study the average cost of cost-optimal gate synthesis when including Z-rotation gates from higher orders of the Clifford hierarchy as base gates. Due to the flexibility of this algorithm, it could be used as a subroutine within other synthesis algorithms. For example, it could be used as the base approximation step within the SK algorithm, enabling the SK algorithm to consider individual base gate costs when synthesising target gates. The sequence generation algorithm explores the space of sequence configurations using a tree expansion as shown in Figure 2, where each node corresponds to a gate and each path from the root node to any other node corresponds to a sequence of gates. Let $B_n$ be an element of $PSU(2)$ corresponding to the base gate of node n in the sequence tree. A combined gate $S_n$ of node n is calculated by multiplying all nodes within the branch from the root down to n, i.e. $S_n := B_{n_0} \cdot B_{n_1} \cdots B_{n_k}$, where $n_i$ is the i-th node from the root node such that $n_0$ is the root and $n_k$ is node n. The Lie algebra generator of $S_n$ in the Pauli basis is of the form of a vector $\alpha_n X + \beta_n Y + \gamma_n Z$ with real coefficients and can be written as $(\alpha_n, \beta_n, \gamma_n)$. Each vector represents a point in a ball of radius π/2 over the Pauli bases X, Y and Z. Thus each point within the ball is a geometrical location corresponding to a single-qubit gate.
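The mapping from a combined gate to its vector in the ball of radius π/2 can be sketched as follows. This is an illustrative Python sketch, not the implementation used in this work: the function name `generator_vector` is hypothetical, and it assumes the sign convention that the gate equals exp(i(αX + βY + γZ)) up to global phase.

```python
import cmath
import math


def generator_vector(u):
    """Return (alpha, beta, gamma) for a 2x2 unitary u given as nested tuples.

    Assumed convention: u = exp(i (alpha X + beta Y + gamma Z)) up to phase.
    """
    det = u[0][0] * u[1][1] - u[0][1] * u[1][0]
    s = cmath.sqrt(det)                  # dividing by s removes the global phase
    a, b = u[0][0] / s, u[1][0] / s      # first column fixes an SU(2) element
    w, z = a.real, a.imag                # cos(theta), n_z sin(theta)
    x, y = b.imag, -b.real               # n_x sin(theta), n_y sin(theta)
    if w < 0:                            # +u and -u are the same gate in PSU(2)
        w, x, y, z = -w, -x, -y, -z
    norm = math.sqrt(x * x + y * y + z * z)
    if norm < 1e-12:
        return (0.0, 0.0, 0.0)           # identity gate sits at the centre
    theta = math.atan2(norm, w)          # lies in [0, pi/2] because w >= 0
    return (theta * x / norm, theta * y / norm, theta * z / norm)


if __name__ == "__main__":
    t_gate = ((1, 0), (0, cmath.exp(0.25j * math.pi)))
    print(generator_vector(t_gate))
```

Gates whose vector has length exactly π/2, such as the Hadamard gate, lie on the boundary of the ball, where antipodal points describe the same gate.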
The pseudocode for the algorithm is shown in Algorithm 1. It works by expanding nodes in a sequence tree (see Figure 2). All leaf (end) nodes of the sequence tree are stored in a minimum heap data structure which sorts the leaf nodes based on their corresponding sequence cost in increasing order. This determines the order of nodes to expand. The tree begins as a single identity gate at the root node, which is added as the first element to the leaf node heap. At each iteration, the leaf node with the lowest sequence cost, i, is taken from the heap, which for the first iteration would be the identity gate node. The vector $(\alpha_i, \beta_i, \gamma_i)$ is calculated from the combined gate of the corresponding node's sequence. Before expanding a node in the sequence tree, we check whether another node with the same combined gate vector has already been expanded, using a hashset data structure. If the vector exists in the hashset, then the node is removed from the sequence tree and the algorithm proceeds to the next iteration. This repeats until a unique vector is found. When such a vector is found, it is added to the hashset for uniqueness checking in further iterations and the corresponding node in the sequence tree is expanded by generating a child node for each base gate. Each of these child nodes is added to the leaf node heap. To save computation time, adding a child node to the sequence tree and the heap can be limited to when its corresponding vector is unique. Since vectors of sequences with lower costs are always added to the hashset before those with higher costs, the hashset must only contain vectors corresponding to sequences with the lowest cost among all sequences that produce equivalent combined gates. Thus, whenever a vector is successfully added to the hashset, the corresponding sequence must be cost-optimal.
The cost-optimal vector and sequence pair can be stored in a data structure such as a k-d tree, which can be used to approximate target gates by geometrically searching for nearest neighbours in the space of vectors.

Figure 2: An example of a sequence tree used to relate logical base gates, gate sequences and combined gates for the sequence generation algorithm. A node n corresponds to a single-qubit base gate $B_n$ and the root node corresponds to the identity gate $B_0 = I$. A gate sequence corresponding to n is the sequence of logical base gates along the path from $B_0$ to $B_n$. A combined gate $S_n$ is calculated by multiplying all logical base gates within the gate sequence in sequence order. In this example, $B_1$, $B_2$ and $B_3$ are logical base gates. In the sequence generation algorithm, the leaf node with the lowest sequence cost is expanded by adding a child node as a new leaf node for each gate in the set of logical base gates. All non-leaf nodes of the tree correspond to cost-optimal sequences and they can be thought of as the cost-optimal sequence database generated by the algorithm. Although all leaf nodes are depicted to be at the same depth in the tree, this is not always the case. At any point during the sequence generation algorithm, a path of relatively expensive logical base gates may be much shorter than a path of relatively cheap gates.

Algorithm 1 Cost-optimal sequence generation
procedure GenerateSequences(baseGates, maxCost)
    sequenceDatabase ← new KdTree⟨Node⟩    ▷ To store the cost-optimal sequences geometrically
    sequenceTree ← new Tree⟨Node⟩    ▷ To relate nodes, sequences and combined gates
    sequenceTree.SetRoot(Identity gate)    ▷ Set the root node to the identity gate
    sortedLeafNodes ← new MinHeap⟨Node⟩    ▷ To order sequence tree leaf nodes based on sequence cost
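The expansion procedure described for Algorithm 1 can be sketched in Python as below. This is a simplified illustration rather than the implementation used in this work. It assumes a dictionary in place of the k-d tree and sequence tree, a rounded phase-normalised key in place of the generator vector, a small generating set H, S, T with illustrative costs 0, 0, 1, and a maximum cost that is inclusive. All function and variable names are hypothetical.

```python
import cmath
import heapq
import itertools
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def psu2_key(u, digits=6):
    """Hashable key that identifies a gate up to global phase."""
    s = cmath.sqrt(u[0][0] * u[1][1] - u[0][1] * u[1][0])
    a, b = u[0][0] / s, u[1][0] / s          # first column of the SU(2) form
    comps = [a.real, a.imag, b.real, b.imag]
    for x in comps:                          # fix the remaining +/- ambiguity
        if abs(x) > 1e-9:
            if x < 0:
                comps = [-c for c in comps]
            break
    return tuple(round(c, digits) + 0.0 for c in comps)


def generate_sequences(base_gates, max_cost):
    """Map each distinct gate key to its (lowest cost, sequence of gate names)."""
    counter = itertools.count()              # tie-breaker for equal costs
    heap = [(0.0, next(counter), (), ((1, 0), (0, 1)))]  # root: identity gate
    database = {}
    while heap:
        cost, _, seq, gate = heapq.heappop(heap)   # cheapest leaf node first
        key = psu2_key(gate)
        if key in database:                  # gate already reached more cheaply
            continue
        database[key] = (cost, seq)          # first visit is cost-optimal
        for name, (matrix, gate_cost) in base_gates.items():
            child_cost = cost + gate_cost
            if child_cost <= max_cost:
                child = matmul(gate, matrix)
                heapq.heappush(heap, (child_cost, next(counter), seq + (name,), child))
    return database


SQ = 1 / math.sqrt(2)
BASE_GATES = {
    "H": (((SQ, SQ), (SQ, -SQ)), 0.0),
    "S": (((1, 0), (0, 1j)), 0.0),
    "T": (((1, 0), (0, cmath.exp(0.25j * math.pi))), 1.0),
}

if __name__ == "__main__":
    print(len(generate_sequences(BASE_GATES, 2.0)))
```

Because the heap always yields the cheapest unexpanded leaf node, the first time a key is recorded its sequence has minimal cost, which mirrors the hashset argument given above. Other cost assignments, such as per-order costs for higher-order Z-rotations, can be supplied by adding entries to the dictionary.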
There is a notable further optimisation that could be implemented into Algorithm 1. During the procedure, all non-leaf nodes within the sequence tree correspond to cost-optimal sequences with unique combined gate vectors, that is, each path starting at the root node and ending at any non-leaf node is a shortest path to the sequence's unique combined gate. To see how this could be helpful, first assume that an existing sequence tree needs to grow to a new maximum cost, such that the leaf nodes need to expand multiple times along the same branch. Instead of searching through every combination of base gates as children for a leaf node, the sequence tree itself can be used as a sieve by iterating child nodes from the root that are known to be shortest paths. The tree already contains optimal paths up to a certain depth, so this information could be used to help avoid the tree branches expanding in directions that produce nonoptimal paths to unique combined gates.
In Algorithm 1, cost-optimal sequences and their corresponding vectors are stored in a k-d tree which uses the Euclidean distance on the vectors to organise the data. Due to the periodic nature of the vectors, there is a small chance of failure in the k-d tree when searching for nearest neighbours to points close to the boundary. With computational overhead, the k-d tree may be modified to help overcome this [26], or a more appropriate data structure such as a vantage point tree [27,28] may be used instead. In general, further alternative data structures may be used such as the geometric nearest-neighbour access tree [29].

Synthesis Results
Figure 4: Cost-optimal sequence costs averaged over 5000 random target gates with respect to target gate synthesis logical error rates $\epsilon$. The logical base gates used are specified in Eq. 3 with cost values (shown in Table 2) assigned as the average number of raw magic states required to distil and implement them to below a specified logical gate error. The synthesis logical errors are calculated using the trace distance (shown in Equation 8). Corresponding linear best fit values are shown in Table 4. The pattern of the data about the lines of best fit for each logical base gate set is similar between plots because, for each of the logical base gate errors, the ratios of the base gate cost values between orders of the Clifford hierarchy are similar, hence the cost-optimal sequences will be comparable. Panel (a): sequences with below $\mu = 10^{-5}$ logical base gate error.

Algorithm 1 was computed using the sets of logical base gates described in Eq. 3 with the assignment of costs obtained from the two approaches of implementing base gates, where values are shown in Tables 1 and 2. A database was generated in the form of a k-d tree of cost-optimal sequences up to some chosen maximum sequence cost. The sequences were organised in the k-d tree with respect to the vectors corresponding to their combined gates. For a given target gate G, gate synthesis was performed by searching for the lowest cost sequence among all nearest neighbours of G up to a chosen synthesis error (distance), $\epsilon$, between their combined gates and G. The errors were computed using the trace distance (Eq. 8) between S and G, where S is a combined gate and G is the target gate. If such a sequence did not exist, then the database was further generated to a higher cost and the process was repeated until a sequence was found. Incrementally generating the cost-optimal sequence database in this manner helps avoid over-generation.
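The search step can be illustrated with a brute-force scan in place of the k-d tree. The Python sketch below is a hedged example only: the names `distance`, `synthesise` and `rz` are hypothetical, the database is a hand-made two-entry list, and the phase-invariant distance sqrt(1 − |tr(S†G)|/2) is a form commonly used in the synthesis literature that is assumed here and may differ from the definition used in this work.

```python
import cmath
import math


def distance(s, g):
    """Assumed phase-invariant distance sqrt(1 - |tr(S^dagger G)| / 2)."""
    tr = sum(s[i][j].conjugate() * g[i][j] for i in range(2) for j in range(2))
    return math.sqrt(max(0.0, 1.0 - abs(tr) / 2.0))


def synthesise(database, target, epsilon):
    """Return the lowest-cost (cost, sequence) within epsilon of target, else None."""
    best = None
    for cost, sequence, gate in database:
        if distance(gate, target) <= epsilon and (best is None or cost < best[0]):
            best = (cost, sequence)
    return best


def rz(theta):
    """Z-rotation by angle theta as a 2x2 matrix of complex entries."""
    return ((cmath.exp(-0.5j * theta), 0j), (0j, cmath.exp(0.5j * theta)))


# Hand-made (cost, sequence, combined gate) entries, for illustration only.
DATABASE = [(0, "I", rz(0.0)), (1, "T", rz(math.pi / 4))]
TARGET = rz(math.pi / 4 + 0.02)

if __name__ == "__main__":
    for eps in (0.3, 0.01, 0.001):
        print(eps, synthesise(DATABASE, TARGET, eps))
```

A return value of `None` corresponds to the case where the database must be generated to a higher maximum cost before the search is repeated.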
For each set of base gates with individual costs calculated for each approach of implementing them, gate synthesis was performed on 5000 random target gates sampled from a uniform distribution for a variety of synthesis error rates (calculated using Eq. 8 with respect to the sequences' combined gates). Cost-optimal sequence T -counts calculated using the Z-rotation catalyst circuit approach for the two methods of assigning base gate costs are plotted against synthesised target gate error rates for each set of base gates in Figure 3. The corresponding linear best fit values for each set of logical base gates and corresponding cost values are shown in Table 3. We can compare the scaling factors of the fits between different sets of logical base gates to estimate changes in average sequence costs as the synthesis error approaches zero. For the Z-rotation catalyst circuit method that assumes all output gates are directly applied to target qubits (as opposed to using intermediate magic states), we find cost savings relative to Set 1 of 34 ± 3%, 42 ± 2%, 49 ± 2%, and 54 ± 3% for Set 2 , Set 3 , Set 4 , and Set 5 respectively, where uncertainties correspond to 95% confidence intervals. Data for a Set 6 that includes T 8 gates was also calculated, however no noticeable improvement was found with sequence cost values being almost identical to Set 5 resulting in a cost saving of 52 ± 3% relative to Set 1 . For the Z-rotation catalyst circuit method that assumes all output gates are applied to |+ states forming intermediate magic states before consuming them to perform the corresponding Z-rotation gate, we find cost savings relative to Set 1 of 29 ± 3%, 31 ± 3%, 31 ± 4%, and 31 ± 4% for Set 2 , Set 3 , Set 4 , and Set 5 respectively. These results show that if gate synthesis includes higher order Clifford hierarchy Z-rotation gates as base gates implemented using the Z-rotation catalyst approach, then a T -count saving of over 50% could potentially be achieved. 
Cost-optimal sequence raw magic state counts calculated using direct base gate distillation and implementation procedures are plotted against synthesised target error rates for each combination of base gates and cost values in Figure 4. Each of the four plots corresponds to a different resource cost of distilling and implementing the logical base gates, with corresponding logical errors $\mu = 10^{-5}$, $10^{-10}$, $10^{-15}$ and $10^{-20}$ calculated using the diamond norm. The corresponding linear best fit values for each set of logical base gates are shown in Table 4 and corresponding cost values are shown in Table 2 (the physical error rate is assumed to be 0.1% in all calculations). The patterns of the data about their lines of best fit for each base gate set are similar between plots. This is because, for each of the logical base gate errors, the ratios of the logical base gate cost values between orders of the Clifford hierarchy are similar, hence the cost-optimal sequences will be comparable. For logical base gate errors $\mu = 10^{-5}$, $10^{-10}$, $10^{-15}$ and $10^{-20}$, we find that Set 2 provides 23 ± 3%, 27 ± 2%, 30 ± 2% and 26 ± 3% reductions in scaling factor respectively compared to Set 1. For $\mu = 10^{-10}$ and $10^{-15}$, we find that Set 3 provides 30 ± 3% and 33 ± 2% reductions in scaling factor respectively compared to Set 1, which are both approximately a further 3% saving compared to Set 2. No further improvements are noticeable in our data for these assignments of cost values. These results show that for any error-correction scheme with distillation costs assigned according to Table 2, using Set 2 (which includes $T_4$ as logical base gates) instead of the standard Set 1 reduces the average resource cost scaling factor with respect to the synthesis negative log-error, $\log(\epsilon^{-1})$, by up to 30%. Additionally, Set 3 can provide up to a further 3% reduction when compared to Set 2.
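The comparison of scaling factors can be illustrated with a short sketch. It fits average sequence cost against the negative log-error by ordinary least squares and reports the relative reduction in slope between two base gate sets. The function names and the data points are hypothetical and made up purely for illustration; they are not the values in Tables 3 and 4. Because only the ratio of slopes is used, the base of the logarithm does not affect the result.

```python
import math


def fit_slope(errors, costs):
    """Least-squares slope of cost against log(1/eps)."""
    xs = [math.log(1.0 / e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, costs))
    return sxy / sxx


def relative_saving(slope_reference, slope_new):
    """Fractional reduction in scaling factor relative to the reference set."""
    return 1.0 - slope_new / slope_reference


# Made-up, exactly linear data: slopes of 10 and 8 cost units per unit log-error.
errors = [1e-1, 1e-2, 1e-3, 1e-4]
costs_reference = [10.0 * math.log(1.0 / e) + 5.0 for e in errors]
costs_extended = [8.0 * math.log(1.0 / e) + 6.0 for e in errors]

saving = relative_saving(fit_slope(errors, costs_reference),
                         fit_slope(errors, costs_extended))
```

The constant offsets of the fits do not enter the saving, which is why the comparison describes the behaviour as the synthesis error approaches zero rather than at any single error value.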
Each method of assigning individual base gate costs used in this work indicates that the resource requirements of synthesis algorithms may be considerably improved by including higher orders of the Clifford hierarchy as logical base gates and by optimising with respect to the individual costs of implementing them.

Table 3: Linear best fits with a confidence level of 95% for cost-optimal sequence costs averaged over random target logical gates with respect to the negative log-error, $\log(\epsilon^{-1})$, for target gate synthesis calculated using the trace distance (shown in Equation 8). The sequences are constructed using logical base gates with cost values assigned according to Table 1. The corresponding plots are shown in Figure 3.

Table 4: Linear best fits with a confidence level of 95% for cost-optimal sequence costs averaged over random target logical gates with respect to the negative log-error, $\log(\epsilon^{-1})$, for target gate synthesis calculated using the trace distance (shown in Equation 8). The sequences are constructed using logical base gates with cost values assigned according to Table 2. The corresponding plots are shown in Figure 4.

Modelling Gate Proportions
For the raw magic state approach of implementing base gates and the Z-rotation catalyst circuit method that uses intermediate magic states, the logical base gate sets Set 3, Set 4 and Set 5 (see Eq. 3) were shown to provide only marginal resource savings for gate synthesis when compared with Set 2 (see Figs. 3b and 4), even though the sets contain many more logical base gates. To investigate this behaviour we develop a model in Appendix A for determining the proportion of sets of gates among all $T_l$ gates, where $l \ge 3$, within cost-optimal sequences approximating random target gates with specified gate costs. The proportions can provide insight into how the average sequence cost changes with respect to which $T_l$ base gates are included as logical base gates and what cost values are assigned. For logical base gates with non-zero proportion within sequences approximating target gates, we expect that increasing their cost will decrease their recalculated proportion and increase the average cost of these sequences. Furthermore, excluding a set of logical base gates with a relatively small proportion would increase the average sequence cost only slightly, compared with excluding a set with a larger proportion. The model estimates the average proportion, $p_n$, of $T_n$ logical base gates among all $T_l$ gates, where $l \ge 3$, from within cost-optimal sequences approximating random target gates to within sufficiently small synthesis errors $\epsilon$. The construction is based on a unique canonical form [16] for sequences of logical base gates, in which $c$ and $c'$ are Clifford gates, $H$ is the Hadamard gate, $t_m$ is the $m$th positioned Z-rotation gate from order three and above of the Clifford hierarchy, and $M$ is the total number of $t_m$ gates in the sequence.
This canonical form has the property that arbitrary gate sequences with distinct combined gates, where the sequences can consist of logical base gates from the Clifford gates and Z-rotations from orders three and above of the Clifford hierarchy, can be reduced to distinct sequences of this form. The gate proportion for $T_n$, denoted $p_n$, can be calculated by averaging the $T_n$ logical gate count over all possible sequences in this canonical form that are below a chosen maximum cost $C$ (as detailed in Appendix A). That is,

$$p_n = \frac{\sum_{\mathbf{k}} k_n \prod_{l=3}^{L} \binom{\sum_{j=l}^{L} k_j}{k_l} |T_l|^{k_l}}{\sum_{\mathbf{k}} \left(\sum_{l=3}^{L} k_l\right) \prod_{l=3}^{L} \binom{\sum_{j=l}^{L} k_j}{k_l} |T_l|^{k_l}},$$

where each sum runs over all sets of gate counts $\mathbf{k} = \{k_3, \ldots, k_L\}$ satisfying $\sum_{j=3}^{L} c_j k_j \le C$, $c_j$ is the logical base gate implementation cost for $T_j$, $k_l$ is the number of $T_l$ gates within a particular sequence, $|T_l|$ is the number of gates within $T_l$, and $L$ is the highest order of the Clifford hierarchy from which Z-rotation gates are included. This calculation outputs values closely matching proportion results obtained using the sequence generation algorithm for random target gates, as shown in Figure 5. Figure 5a shows the summed proportions of all $T_4$ gates among $T_3 \cup T_4$ gates over a variety of $T_4$ cost values for sequences consisting of Set 2 logical base gates. Figure 5b shows the summed proportions of all $T_5$ gates among $T_3 \cup T_4 \cup T_5$ gates over a variety of $T_5$ cost values for sequences consisting of Set 3 logical base gates. The other logical base gate costs are assigned values according to their distillation and implementation cost with a maximum logical base gate error of $\mu = 10^{-15}$, as shown in Table 2. These results suggest that increasing the logical base gate implementation cost of a set $T_n$ drastically lowers the proportion of its gates found within the database of cost-optimal sequences. The set thus becomes less effective at reducing the average cost-optimal sequence cost, since its gates are included within sequences less often. Evaluating the model is a simpler calculation than actually performing gate synthesis for many random target gates.
The gate set proportions appear to indicate how useful a gate subset is among the rest of the base gates. We suspect that, with further research, they could be used to provide a quick approximation for how much the average synthesis cost reduces when a base gate subset with specified cost values is included.

Figure 5: Gate proportions obtained from the sequence generation algorithm compared with those calculated using our model. The sequence generation algorithm outputs cost-optimal sequences approximating random target gates to within $\epsilon = 0.03$ synthesis logical gate error under the trace distance (see Eq. 8), while the model outputs the proportion of a set of logical base gates within the space of all cost-optimal sequences below a maximum cost that produce distinct combined gates. Clifford gates are ignored in the calculations since they are assumed to have zero cost. Both plots show that the model data closely fit the corresponding results from the sequence generation algorithm. The data show that increasing the logical base gate distillation and implementation cost of a particular set $T_n$ drastically lowers the proportion of its gates found within the generated cost-optimal sequences. Thus the set $T_n$ with increased costs becomes less effective at reducing the average cost-optimal sequence cost, since its gates are found less frequently within the sequences. Logical base gate costs are assigned according to Table 2 with a logical base gate error of $\mu = 10^{-15}$ calculated using the diamond norm. The red, green and blue vertical lines (ordered left to right) indicate the logical base gate distillation and implementation costs for $T_3$, $T_4$ and $T_5$ respectively. (a) The summed proportions of $T_4$ logical base gates among $T_3 \cup T_4$ gates for cost-optimal sequences consisting of Set 2 logical base gates. The costs of logical base gates from $T_3$ are fixed while the cost of $T_4$ gates varies. (b) The summed proportions of $T_5$ logical base gates among $T_3 \cup T_4 \cup T_5$ gates for cost-optimal sequences consisting of Set 3 logical base gates. The costs of logical base gates from $T_3 \cup T_4$ are fixed while the cost of $T_5$ gates varies.

Discussion
We investigated the cost of sequences produced by cost-optimal single-qubit gate synthesis using logical base gates from a combination of Clifford gates and Z-rotation gates from higher orders of the Clifford hierarchy. An algorithm, based on Dijkstra's algorithm, was used to generate a database of cost-optimal sequences from arbitrary single-qubit universal sets of logical base gates with individually assigned costs. As base gates, combinations of Clifford gates and Z-rotation gates from various orders of the Clifford hierarchy were used with two approaches of implementing them. The first uses a recursively applied Z-rotation catalyst circuit that utilises a temporary ancilla qubit, a small number of T gates and a Z-rotation state to apply two Z-rotation gates of the same angle on two separate qubits while retaining the original Z-rotation state. We calculate average T-count costs for this approach using the following two methods: all output gates of the catalyst circuits are applied directly to target qubits; and each output gate is first applied to a $|+\rangle$ state to form an intermediate magic state, which is then consumed to implement the corresponding gate via a teleportation circuit. The second approach of implementing base gates is through magic state distillation and implementation circuits, for which costs are assigned as the average number of raw magic states used to implement the gates in error-correction codes up to specified logical error rates. After assigning base gate costs using each method, gate synthesis was performed by finding nearest neighbours within the database of cost-optimal sequences in the Pauli vector space corresponding to combined gates of sequences.
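A Dijkstra-style generation of a cost-optimal sequence database can be sketched as follows. This is an illustrative toy rather than the algorithm used in this work: the base gate set (H and S at zero cost, T at unit cost), the cost values, and the helper names `matmul`, `phase_key` and `generate_database` are all assumptions made for the example. Combined gates are identified up to global phase by normalising the first non-zero matrix entry to be real and positive and then rounding, which is adequate only for short sequences.

```python
import cmath
import heapq
import itertools
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def phase_key(u, digits=6):
    """Hashable key that is equal for matrices differing only by a global phase."""
    flat = [u[0][0], u[0][1], u[1][0], u[1][1]]
    pivot = next(z for z in flat if abs(z) > 1e-9)
    phase = pivot / abs(pivot)
    return tuple(
        (round((z / phase).real, digits), round((z / phase).imag, digits))
        for z in flat
    )


def generate_database(base_gates, max_cost):
    """Map each distinct combined gate to its (lowest cost, sequence of names).

    base_gates: dict of name -> (matrix, non-negative cost).
    """
    identity = ((1, 0), (0, 1))
    best = {}
    tie = itertools.count()  # tie-breaker so the heap never compares matrices
    heap = [(0, next(tie), (), identity)]
    while heap:
        cost, _, seq, u = heapq.heappop(heap)
        key = phase_key(u)
        if key in best:  # already reached at an equal or lower cost
            continue
        best[key] = (cost, seq)
        for name, (gate, gate_cost) in base_gates.items():
            new_cost = cost + gate_cost
            if new_cost <= max_cost:
                heapq.heappush(
                    heap, (new_cost, next(tie), seq + (name,), matmul(gate, u))
                )
    return best


r = 1 / math.sqrt(2)
BASE_GATES = {  # illustrative costs: Clifford gates free, T costs one unit
    "H": (((r, r), (r, -r)), 0),
    "S": (((1, 0), (0, 1j)), 0),
    "T": (((1, 0), (0, cmath.exp(1j * math.pi / 4))), 1),
}

clifford_only = generate_database(BASE_GATES, max_cost=0)
up_to_one_t = generate_database(BASE_GATES, max_cost=1)
```

With a maximum cost of zero only the zero-cost Clifford gates are reachable, so the database holds the 24 single-qubit Clifford gates modulo global phase. Individual gate costs enter only through the priority queue ordering, so any non-negative cost assignment can be substituted for the one used here.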
Using the Z-rotation catalyst approach with directly applied output gates to assign gate costs, we found that including the higher order Clifford hierarchy Z-rotation gates along with the standard Clifford+T set of base gates reduced the synthesis cost compared with using only the Clifford+T base gate set. The average cost-optimal sequence T-counts reduced by 34 ± 3%, 42 ± 2%, 49 ± 2%, and 54 ± 3% for the cumulative inclusion of the fourth, fifth, sixth, and seventh orders respectively. When using the same approach but with all output gates being applied via intermediate magic states, the average cost-optimal sequence T-counts reduced by 29 ± 3%, 31 ± 3%, 31 ± 4%, and 31 ± 4% for the cumulative inclusion of the fourth, fifth, sixth, and seventh orders respectively. Each average T-count calculated using the catalyst circuit approach assumes that every output gate at every recursive level of the circuit is used, such that no output gates are wasted. The procedure also assumes that there are sufficient numbers of ancilla qubits and Z-rotation catalyst states for smooth implementation of the gate sequences resulting from synthesis. A realistic deployment of the approach would likely use a combination of direct application of output gates and the use of intermediate magic states. This is because direct application is cheaper with respect to T-count, whereas the intermediate magic states make the implementation more flexible, since they can be consumed at any time to implement the corresponding gate on any target qubit. Nevertheless, these results show that there is potential for the average T-count to decrease by over 50% when performing gate synthesis with higher order Clifford hierarchy Z-rotation base gates that are implemented using this approach, when compared to cost-optimal synthesis using only the Clifford+T base gate set.
By instead using the magic state distillation approach with base gate costs assigned as the number of raw magic states, we found that including the fourth order Z-rotation gates from the Clifford hierarchy along with the standard Clifford+T gate set decreased the average cost-optimal sequence costs by up to 30 ± 2%. We observe a reduction of up to 33 ± 2% when additionally including the Z-rotation gates from the fifth order. No noticeable improvement is observed when additionally including higher order Z-rotation base gates up to the seventh order. Although these savings are not quite as large as what may be possible with the Z-rotation catalyst approach, the magic state distillation approach does not require an accessible collection of Z-rotation catalyst states to be stored throughout the computation. The implementation circuit for the distilled Z-rotation magic state does require the application of a double-angle Z-rotation gate as a correction 50% of the time. However, this correction gate can ideally be generated as it is required, so that every possible rotation angle does not need to be stored in advance. Also, the number of raw magic states is only a rough approximation for the actual resource costs of implementation. A precise calculation would be an extensive task that considers a variety of factors such as qubit count, circuit depth, magic state distillation cost and details of the error-correcting implementation.
We investigated the lack of further improvement found when including Z-rotation gates from higher than the fourth order of the Clifford hierarchy when using the direct magic state distillation approach and the Z-rotation catalyst circuit approach with output gates being applied via intermediate magic states. A model was developed that estimates the proportion of logical base gates within sequences approximating random target gates. This model assumes that Z-rotation gates from orders three and above of the Clifford hierarchy have equal proportions when assigned equal cost values, that is, the gate operations are equally useful for approximating random target gates for the purposes of gate synthesis. The proportion estimates were shown to closely fit the data obtained using the sequence generation algorithm on random target gates. This suggests that the lack of observed cost reduction when using higher order logical base gates is due to there being far fewer of them, at their assigned costs, within all cost-optimal sequences generated up to the chosen maximum sequence cost. Thus the frequency with which these base gates are used for synthesis of random target gates is low, leading to a low level of influence over the average resource costs overall. The model provides a simple method, without needing to generate the full database of sequences, for estimating these gate proportions with each order of the Clifford hierarchy being assigned individual cost values.

Figure A.1: This plot indicates that the logical base gates are almost equivalently useful in approximating random target gates using cost-optimal gate synthesis.
The assumption that logical base gates assigned equal costs have equal proportions is supported by the data in Figure A.1. The figure shows that when each logical base gate is given an equal cost, the sequence generation algorithm generates a database of gate sequences in which each gate has approximately the same proportion, with the proportions slowly decreasing for increasing order. We do not expect these proportions to change significantly for larger sequence costs (or smaller synthesis error thresholds $\epsilon$) since the logical base gate proportions are approximately constant for sufficiently large maximum sequence costs. This can be seen in Fig. A.2 for the case of $T_5$ logical base gates from within Set 3 generated by the sequence generation algorithm for random target gates.

Assume we have a database of cost-optimal gate sequences that have been generated up to a chosen maximum cost with individually assigned implementation costs for each set of logical base gates $T_l$, where $l \ge 3$. We will calculate the proportion of $T_n$ gates among all sequences within the database. For simplicity, let logical gates from any set $T_l$ for $l \ge 3$ be called $t$ gates. Using a unique canonical form [16] for sequences consisting of the Clifford gates and combinations of $T_l$ gates, arbitrary gate sequences can be reduced to the form of Eq. 11, where $c$ and $c'$ are Clifford gates, $t_m$ is the $m$th positioned $t$ gate in the sequence, and $M$ is the $t$-count. For a particular sequence, let the number of $t$ gates from $T_l$ be denoted by $k_l$. It follows that each sequence consisting of gates from up to order $L$ of the Clifford hierarchy satisfies (noting that Clifford gates are assumed to have zero cost)

$$\sum_{l=3}^{L} c_l k_l \le C, \tag{12}$$

where $c_l$ is the cost assigned to logical gates from $T_l$ and $C$ is the maximum cost of the database of gate sequences. It will be useful to denote the number of $t$ gates from order $l$ to $L$ of the Clifford hierarchy as

$$K_l := \sum_{j=l}^{L} k_j,$$

noting that $K_3$ is the $t$-count, $M$, that appears in Eq. 11. The aim is to calculate the proportion of $T_n$ gates among all gates in sequences within the database.
Figure A.2: The proportion of $T_5$ logical base gates among $T_3 \cup T_4 \cup T_5$ gates calculated using the combinatorial model for all cost-optimal sequences below a maximum sequence cost that produce distinct combined gates. The logical base gate cost values are assigned according to Table 2 for a logical base gate error threshold of $\mu = 10^{-15}$ under the diamond norm. This plot shows that the proportion of $T_5$ gates becomes approximately constant for sufficiently large maximum sequence costs.

We begin by counting the total number of possible sequences that can be formed given a set of $t$ gate counts $\{k_l\}|_{3}^{L}$. Then the total number of possible sequences can be summed by iterating through every combination of possible sets $\{k_l\}|_{3}^{L}$ that satisfy Eq. 12 with their assigned base gate costs. Once this expression is determined, it can be extended to calculate the number of $T_n$ gates and the total number of gates, which can then be used to calculate the proportions. For sequences of $t$-count $K_3$, the number of permutations of $k_l$ gates within $K_3$ gate locations is

$$\#\text{Permutations}(k_l, K_3) := \binom{K_3}{k_l} = \frac{K_3!}{k_l!\,(K_3 - k_l)!}.$$

Let $|T_l|$ be the number of distinct Z-rotation gates within order $l$ of the Clifford hierarchy; for example, $|T_3| = 2$ since $T_3 = \{T, T^\dagger\}$ (up to global phase). Then for each permutation, there are $|T_l|^{k_l}$ unique combinations of assigned $T_l$ logical base gates within the permutation. Thus, the total number of configurations for $k_l$ gate locations with $|T_l|$ variations in a sequence of $t$-count $K_3$ is

$$\gamma(k_l, |T_l|, K_3) := \binom{K_3}{k_l} |T_l|^{k_l}.$$

After assigning gates to $k_l$ locations, there are $K_3 - k_l$ locations remaining within the sequence. The strategy from here is to iteratively count the total number of configurations from $l = 3$ to $L$ by updating the number of remaining locations at each step, which updates as $K_{l+1} = K_l - k_l$.
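The configuration count for a single order can be checked against explicit enumeration. The sketch below, with hypothetical function names `gamma` and `brute_force_count`, places $k$ gates that each take one of a given number of variants into $K$ locations and compares the closed-form count with a direct enumeration over all ways of filling the locations.

```python
from itertools import product
from math import comb


def gamma(k, size, K):
    """Configurations of k gates, each one of `size` variants, in K locations."""
    return comb(K, k) * size ** k


def brute_force_count(k, size, K):
    """Count the same configurations by explicit enumeration.

    Each location is either left free (None) or holds a variant index
    0..size-1; a configuration is valid when exactly k locations are filled.
    """
    options = [None] + list(range(size))
    return sum(
        1
        for assignment in product(options, repeat=K)
        if sum(slot is not None for slot in assignment) == k
    )


# Two gates with two variants each (as for |T_3| = 2) in four locations.
closed_form = gamma(2, 2, 4)
enumerated = brute_force_count(2, 2, 4)
```

Enumeration grows exponentially with the number of locations, so it is useful only as a check on small cases; the closed form is what makes the model cheap to evaluate.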
So for the second iteration, the number of configurations of $k_{l+1}$ gates with $|T_{l+1}|$ variations within the remaining $K_{l+1}$ locations of a given configuration of the assigned $k_l$ gates from $T_l$ is $\gamma(k_{l+1}, |T_{l+1}|, K_{l+1})$, leading to a total of $\gamma(k_l, |T_l|, K_l)\,\gamma(k_{l+1}, |T_{l+1}|, K_{l+1})$ configurations for $k_l$ and $k_{l+1}$ numbers of $T_l$ and $T_{l+1}$ gates respectively in sequences of $t$-count $K_l$. Thus the total number of configurations for a set of $t$ gate counts $\mathbf{k} = \{k_3, k_4, \ldots, k_L\}$ in sequences of $t$-count $K_3$ (containing $t$ gates up to order $L$ of the Clifford hierarchy) is

$$\prod_{l=3}^{L} \gamma(k_l, |T_l|, K_l).$$

To count the total number of sequences, we sum over all configurations for each assignment of $\mathbf{k}$ satisfying Equation 12. We begin by determining the maximum allowable values for each $k_l$ with respect to already specified lower order $t$ gate counts $\{k_j\}|_{3}^{l-1}$. The maximum possible value for $k_3$ is $\lfloor C/c_3 \rfloor$. Given a specified $k_3$, the maximum value for $k_4$ is $\lfloor (C - c_3 k_3)/c_4 \rfloor$. By continuing this pattern, given a set of $t$ gate counts $\{k_3, k_4, \ldots, k_{l-1}\}$, the maximum value for $k_l$ is

$$k_l^{\max} := \left\lfloor \frac{C - \sum_{j=3}^{l-1} c_j k_j}{c_l} \right\rfloor.$$

So now the total number of sequence configurations with logical base gate costs $\mathbf{c}$ and maximum sequence cost $C$ can be calculated as

$$\sum_{k_3=0}^{k_3^{\max}} \sum_{k_4=0}^{k_4^{\max}} \cdots \sum_{k_L=0}^{k_L^{\max}} \prod_{l=3}^{L} \gamma(k_l, |T_l|, K_l).$$

Since the number of $T_l$ logical gates within a particular sequence is $k_l$, the total number of $T_l$ gates within all possible sequences below the maximum cost $C$ is calculated by multiplying each term in the summation by $k_l$. The total number of gates can be calculated in a similar way, by multiplying each term by the $t$-count $K_3$. Thus, the proportion of $T_n$ gates can be calculated as the weighted sum

$$p_n = \frac{\sum_{k_3=0}^{k_3^{\max}} \cdots \sum_{k_L=0}^{k_L^{\max}} k_n \prod_{l=3}^{L} \gamma(k_l, |T_l|, K_l)}{\sum_{k_3=0}^{k_3^{\max}} \cdots \sum_{k_L=0}^{k_L^{\max}} K_3 \prod_{l=3}^{L} \gamma(k_l, |T_l|, K_l)}.$$
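The counting argument above translates directly into a short recursive enumeration. The sketch below is one possible way to evaluate the proportions, not the code used to produce the figures: the function name `gate_proportions` and its argument layout are hypothetical, and the cost values in the usage lines are made up for illustration. Integer costs are used so that the floor in the maximum gate count is exact.

```python
from math import comb, floor


def gate_proportions(costs, sizes, max_cost):
    """Estimate the proportion of each order's gates among all t gates.

    costs[i] and sizes[i] are the cost c_l and set size |T_l| for order
    l = 3 + i. Returns a list with one proportion per order.
    """
    orders = len(costs)
    gate_totals = [0] * orders  # sum over configurations of k_l * weight

    def recurse(i, remaining_cost, counts):
        if i == orders:
            # Weight = product over orders of gamma(k_l, |T_l|, K_l).
            weight = 1
            free_locations = sum(counts)  # K_3, the t-count
            for k, size in zip(counts, sizes):
                weight *= comb(free_locations, k) * size ** k
                free_locations -= k  # K_{l+1} = K_l - k_l
            for j, k in enumerate(counts):
                gate_totals[j] += k * weight
            return
        k_max = floor(remaining_cost / costs[i])
        for k in range(k_max + 1):
            recurse(i + 1, remaining_cost - k * costs[i], counts + [k])

    recurse(0, max_cost, [])
    total = sum(gate_totals)  # equals the sum over configurations of K_3 * weight
    if total == 0:  # no t gate fits under the maximum cost
        return [0.0] * orders
    return [g / total for g in gate_totals]


# Two orders with equal costs and equal set sizes share the gates evenly.
equal = gate_proportions(costs=[1, 1], sizes=[2, 2], max_cost=3)
# Doubling the cost of the second order lowers its proportion.
skewed = gate_proportions(costs=[1, 2], sizes=[2, 2], max_cost=6)
```

The number of enumerated gate count sets grows with the maximum cost and the number of orders, but it remains far smaller than the number of sequences in the database, which is why the model is cheaper than running synthesis on many random target gates.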