Time-optimal multi-qubit gates: Complexity, efficient heuristic and gate-time bounds



Introduction
Any quantum computation requires decomposing its logical operations into the platform's native instruction set. The performance of the computation depends heavily on the available instructions and their implementation in the quantum hardware. In particular for early quantum devices, a major challenge is posed by their short decoherence times, which significantly limit the runtime of a quantum computation. Therefore, it is necessary to optimize not only the number of native instructions but also their execution time.
Ising-type interactions give rise to an important and rich class of Hamiltonians ubiquitous in several quantum computing platforms [1][2][3][4][5]. Previously, we have utilized these Ising-type interactions in a new synthesis method [6]. In particular, we have considered the problem of synthesizing time-optimal multi-qubit gates on a quantum computing platform that supports the following basic operations: (I) single-qubit rotations can be executed in parallel, and (II) it offers a fixed Ising-type interaction.
The corresponding synthesis of global ZZ-gates (GZZ gates) is given by the minimization of the overall gate time, which can be written as a linear program (LP) [6]. This LP is exponentially large in the number of qubits. It has been unclear whether such multi-qubit gates can be synthesized in a computationally efficient way while keeping the gate time optimal.
In this work, we prove that this synthesis problem is NP-hard and provide a close-to-optimal efficient heuristic solution. To establish this hardness result, we draw the connection between the synthesis problem and graph theory. The cut polytope is defined as the convex hull of binary vectors representing the possible cuts of a given graph [7]. We provide a polynomial-time reduction of the membership problem of the cut polytope to our synthesis problem.

Synthesizing multi-qubit gates with Ising-type interactions

In this section, we give a short introduction to the synthesis of time-optimal multi-qubit gates. For more details we refer to the first two sections of Ref. [6].
On the abstract quantum computing platform with n qubits specified by the requirements (I) and (II) above, interactions between the qubits are generated by an Ising-type Hamiltonian H_ZZ = ∑_{i<j} J_ij σ_z^i σ_z^j, where σ_z^i is the Pauli-Z operator acting on the i-th qubit. Note that diagonal terms, where i = j, are excluded since they only change the Hamiltonian by an energy offset. By J we denote the symmetric matrix with entries J_ij in the upper triangular part and vanishing diagonal. We call J the physical coupling matrix.
Conjugating the Hamiltonian H_ZZ with X gates on the qubits indicated by the binary vector b ∈ F_2^n yields the Hamiltonian H(m) = ∑_{i<j} J_ij m_i m_j σ_z^i σ_z^j, where we define the encoding m := (−1)^b entry-wise. The sign of the interaction between qubit i and j is given by m_i m_j ∈ {−1, +1}. We call H(m) the encoded Hamiltonian.
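As a small numerical sanity check (illustrative code, not part of the paper's derivation; all function names are ours), the sign flips induced by X-conjugation can be verified directly for a few qubits:

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    return reduce(np.kron, ops)

def h_zz(J, signs):
    """H = sum_{i<j} J_ij s_i s_j Z_i Z_j for a sign vector s."""
    n = J.shape[0]
    return sum(J[i, j] * signs[i] * signs[j]
               * kron_all([Z if q in (i, j) else I2 for q in range(n)])
               for i in range(n) for j in range(i + 1, n))

n = 3
rng = np.random.default_rng(7)
J = np.triu(rng.normal(size=(n, n)), 1)
J = J + J.T                                  # symmetric physical coupling matrix
b = np.array([1, 0, 1])                      # conjugate with X on qubits 0 and 2
m = (-1.0) ** b                              # encoding m = (-1)^b
Xb = kron_all([X if bi else I2 for bi in b])
H = h_zz(J, np.ones(n))                      # H_ZZ
Hm = h_zz(J, m)                              # encoded Hamiltonian H(m)
# X Z X = -Z, so conjugation multiplies the (i,j) term by m_i m_j
ok = np.allclose(Xb @ H @ Xb, Hm)
```

Since both Hamiltonians are diagonal, they also commute, which is the property used for Eq. (3) below.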
Given time steps λ_m ≥ 0 during which the encoding m is used, we consider unitaries of the form ∏_m e^{−iλ_m H(m)} = e^{−i ∑_m λ_m H(m)} =: e^{−iH},   (3) where we used that the diagonal Hamiltonians H(m) mutually commute. For all possible encodings m ∈ {−1, +1}^n we collect the time steps λ_m in a vector λ ∈ R^{2^n} and interpret t = ∑_m λ_m as the total gate time of the unitary e^{−iH}, implemented by the sequence of unitaries (3). Moreover, we use the symmetry (−m)(−m)^T = mm^T   (4) to reduce the degrees of freedom in λ from 2^n to 2^{n−1} by adding up λ_m + λ_{−m} to a single time step.
The so-generated unitary is the time evolution operator under the total Hamiltonian H = ∑_{i<j} A_ij σ_z^i σ_z^j, where we have defined the total coupling matrix A := ∑_m λ_m (mm^T) ∘ J   (6) and used the linearity of the Hadamard (entry-wise) product ∘. By construction, A is a symmetric matrix with vanishing diagonal. Let us define the n(n−1)/2-dimensional subspace of symmetric matrices with vanishing diagonal by Sym_0(R^n) := {M ∈ R^{n×n} | M = M^T, M_ii = 0}. For A ∈ Sym_0(R^n), we define an associated multi-qubit gate GZZ(A) := e^{−i ∑_{i<j} A_ij σ_z^i σ_z^j}, where GZZ stands for "global ZZ interactions". To determine which matrices can be decomposed as in Eq. (6), we denote the non-zero index set of a symmetric matrix as nz(M) := {(i, j) | M_ij ≠ 0, i < j}. Then, the subspace of matrices A ∈ Sym_0(R^n) that can be decomposed as in Eq. (6) is exactly given by the condition nz(A) ⊆ nz(J), which we assume to hold from now on. Thus, all-to-all connectivity makes it possible to decompose any coupling matrix A ∈ Sym_0(R^n) but is not strictly required by our approach. We call the number of encodings m needed for the decomposition the encoding cost of GZZ(A), and ∑_m λ_m the total GZZ time. Note that both quantities depend on the chosen decomposition.
It is convenient to abstract the following analysis from the physical details given by J. For a matrix A ∈ Sym_0(R^n) with nz(A) ⊆ nz(J), its possible decompositions are in one-to-one correspondence with the decompositions of the matrix M := A ⊘ J ∈ Sym_0(R^n), where ⊘ denotes entry-wise division on nz(J) (and M_ij := 0 elsewhere). We further define the linear operator v, represented in the standard basis by a matrix V ∈ {−1, +1}^{(n(n−1)/2) × 2^{n−1}}; here v is the (row-wise) vectorization of the upper triangular part of the matrix input, such that the columns of V are given by v(mm^T). Our objective is to minimize the total gate time and the number of m's needed to express the matrix M ∈ Sym_0(R^n). To this end we formulate the following linear program (LP): minimize 1^T λ subject to V λ = v(M) and λ ≥ 0,   (12) where 1 = (1, 1, . . ., 1) is the all-ones vector such that 1^T λ = ∑_m λ_m. A feasible solution is an assignment of the variables that fulfills all constraints of the optimization problem, and an optimal solution is a feasible solution which also minimizes/maximizes the objective function. A standard tool in convex optimization is duality [14], which will be used in Section 7. The dual LP to the LP (12) reads as follows: maximize ⟨M, y⟩ subject to V^T y ≤ 1.   (13) Here, inequalities between vectors are to be understood entry-wise. A simple but important fact is the following: if λ* is an optimal solution to the primal LP (12), then any feasible solution y to the dual LP (13) provides a lower bound as ⟨M, y⟩ ≤ 1^T λ*. Moreover, the feasibility of the LP (12) implies that we have strong duality: if y* is a dual optimal solution, then we have equality between the optimal values, ⟨M, y*⟩ = 1^T λ*.
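For small n, the exponentially large LP (12) can be set up and solved directly. The following sketch (function names are ours, not from Ref. [6]) uses scipy.optimize.linprog:

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def v(mat):
    """Row-wise vectorization of the strict upper triangular part."""
    n = mat.shape[0]
    return np.array([mat[i, j] for i in range(n) for j in range(i + 1, n)])

def optimal_gzz_time(M):
    """Solve the primal LP (12): min 1^T lam  s.t.  V lam = v(M), lam >= 0."""
    n = M.shape[0]
    # Fix m_n = +1 to exploit the symmetry (-m)(-m)^T = m m^T of Eq. (4).
    encodings = [np.array(m + (1,)) for m in itertools.product((-1, 1), repeat=n - 1)]
    V = np.column_stack([v(np.outer(m, m)) for m in encodings])
    res = linprog(np.ones(len(encodings)), A_eq=V, b_eq=v(M), method="highs")
    return res.fun

E3 = np.ones((3, 3)) - np.eye(3)
t_plus = optimal_gzz_time(E3)    # M = m m^T (off-diagonal) with m = (1,1,1)
t_minus = optimal_gzz_time(-E3)  # sign-flipped instance, see Section 7
```

Here t_plus saturates the lower bound ∥M∥_{ℓ∞} = 1 of Lemma 7 below, while the sign-flipped instance requires a strictly longer total gate time.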

Main results
We want to highlight our main contributions. First, we provide the hardness result for the synthesis of time-optimal GZZ gates.

Theorem (Theorem 6). The decision version of the LP (12) is NP-complete.
We say that the synthesis of time-optimal multi-qubit gates (solving LP (12)) is NP-hard in the sense of the function problem extension of the decision problem class NP [15]. We circumvent the hardness of the time-optimal multi-qubit gate synthesis by providing an explicit construction of GZZ gates realizing next-neighbor coupling with constant gate time and linear encoding cost.
The assumption of a constant subdiagonal of J is physically motivated and can be realized in an ion trap [16].
Theorem (Theorem 10, informal). Let the subdiagonals of J and A be constant with values c and φ, respectively. Then GZZ(A) on n qubits has an encoding cost of d ≤ 2n and constant total GZZ time 2φ/c. This total gate time is optimal.
In Section 6 we define Algorithm 3 and introduce LP (41), a polynomial-time heuristic to synthesize GZZ gates with small, but not necessarily minimal, gate time. This heuristic does not rely on any further assumptions and is applicable for arbitrary J, A ∈ Sym_0(R^n). In Section 8 we show numerically that this heuristic leads to GZZ gate times close to the optimum while solving the synthesis problem much faster.
Furthermore, we prove lower and upper bounds on the GZZ gate time.
Theorem (Theorem 15). The optimal total gate time of GZZ(A) with A ∈ Sym_0(R^n) is lower and upper bounded by ∥M∥_{ℓ∞} ≤ 1^T λ* ≤ ∥M∥_{ℓ1}, where M := A ⊘ J. Here, the upper bound scales quadratically in n for a constant matrix M = A ⊘ J. This upper bound is loose in the sense that it also holds for the GZZ gates constructed by the heuristic in Section 6. Therefore, we tighten the upper bound to a linear scaling in n.
Conjecture (Conjecture 16). The optimal gate time of GZZ(A) with A ∈ Sym_0(R^n) is tightly upper bounded by a quantity that scales linearly in the number of qubits n. We provide evidence for this conjecture using an explicit construction (Theorem 21) that realizes the conjectured upper bound for any n, as well as numerical evidence for n ≤ 8 (Fig. 1). Unfortunately, we were not able to prove this result and state the challenges in Appendix A.

Synthesizing time-optimal GZZ gates is NP-hard
In this section, we investigate the complexity of solving the gate synthesis problem stated as LP (12). We observe that LP (12) is an optimization over the convex cone generated by E_n := {mm^T | m ∈ {−1, +1}^n},   (16) which is the set of outer products generated by all possible encodings m. Due to the symmetry Eq. (4) we can uniformly fix the value of one entry of m. We choose the convention m_n = +1.
In the literature E_n is also known as the elliptope of rank-one matrices [17]. In the following, we consider the polytope conv(E_n) and show the connection to graph theory, in particular to the cuts of graphs.
Definition 1 (cut polytope [7]). Let K_n = (V_n, E_n) be a complete graph with n vertices. Denote by δ(X) the set of all edges with one endpoint in X ⊂ V_n and the other endpoint in its complement X̄, i.e., δ(X) defines the cut between X and X̄. Let χ^{δ(X)} ∈ {0, 1}^{|E_n|} denote the characteristic vector of a cut, with χ^{δ(X)}_e = 1 if e ∈ δ(X) and χ^{δ(X)}_e = 0 otherwise. We define the cut polytope as the convex hull of the characteristic vectors, CUT_n := conv{χ^{δ(X)} | X ⊆ V_n}.

Lemma 2. For all n, CUT_n is isomorphic to conv(E_n).
Proof. For each X ⊂ V_n we set m_i = +1 if i ∈ X and m_i = −1 if i ∈ X̄. Note that there are 2^{n−1} different pairs of X and X̄. We then have m_i m_j = −1 if i ∈ X and j ∈ X̄ (or the other way around), and m_i m_j = +1 if i, j ∈ X or i, j ∈ X̄. So the characteristic vector can be written as χ^{δ(X)}_e = (1 − m_i m_j)/2 for each edge e ∈ E_n connecting vertices i and j. This is clearly a bijective affine map between the vertices, and thus CUT_n is isomorphic to conv(E_n).
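This correspondence can be checked by brute force for K_4 (illustrative code, not from the paper): enumerating all vertex subsets and mapping them through χ_e = (1 − m_i m_j)/2 produces exactly 2^{n−1} distinct cut vectors.

```python
import itertools
import numpy as np

n = 4
cut_vectors = set()
for r in range(n + 1):
    for X in itertools.combinations(range(n), r):
        m = np.array([+1 if i in X else -1 for i in range(n)])
        # characteristic vector chi^{delta(X)}_e = (1 - m_i m_j)/2 per edge (i, j)
        chi = tuple((1 - m[i] * m[j]) // 2 for i in range(n) for j in range(i + 1, n))
        cut_vectors.add(chi)

num_cuts = len(cut_vectors)  # X and its complement map to the same cut
```

The empty cut (all-zero vector, X = ∅ or X = V_n) is among them, matching the 2^{n−1} pairs of X and X̄ in the proof.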
The following decision problems are membership problems, where the task is to decide whether a given element x belongs to a set or not. In our case x is a vector and the set is a polytope.

Problem 3 (CUT_n membership).
Instance: The adjacency matrix M ∈ Sym_0(Q^n_{≥0}) of a weighted undirected graph with non-negative weights.
Question: Does v(M) belong to CUT_n?
It is well known that the membership problem of the cut polytope, Problem 3, and Problem 4 are NP-complete [8,18,19]. Next, we state the decision version of our gate synthesis optimization.
Problem 5 (decision version of LP (12)).
Instance: A matrix M ∈ Sym_0(Q^n) and a constant K ∈ Q_{≥0}.
Question: Is there a decomposition M = ∑_i λ_i r_i with r_i ∈ E_n such that λ_i ≥ 0 and ∑_i λ_i ≤ K?

Theorem 6. Problem 5, which is the decision version of the LP (12), is NP-complete.
Proof. A solution of Problem 5 can be verified in polynomial time since there always exists a decomposition ∑_i λ_i r_i of M with minimal ∑_i λ_i which has at most n(n−1)/2 non-zero terms [6, Proposition II.3]. Therefore, Problem 5 is in NP.
To show that Problem 5 is NP-hard, we construct a polynomial-time mapping reduction from Problem 4 to Problem 5. Given the matrix M ∈ Sym_0(Q^n) and a constant K ∈ Q_{≥0} as an instance for Problem 5, let λ_i ≥ 0 be the positive coefficients of the decomposition. If we find ∑_i λ_i < K, then we can always add additional λ's such that equality holds. We choose the additional λ's as the coefficients of the decomposition for the all-zero matrix; see, e.g., Lemma 8 below with k = 1_n for an explicit construction. We define the matrix M′ := M/K and the positive coefficients λ′_i := λ_i/K, which satisfy ∑_i λ′_i = 1, so that M′ ∈ conv(E_n) if and only if the instance of Problem 5 is a yes-instance.

Time-optimal GZZ synthesis for special instances

Although solving the LP (12) is NP-hard, we present explicit optimal solutions for certain families of instances, which is equivalent to constructing time-optimal GZZ gates. The constructions of this section yield a qubit-independent total GZZ time which satisfies the optimal lower bound of Lemma 7 below. Moreover, we show that some GZZ gates can be synthesized with an encoding cost independent of the number of qubits. However, most of these constructions assume constant values for the elements of the band-diagonal of the physical coupling matrix J. These assumptions are relaxed throughout this section, providing explicit optimal solutions for physically relevant cases. These results build the foundation for the heuristic algorithm for fast GZZ gate synthesis in the next section.
By ∥M∥_{ℓp} we denote the ℓ_p-norm of a symmetric matrix M, which is given as the ℓ_p-norm of a vector v(M) containing all lower/upper triangular matrix elements in some order. First, we prove a lower bound on the optimal total GZZ time which can be used to verify time-optimality.

Lemma 7. For any M ∈ Sym_0(R^n), an optimal solution λ* of the LP (12) satisfies 1^T λ* ≥ ∥M∥_{ℓ∞}.

Proof. The lower bound can be verified by the fact that the matrix representation V of the linear operator in Eq. (10) only has entries ±1 and that λ* is non-negative. Thus, it holds that ∥M∥_{ℓ∞} = ∥V λ*∥_{ℓ∞} ≤ ∥λ*∥_{ℓ1} = 1^T λ*.

Next, we provide calculation rules for the coupling matrix A ∈ Sym_0(R^n) of the GZZ(A) gate. These rules are inherited from matrix exponentials; the corresponding decompositions of the coupling matrices are given in Items (i) to (iii), where in Item (iii) we can also express γ = λ ⊗ β with the Kronecker product. Note that these generated GZZ gates are not necessarily time-optimal. We denote the k × k identity matrix by 1_k, the k-dimensional all-ones vector by 1_k and the k × k matrix of ones with vanishing diagonal by E_k. With the next two lemmas, we bound the encoding cost and total gate time for special cases of Item (i) and Item (ii), where the total gate time is constant. These results will provide a basis for all other constructions. The following result constructs total coupling matrices A ∈ Sym_0(R^n) on arbitrary subsets of qubits.

Lemma 8. Let the coupling matrix J be constant, i.e., J_{i≠j} = c for some c ≠ 0.
For any φ ∈ R and s ∈ Z_{≥0}, the gate GZZ(A) with the block-diagonal matrix A = φ (E_{k_1} ⊕ ⋯ ⊕ E_{k_s}), where k_i ≥ 1 for i = 1, . . ., s, has an encoding cost of d = 2^{⌈log_2(s)⌉} ≤ 2s and constant total GZZ time φ/c. This total gate time is optimal.
Proof. W.l.o.g. we set c = φ = 1. We have n := ∑_{i=1}^s k_i qubits. Note that E_{k_i} = E_1 = 0 only contributes to an entry in the diagonal of A, and hence such a qubit does not participate in the GZZ(A) gate. We denote the d × d Hadamard matrix by H^{d×d} and the matrix consisting of its first s columns by H^{d×s}, where s ≤ n. The orthogonality of the columns of any Hadamard matrix yields (H^{d×s})^T H^{d×s} = d 1_s. Replacing each i-th column by k_i copies of it, we obtain the d × n matrix H_k^{d×n} from H^{d×s}. Then, we attain the diagonal block matrix structure (1/d)(H_k^{d×n})^T H_k^{d×n} = E_{k_1} ⊕ ⋯ ⊕ E_{k_s} after setting the diagonal elements on the left-hand side to zero, since they only contribute to an energy offset in the Hamiltonian. Take each row of H_k^{d×n} as a possible vector m ∈ {−1, 1}^n to construct the total coupling matrix, i.e., the time steps have been chosen as λ_m = 1/d for each row m of H_k^{d×n} and λ_m = 0 otherwise. Clearly, we have ∑_m λ_m = 1. If we take the constants c and φ into account, we just multiply Eq. (22) by φ/c, which gives a total GZZ time of φ/c. Furthermore, this total GZZ time is optimal since ∥M∥_{ℓ∞} = ∥A ⊘ J∥_{ℓ∞} = φ/c satisfies the lower bound of Lemma 7.
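The construction in this proof is easy to reproduce numerically. The sketch below (our naming, not the paper's code) builds H_k^{d×n} for k = (2, 3, 1) with c = φ = 1 and checks the resulting block-diagonal coupling matrix:

```python
import numpy as np

def sylvester(d):
    """Sylvester-Hadamard matrix of dimension d (a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < d:
        H = np.block([[H, H], [H, -H]])
    return H

def duplicated_hadamard(k):
    """H_k^{d x n}: duplicate the i-th column of a d x s Hadamard block k[i] times."""
    s = len(k)
    d = 1 << (s - 1).bit_length() if s > 1 else 1   # d = 2^ceil(log2 s)
    cols = [i for i, ki in enumerate(k) for _ in range(ki)]
    return sylvester(d)[:, cols]

k = (2, 3, 1)                      # block sizes; the E_1 block (qubit 5) does not participate
Hk = duplicated_hadamard(k)
d, n = Hk.shape
lam = np.full(d, 1.0 / d)          # time steps lam_m = 1/d per row (c = phi = 1)
A = sum(l * np.outer(m, m) for l, m in zip(lam, Hk))
np.fill_diagonal(A, 0.0)           # diagonal is only an energy offset
total_time = lam.sum()             # constant total GZZ time 1
```

The resulting A equals E_2 ⊕ E_3 ⊕ E_1, with encoding cost d = 2^{⌈log₂ 3⌉} = 4 and total time 1, as the lemma states.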
The encoding cost of GZZ(A) considered in this lemma can be reduced if redundant encodings m = m′ ∈ {−1, 1}^n are present, by adding the corresponding time steps λ_m + λ_{m′}. It can be further reduced by using the Hadamard conjecture [20,21]. If the Hadamard conjecture holds, then there exist Hadamard matrices of any dimension divisible by 4. It is known that the Hadamard conjecture holds for dimensions s ≤ 668 [22,23], thus the encoding cost can be reduced to d = 4⌈s/4⌉ ≤ s + 3 in this regime. The encoding cost of the following Lemma 9, Theorems 10 and 11 and the efficient heuristic in Section 6 can be reduced in the same fashion.
The assumption in Lemma 8 of a constant coupling matrix, J_{i≠j} = c, is physically unrealistic. Therefore, we relax this assumption for block sizes k_i = 2, which corresponds to pairwise next-neighbor couplings. We use Lemma 8 to construct time-optimal GZZ gates for a certain family of coupling matrices.

Lemma 9. Let n be even and the coupling matrix J be constant on the first subdiagonal (the other elements are arbitrary), i.e., J_{i,i+1} = c for some c ≠ 0. Then GZZ(A) with A = φ (E_2 ⊕ ⋯ ⊕ E_2) (n/2 blocks) has an encoding cost of d = 2^{⌈log_2(n)−1⌉} < n and constant total GZZ time φ/c. This total gate time is optimal.
Proof. Since the matrix E_2 only has non-zero entries in the first subdiagonal, the coupling matrix J only needs to be constant there. The claim follows immediately from Lemma 8 by setting s = n/2 and k_i = 2 for all i. Clearly, the total GZZ time is optimal since it saturates the lower bound of Lemma 7.
Lemma 9 guarantees an encoding cost of d < n, which is a quadratic saving compared to the general LP solution with encoding cost n(n−1)/2. We note that this construction corresponds to parallel ZZ(φ) gates, which find applications in simulating molecular dynamics [24].
The assumption of a constant subdiagonal of J can be realized in an ion-trap platform by applying an anharmonic trapping potential [16]. By combining Item (i), Lemma 8 and Lemma 9 we obtain the GZZ gate for next-neighbor coupling.
Theorem 10. Let the subdiagonal of J be constant, i.e., J_{i,i+1} = c for some c ≠ 0. Then GZZ(A) on n qubits with A_{i,i+1} = φ for i = 1, . . ., n − 1 (and A_ij = 0 otherwise) has an encoding cost of d ≤ 2n (for n > 4) and constant total GZZ time 2φ/c. This total gate time is optimal.
Proof. We set c = φ = 1 w.l.o.g. For now, we assume that n is even. Then A splits into two layers of parallel pair couplings, A = (E_2 ⊕ ⋯ ⊕ E_2) + (0 ⊕ E_2 ⊕ ⋯ ⊕ E_2 ⊕ 0), where the second layer couples the pairs shifted by one qubit. Again, the diagonal entries do not contribute to the interactions. The first term can be implemented, using Lemma 9, with the encoding cost d_1 = 2^{⌈log_2(n)−1⌉} and a GZZ time of 1. The second term can be implemented, using Lemma 8, with s = n/2 + 1, k_1 = k_s = 1 and k_i = 2 for i = 2, . . ., n/2, with the encoding cost d_2 = 2^{⌈log_2(n+2)−1⌉} and a GZZ time of 1. Adding the encoding costs d = d_1 + d_2 and GZZ times of both terms yields the desired result. If n is not even, then repeat the previous steps for n + 1 but in the end reduce the dimension of all the resulting m by discarding the last entry. This construction corresponds to a feasible solution of the primal LP (12) with the objective function value 2. To show optimality it suffices to construct a feasible solution for the dual LP (13) with the objective function value of 2. First, consider the case n = 3 with the total coupling matrix given by A_{12} = A_{23} = 1, A_{13} = 0, and the matrix representation of the linear operator as in Eq. (10). A feasible dual solution satisfying V^T y ≤ 1 is y = (1, −1, 1)^T. Thus, we can verify optimality for n = 3 since the objective function value is ⟨A, y⟩ = v(A)^T y = 2. Now, we consider arbitrary n > 3. Extending the dual solution for the case n = 3 with zeros, y = (1, −1, 1, 0, . . ., 0)^T, does not change the objective function value ⟨A, y⟩ = v(A)^T y = 2. Such an extended dual solution is still feasible since V^T restricted to the first three columns only has rows which are already contained in Eq. (29) due to the symmetry of Eq. (4).
Algorithm 1 is the pseudocode implementation of Theorem 10. It takes the number of qubits n, the constant value c of the subdiagonal of J and the factor φ of A as input and returns the sparse vector λ containing the time steps. The encodings m are given by the indices of the non-zero elements λ_m ≠ 0.

Algorithm 1 Next-neighbor GZZ gate (Theorem 10).
Input: n, c, φ
Output: λ ▷ can efficiently be saved in a sparse data format

The following theorem does not require any additional assumptions on J. It shows how the LP (12) can be supplemented with the explicit solution to exclude certain qubits.
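To make the two-layer construction of Theorem 10 concrete, here is a sketch for even n (function names are ours; it assumes the constant-subdiagonal setting of the theorem and is not the paper's Algorithm 1 verbatim):

```python
import numpy as np

def sylvester(d):
    H = np.array([[1.0]])
    while H.shape[0] < d:
        H = np.block([[H, H], [H, -H]])
    return H

def block_encodings(k):
    """Rows of H_k^{d x n} from Lemma 8 (column-duplicated Hadamard matrix)."""
    s = len(k)
    d = 1 << (s - 1).bit_length() if s > 1 else 1
    cols = [i for i, ki in enumerate(k) for _ in range(ki)]
    return sylvester(d)[:, cols]

def chain_gzz(n, c=1.0, phi=1.0):
    """Theorem 10 sketch: next-neighbor GZZ gate from two parallel-ZZ layers."""
    assert n % 2 == 0, "odd n is handled by padding to n + 1 and discarding the last entry"
    layers = [
        block_encodings([2] * (n // 2)),                   # couples (0,1), (2,3), ...
        block_encodings([1] + [2] * (n // 2 - 1) + [1]),   # couples (1,2), (3,4), ...
    ]
    A = np.zeros((n, n))
    total_time, cost = 0.0, 0
    for H in layers:
        d = H.shape[0]
        lam = (phi / c) / d                                # uniform time steps per layer
        A += lam * c * sum(np.outer(m, m) for m in H)      # (m m^T) o J with J = c there
        total_time += d * lam
        cost += d
    np.fill_diagonal(A, 0.0)
    return A, total_time, cost

A, t, cost = chain_gzz(6)
```

For n = 6 this yields the chain coupling A_{i,i+1} = φ, total GZZ time 2φ/c and encoding cost d ≤ 2n, matching the theorem.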
Theorem 11 (Excluding qubits). Let N be the total number of qubits on the quantum hardware and n = N − s the number of qubits participating in the GZZ gate. Synthesize the GZZ(A) gate with A ∈ Sym_0(R^n) using the LP (12). Then, the total encoding cost (on N qubits) is at most (n(n−1)/2) · 2^{⌈log_2(s+1)⌉} and the total GZZ time, 1^T λ*, (on N qubits) is the same as for the LP run on n qubits.
Proof. Assume w.l.o.g. that all qubits to be excluded are at the end of the qubit array. Let k_1 = n and k_2 = ⋯ = k_{s+1} = 1. Using Lemma 8 we obtain an encoding cost of d_1 = 2^{⌈log_2(s+1)⌉} and total GZZ time 1 to generate the matrix A_1 = E_n ⊕ 0_s. Solving the LP (12) for a matrix A ∈ Sym_0(R^n) yields the encoding cost d_2 ≤ n(n−1)/2 and the total GZZ time 1^T λ*. We define the extension of A ∈ Sym_0(R^n) by A_2 ∈ Sym_0(R^N). The extension can be done by appending s arbitrary elements in {−1, +1} to all vectors m ∈ {−1, +1}^n given by the LP (12). Clearly, the desired coupling matrix on N qubits is given by the Hadamard product A_1 ∘ A_2. By Item (iii), the total encoding cost is d_1 d_2 and the total GZZ time is 1 · 1^T λ*. Consider now an arbitrary coupling matrix J; then the same construction applies, where Ã = A ⊘ J is decomposed by the LP (12).
Algorithm 2 takes the total number of qubits N, the total coupling matrix A (on n qubits) and the set Z := {i ∈ [N] | qubit i is excluded} as input and returns the sparse vector γ containing the time steps. The encodings w are given by the indices of the non-zero elements γ_w ≠ 0.
Algorithm 2 Excluding qubits.
Input: N, A, Z (set of qubit indices to be excluded)
Output: γ ▷ can efficiently be saved in a sparse data format

We showed explicit constructions of time-optimal GZZ gates for total coupling matrices A ∈ Sym_0(R^n) with diagonal block structure and next-neighbor couplings. The resulting GZZ gates have a constant gate time and require only linearly many encodings to be implemented.

Efficient heuristic for fast GZZ gates
In this section, we build on the results of Lemma 8 to derive a heuristic algorithm for synthesizing GZZ(A) gates with low total gate time for any A ∈ Sym_0(R^n). This algorithm runs in polynomial time, as opposed to solving the general LP (12), which we have shown in Theorem 6 to be NP-hard. The runtime saving is due to the restriction of the elliptope E_n in Eq. (16), with exponentially many elements, to a set with polynomially many elements. This restriction yields a polynomially sized LP which can be solved in polynomial time. In practice, the simplex algorithm has a runtime that scales polynomially in the problem size [25].
Recall the modified Hadamard matrix H_k^{d×n} defined in the proof of Lemma 8, where we used the rows of H_k^{d×n} as encodings m to generate block-diagonal target coupling matrices under some assumptions. Here, s is the number of block matrices on the diagonal of the target matrix, k ∈ N^s contains the dimensions of the blocks and d = 2^{⌈log_2(s)⌉} is the required number of encodings to construct such a block-diagonal matrix. From now on, we only consider k = (j, 1, . . ., 1); see Eq. (21). The requirement that ∑_i k_i = n implies that such a vector k has s = n − j + 1 entries. Permuting the columns of H_k^{d×n} results in the same permutation of the rows and columns of the right-hand side of Eq. (34). We denote the set of all column-permuted H_k^{d×n} by C^{(j)}. A specific element of C^{(j)} is denoted by H_r^{d×n}, where r ∈ N^j is an ordered multi-index r_1 < ⋯ < r_j indicating which columns of H_r^{d×n} are identical. For example, r = (2, 5, 6) indicates that the columns of H_r^{d×n} with indices 2, 5 and 6 are identical, i.e., columns 5 and 6 are replaced with copies of column 2. This notation will be useful later.

Definition 12. For any j ∈ {2, 3, . . ., n}, we define the restricted elliptope E_n^{(j)} as the set of matrices mm^T generated by the rows m of the elements of C^{(j)}, and we further define E_n^{[j]} as the union of the restricted elliptopes E_n^{(j′)} for 2 ≤ j′ ≤ j. We choose the definition in Eq. (35) similar to Eq. (16). Next, we show the size scaling of the restricted elliptopes. This directly translates to the size and runtime of the heuristic synthesis optimization.
Proposition 13. For any j ∈ {2, 3, . . ., n}, the number of different encodings m generating the restricted elliptope E_n^{(j)} scales polynomially in n for fixed j.

We denote the convex cone generated by a set V by cone(V) := {∑_i λ_i v_i | v_i ∈ V, λ_i ≥ 0}. With that, we are ready to present the main result of this section.

Theorem 14. cone(E_n^{(2)}) = Sym_0(R^n).

Proof. W.l.o.g. we can assume d = n and denote H_r^{d×d} ∈ C^{(2)} by H^d_{(r_1,r_2)}, with the property that, after setting its diagonal to zero, (1/d)(H^d_{(r_1,r_2)})^T H^d_{(r_1,r_2)} = e_{(r_1,r_2)}, where e_{(r_1,r_2)} is an element of the standard basis for symmetric matrices with vanishing diagonal. It is left to show that Sym_0(R^n_{<0}) ⊆ cone(E_n^{(2)}), i.e., that also symmetric matrices with negative entries are in the convex cone. To show this inclusion we define H^d_{(r_1,−r_2)} similarly to H^d_{(r_1,r_2)}, except that the duplicate column at r_2 is multiplied by −1, such that the corresponding matrix is −e_{(r_1,r_2)}. We have to show that for each row m ∈ rows(H^d_{(r_1,−r_2)}) there exist r̃_1 and r̃_2 such that m ∈ rows(H^d_{(r̃_1,r̃_2)}). This can be verified straightforwardly for d = 4 by checking all rows. W.l.o.g. we show that the hypothesis holds for any H^{2d}_{(r_1,−r_2)} by assuming it holds for H^d_{(r_1,−r_2)}. The Sylvester-Hadamard matrix is constructed inductively according to H^{2d} = ( H^d  H^d ; H^d  −H^d ). We consider three cases for H^{2d}_{(r_1,−r_2)}.

Case 1. The rows are contained in rows(H^{2d}_{(r̃_1,r̃_2)}) with r̃_1, r̃_2 ∈ [d+1, 2d] and r̃_2 = r_2, since the column r_2 is negated, which is equivalent to just duplicating a column of −H^d.

We have shown that for each row m ∈ rows(H^d_{(r_1,−r_2)}) there exist r̃_1 and r̃_2 such that m ∈ rows(H^d_{(r̃_1,r̃_2)}). Therefore, cone(E_n^{(2)}) = Sym_0(R^n). The last equality follows from the definition of cone and E_n^{(2)}.
Theorem 14 shows that the constraint of LP (12) can always be fulfilled considering only E_n^{(2)}. Similar to Eq. (10), we define the restricted linear operator, represented by a matrix V^{[j]} ∈ {−1, +1}^{(n(n−1)/2)×h} whose h columns are given by v(mm^T) for the encodings m generating E_n^{[j]}. We define the restricted LP_j to be the LP (12) with V replaced by V^{[j]}; see LP (41). Algorithm 3 summarizes the steps to construct E_n^{[j]} and therefore the matrix representation of the restricted linear operator V^{[j]}. In practice, Algorithm 3 has to be executed only once per number of qubits n, since the constraints of LP (41) can be fulfilled for all M ∈ Sym_0(R^n). This is due to the fact that E_n^{(2)} ⊆ E_n^{[j]}. Therefore, the restricted LP_j is also of polynomial size for a fixed j. Increasing j leads to better approximations due to the enlarged search space for the optimal solution. Note that the runtime of the mixed integer program (MIP) defined in [6, Section 2.2.2] also benefits from using E_n^{[j]}. As mentioned in Section 5, the dimension of the Hadamard matrices in Eq. (34) can be reduced to d = 4⌈(n − 1)/4⌉ ≤ n + 2 if we assume that the Hadamard conjecture holds. Therefore, the runtime of the restricted LP_j is reduced as well if such Hadamard matrices are used. In Section 8 we numerically benchmark the heuristic algorithm with and without this runtime reduction.
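A minimal sketch of the heuristic for j = 2 follows (naming and simplifications are ours; the paper's Algorithm 3 additionally handles general j and the reduced Hadamard dimensions). Note that for very small n the deduplicated encoding set may already cover all encodings; the savings kick in for larger n.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def sylvester(d):
    H = np.array([[1.0]])
    while H.shape[0] < d:
        H = np.block([[H, H], [H, -H]])
    return H

def v(mat):
    n = mat.shape[0]
    return np.array([mat[i, j] for i in range(n) for j in range(i + 1, n)])

def encodings_j2(n):
    """Encodings generating E_n^(2): rows of Hadamard blocks with one
    duplicated (possibly negated) column, deduplicated up to m ~ -m."""
    s = n - 1
    d = 1 << (s - 1).bit_length()
    base = sylvester(d)[:, :s]               # d x (n-1), mutually orthogonal columns
    seen = {}
    for r1, r2 in itertools.combinations(range(n), 2):
        for sign in (1.0, -1.0):
            H = np.zeros((d, n))
            H[:, [p for p in range(n) if p != r2]] = base
            H[:, r2] = sign * H[:, r1]
            for row in H:
                norm = row * row[-1]          # fix m_n = +1
                seen[tuple(norm)] = norm
    return list(seen.values())

def heuristic_gzz_time(M):
    """Restricted LP for j = 2: min 1^T lam  s.t.  V^[2] lam = v(M), lam >= 0."""
    ms = encodings_j2(M.shape[0])
    V = np.column_stack([v(np.outer(m, m)) for m in ms])
    res = linprog(np.ones(len(ms)), A_eq=V, b_eq=v(M), method="highs")
    return res.fun

rng = np.random.default_rng(1)
n = 5
M = np.triu(rng.uniform(-1, 1, (n, n)), 1)
M = M + M.T
t = heuristic_gzz_time(M)   # feasible for any M by Theorem 14
```

By Theorem 14 the restricted LP is feasible for any M, and its value lies between the bounds of Theorem 15 in the next section.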

Algorithm 3 Constructing E_n^{[j]}.
Bounds on the total GZZ time

Our main analytic result (Theorem 15) is that the optimal total GZZ time 1^T λ* is lower and upper bounded by the norms ∥M∥_{ℓ∞} and ∥M∥_{ℓ1}, respectively. Note that for a dense matrix M, its norm ∥M∥_{ℓ1} scales quadratically with the number of qubits n. We conjecture an improved upper bound on the total GZZ time for dense M that scales at most linearly with the number of qubits n. We support this conjecture with explicit solutions for the LP (12) reaching this bound for any n and numerical results validating the conjecture for n ≤ 8.

Theorem 15. The optimal total gate time of GZZ(A) with A ∈ Sym_0(R^n) is bounded as ∥M∥_{ℓ∞} ≤ 1^T λ* ≤ ∥M∥_{ℓ1}, where M := A ⊘ J. Equality holds for the lower bound for all matrices M = Cmm^T for any m ∈ {−1, +1}^n and C ≥ 0.
Proof. The lower bound has been shown in Lemma 7. Equality in the lower bound holds for M = Cmm^T by setting λ_m = C = ∥M∥_{ℓ∞} and λ_{m′} = 0 for all m′ ≠ m. We use the explicit construction of the standard basis elements for symmetric matrices from the proof of Theorem 14 to show the upper bound. To be precise, we obtain sign(M_ij) e_{(i,j)} with r = (i, j) if M_ij ≥ 0 or r = (ĩ, j̃) if M_ij < 0, as in Eqs. (38) and (39), respectively. We define λ^{(i,j)} with the entries λ^{(i,j)}_m = |M_ij|/d for the rows m of the corresponding Hadamard matrix and zero otherwise. According to Lemma 8 we have 1^T λ^{(i,j)} = |M_ij|. Adding λ^{(i,j)} for all i < j yields the upper bound ∥M∥_{ℓ1}.
These bounds get tighter the sparser M is. If M has only one non-zero value, then clearly ∥M∥_{ℓ∞} = 1^T λ* = ∥M∥_{ℓ1}. Furthermore, these bounds also hold for the heuristic which we presented in Section 6.
Next, we state our conjecture that the optimal gate time of GZZ(A) scales at most linearly with the number of qubits.

Conjecture 16. The optimal gate time of GZZ(A) with A ∈ Sym_0(R^n) is upper bounded by a quantity that scales at most linearly in the number of qubits n.

Hence, it provides a tighter bound for dense M than Theorem 15. To support the claim of Conjecture 16 we first construct explicit dual and primal feasible solutions for the case M = −E_n for the LPs (13) and (12), respectively. Then optimality is given by showing equality of the objective function values of the primal and dual problem. We further show that the case M = −E_n leads to the same objective function value as M = −mm^T for any m ∈ {−1, 1}^n. Finally, we provide numerical evidence that the conjecture holds for n ≤ 8.
For practical purposes it is important to keep in mind that the platform-given J matrix might also scale with the number of qubits, resulting in a qubit-number dependent constant ∥A ⊘ J_n∥_{ℓ∞} = ∥M_n∥_{ℓ∞}.

Explicit solutions for M = −mm^T
The following lemma will be used in the proof of the explicit feasible solution of the dual problem for M being of the form M = −mm^T. We can identify m ∈ {−1, 1}^n with b ∈ F_2^n via m = (−1)^b, as explained in Section 2.

Lemma 17. It holds that ∑_{i<j} m_i m_j = ((n − 2|b|)^2 − n)/2 =: P_b, where |b| denotes the Hamming weight of b.
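The quantity P_b = ∑_{i<j} m_i m_j = ((n − 2|b|)² − n)/2 with m = (−1)^b, and its minimum over b, can be brute-force checked for small n (illustrative code):

```python
import itertools

# Check sum_{i<j} m_i m_j = ((n - 2|b|)^2 - n)/2 for all b, and its minimum.
for n in (3, 4, 5, 6):
    values = []
    for b in itertools.product((0, 1), repeat=n):
        m = [(-1) ** bi for bi in b]
        direct = sum(m[i] * m[j] for i in range(n) for j in range(i + 1, n))
        k = sum(b)
        assert 2 * direct == (n - 2 * k) ** 2 - n
        values.append(direct)
    assert min(values) == -(n // 2)   # min P_b = -floor(n/2), as used below for the dual solution
```

The identity follows from (∑_i m_i)² = n + 2 ∑_{i<j} m_i m_j together with ∑_i m_i = n − 2|b|.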
Lemma 18 (explicit dual feasible solution). Let M = −E_n; then there is an explicit feasible solution y to the dual LP (13) with objective value ⟨M, y⟩ = (n(n−1)/2)/⌊n/2⌋.

Proof. We assume that y = y_1 = y_2 = . . ., i.e., all entries of y are equal. Therefore, it suffices to show that y P_b ≤ 1 for all b ∈ F_2^n. From Lemma 17 we know that y = 1/min(P_b) 1 satisfies the constraint in Eq. (48). The minimum min(P_b) = −⌊n/2⌋ is reached for |b| = ⌈n/2⌉ or |b| = ⌊n/2⌋, which can be verified from the expression of P_b in Lemma 17. Thus, we obtain y = −1/⌊n/2⌋ 1. The objective function evaluates to ⟨M, y⟩ = v(−E_n)^T y = (n(n−1)/2)/⌊n/2⌋. For the construction of a feasible solution to the primal problem we first require the following result.

Lemma 19. Let k < n be natural numbers. Then ∑_{b∈F_2^n, |b|=k} |b_i ⊕ b_j| = 2 C(n−2, k−1) for all i, j ∈ [n] with i ≠ j, where C(·, ·) denotes the binomial coefficient.

Proof. Consider the case n = 4 and k = 2. We get C(4, 2) = 6 binary vectors with |b| = 2, and it can be easily verified that ∑_{|b|=2} |b_i ⊕ b_j| = 4 = 2 C(2, 1). Now, we assume that Eq. (50) holds for a given n and k < n. It suffices to verify Eq. (50) for i ≤ n and j = n + 1. We fix k, define n′ := n + 1 and take b ∈ F_2^{n′}. We have C(n+1, k) binary vectors with |b| = k. For the case b_{n+1} = 0 we have C(n, k) such vectors and ∑_{|b|=k, b_{n+1}=0} |b_i ⊕ b_{n+1}| = C(n−1, k−1) for all i ≤ n, since there are C(n−1, k−1) different ways of placing the remaining k − 1 1's. For the case b_{n+1} = 1 we have C(n, k−1) such vectors and ∑_{|b|=k, b_{n+1}=1} |b_i ⊕ b_{n+1}| = C(n−1, k−1) for all i ≤ n, since there are C(n−1, k−1) different ways of placing the k − 1 0's. Combining the two cases b_{n+1} = 0 and b_{n+1} = 1 we obtain ∑_{|b|=k} |b_i ⊕ b_{n+1}| = 2 C(n−1, k−1) for all i ≤ n.
We motivate the next lemma with the result of the explicit dual feasible solution for M = −E_n from Lemma 18 and the complementary slackness condition. The complementary slackness condition for a linear program states that if the i-th inequality of the dual problem is a strict inequality for a feasible solution y, then the i-th component of a feasible solution of the primal problem λ is zero: (V^T y)_i < 1 ⟹ λ_i = 0. We use this in the following lemma to construct a feasible solution for the primal LP (12).

Lemma 20 (explicit primal feasible solution). Let M = −E_n; then there is an explicit feasible solution λ to the primal LP (12) with objective value 1^T λ = (n(n−1)/2)/⌊n/2⌋.

Proof. For this proof we define k := 2⌈n/2⌉ − 1, i.e., k = n for odd n and k = n − 1 for even n. We only consider binary vectors of the set S := {b ∈ F_2^n | |b| ∈ {⌈n/2⌉, ⌊n/2⌋} and b_n = 0}. This set is motivated by the complementary slackness condition and Lemma 18. It can be calculated that |S| = C(k, ⌊k/2⌋) using the recurrence relation for the binomial coefficients, where C(·, ·) denotes the binomial coefficient. We show that ∑_{b∈S} m_i m_j = −D_n   (58) for a constant D_n > 0, which we calculate later; here m = (−1)^b, and if not explicitly stated, all equations in this proof containing i, j hold for all i, j ∈ [n] with i ≠ j. Establishing Eq. (58) amounts to counting the occurrences of "−1" and "+1" in the sum. The count of "−1" uses Lemma 19, the recurrence relation for the binomial coefficients and ⌈n/2⌉ − 1 = ⌊n/2⌋; the count of "+1" uses C(n, ⌊n/2⌋ + 1) = C(n, ⌊n/2⌋) for odd n. Evaluating Eq. (58) for odd n, and for even n by the same steps, yields Eqs. (62) and (63), which show that D_n = (1/k) C(k, ⌊k/2⌋) and that Eq. (58) holds.
With the time steps λ_b = 1/D_n for b ∈ S and λ_b = 0 otherwise, the objective function value is 1^T λ = |S|/D_n = (n(n−1)/2)/⌊n/2⌋. From the equality of the objective functions for the primal and dual problem from Lemma 20 and Lemma 18, respectively, we know that the proposed dual/primal feasible solutions for M = −E_n are in fact optimal solutions. Now, we show that the GZZ gate time with M = −mm^T for any m ∈ {−1, +1}^n is the same as for the case M = −E_n: transforming the dual solution accordingly, with entries of magnitude 1/⌊n/2⌋ as in the proof of Lemma 18, yields an optimal solution to the dual LP (13) for the case M = −(mm^T).
Note that, trivially, the lower bound of Theorem 15 also applies to this case. One possibility to prove Conjecture 16 is to show that the matrix M = −E_n maximizes the value of the LP (12) among all matrices M ∈ Sym_0([−1, +1]^n). To this end, consider the LP (67), obtained by additionally maximizing ⟨M, y⟩ over M, using ∥x∥_{ℓ1} = max_{∥p∥_{ℓ∞}≤1} p^T x. Therefore, the optimal objective value of LP (67) is an upper bound on all optimal objective values of LP (12). Clearly, the constructed solution in Lemma 18 is feasible for LP (67). Unfortunately, proving that this constructed solution is optimal is quite challenging, as we discuss in Appendix A.

Numerical results

Figure 1: For each line we let the LPs run for a fixed time. The black dashed line is the upper bound for the original LP (12) from Conjecture 16. The reddish lines show the total GZZ times for the restricted LP_j (41) for j = 2, 3, 4. The blueish lines show the total GZZ times for the restricted LP_j (41) using Hadamard matrices of dimension d = 4k to construct the restricted linear operator V^{[j]}. Left: average-case scaling of the total GZZ times; the data points and error bars show the mean and the standard deviation over 100 uniformly sampled matrices A_ij = A_ji ∈ [−1, 1] for i < j. Right: conjectured worst-case scaling of the total GZZ times; the data points and error bars show the mean and the standard deviation over 100 binary matrices A = −mm^T with uniformly sampled encodings m ∈ {−1, 1}^n.

Choosing j > 2 reduces the total GZZ time significantly while still maintaining a short runtime. The total GZZ time obtained from the heuristic also seems to scale linearly with the number of qubits, although with a different scaling constant. As mentioned in Sections 5 and 6, the size of the restricted LP_j (41), and therefore its runtime, can be further reduced. This reduction is achieved by using Hadamard matrices of dimension d = 4k with k ∈ N instead of Sylvester-Hadamard matrices of dimension d = 2^k in the construction of the restricted linear operator V^{[j]}. This is based on the Hadamard conjecture, which is known to be true
for d ≤ 668 [22, 23]. Surprisingly, using these Hadamard matrices with d = 4^k not only yields a shorter runtime of the heuristic algorithm but also a significant reduction of the total GZZ time compared to the original restricted LP_j (41).
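The d = 2^k matrices referred to above come from Sylvester's doubling construction. As a small illustration (our own sketch; which columns are then selected to build V^[j] is beyond it), the following builds such a matrix and checks the defining orthogonality H H^T = d·1:

```python
import numpy as np

def sylvester_hadamard(k):
    """Sylvester-Hadamard matrix of dimension d = 2**k, built by the
    recursive doubling H_{2d} = [[H_d, H_d], [H_d, -H_d]]."""
    H = np.array([[1]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

H = sylvester_hadamard(3)  # d = 8
assert np.array_equal(H @ H.T, 8 * np.eye(8, dtype=int))
```

For general d = 4^k one would instead draw on known Hadamard-matrix constructions (Paley, Williamson, tabulated matrices), which is what makes the Hadamard conjecture relevant here.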
Our numerical results show that the heuristic algorithm approximates the optimal total GZZ time well, while maintaining a short runtime. This holds true for both the average-case and the conjectured worst-case scaling. Therefore, we hope that this heuristic will prove to be an important tool for implementing fast GZZ gates in practice.

Conclusion
We investigated the time-optimal multi-qubit gate synthesis introduced in Ref. [6]. We showed that synthesizing time-optimal multi-qubit gates in our setting is NP-hard. However, we also provided explicit solutions for certain cases with constant gate time, as well as a polynomial-time heuristic to synthesize fast multi-qubit gates. Our numerical simulations suggest that this heuristic provides good approximations to the optimal GZZ gate time. Furthermore, tight bounds on the scaling of the optimal multi-qubit gate times were shown. More precisely, we showed that the optimal multi-qubit gate time scales at most as ∥A ⊘ J∥_ℓ1, the ℓ1-norm of the element-wise division of the total and physical coupling matrices A and J, respectively. We also conjectured that the optimal GZZ gate time scales at most linearly with the number of qubits. Our results are practical for estimating the execution time of a given circuit in which the entangling gates are implemented as GZZ gates. The execution time is a crucial parameter, in particular in the NISQ era, since it limits the length of a gate sequence due to finite coherence times.
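As a practical illustration of the ∥A ⊘ J∥_ℓ1 bound, a minimal Python sketch (our own, not from Ref. [6]); we assume the norm sums |A_ij / J_ij| over the unordered qubit pairs i < j:

```python
import numpy as np

def gate_time_upper_bound(A, J):
    """l1-norm upper bound ||A o/ J||_l1 on the optimal GZZ gate time:
    element-wise quotient of the target coupling matrix A and the
    physical coupling matrix J, summed over pairs i < j (our convention)."""
    iu = np.triu_indices_from(A, k=1)
    return float(np.sum(np.abs(A[iu] / J[iu])))

# toy couplings: uniform target A_ij = 1, uniform physical J_ij = 2
A = np.ones((4, 4))
J = 2.0 * np.ones((4, 4))
assert abs(gate_time_upper_bound(A, J) - 6 * 0.5) < 1e-12  # 6 pairs
```

Such a helper gives a quick, hardware-aware estimate of whether a circuit's GZZ layers fit within a device's coherence time.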
It is our hope to prove the conjectured linear scaling of the optimal GZZ gate time in the near future. Moreover, we would like to test and verify our proposed time-optimal multi-qubit gate synthesis methods in an experiment. Depending on the quantum platform, we would like to develop adapted error-mitigation schemes for the GZZ gates and investigate their robustness against errors.

Appendices

A Challenges in proving Conjecture 16
In this section, we discuss some obstacles encountered in trying to prove Conjecture 16. Conjecture 16 holds if we show that our constructed solutions from Section 7.1 are optimal solutions of LP (67), meaning that no other feasible solution yields a larger objective function value than our constructed solution.
We tried an inductive proof, which turned out to be intricate due to the additional degrees of freedom in each induction step. Furthermore, we utilized the connection to graph theory from Lemma 2 to transform LP (67) into an LP over the cut polytope. There, the challenge is the affine mapping between the elliptope and the cut polytope in Eq. (16) and Definition 1, which alters the optimization problem fundamentally. In the following, we discuss in more detail another approach, based on showing the sufficiency of the Karush-Kuhn-Tucker (KKT) conditions.

A.1 Concave program
A convex program in standard form minimizes a convex objective function over a convex set. LP (67), however, maximizes a convex objective function over a convex set. Such optimizations are called concave programs. It is known that the maximum is attained at the extreme points of the polytope V^T y ≤ 1, and the problem might therefore have many local optima [29]. There are several equivalent sufficient conditions for optimality [30]. Here, we investigate one in detail.
which is the same formulation as the original LP (67).
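A toy instance of the extreme-point phenomenon described above (our own illustration, unrelated to the specific polytope of LP (67)): maximizing the convex ℓ1-norm over the box [−1, 1]^2. Since the maximum of a convex function over a polytope is attained at a vertex, solving one LP per sign pattern of y recovers it.

```python
import itertools

import numpy as np
from scipy.optimize import linprog

# Concave program: maximize ||y||_1 over the box [-1, 1]^2.
# For each sign pattern s we solve the LP "maximize s^T y" and keep
# the best value; the global maximum sits at a vertex of the box.
best = -np.inf
for s in itertools.product([-1, 1], repeat=2):
    res = linprog(-np.array(s), bounds=[(-1, 1)] * 2)  # linprog minimizes
    best = max(best, -res.fun)

assert abs(best - 2.0) < 1e-9  # attained at the corners (+/-1, +/-1)
```

Every corner (±1, ±1) attains the value 2, while interior points do not, which is exactly why such problems can have many local optima.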

A.2 Dualization
If we can formulate the dual LP to the primal LP (67) and find a feasible solution, then the dual objective function value upper bounds the primal objective function value by weak duality [31].
The standard form of an optimization problem with only linear inequality constraints is

minimize f(y) subject to Ay ≤ b. (74)

Then, the Lagrange dual function is given by

g(λ) = −b^T λ − f*(−A^T λ),

where f* denotes the conjugate function [14]. For LP (67) we have f(y) = −∥y∥_ℓ1 (the minus sign is due to the minimization in the standard form).
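The conjugate function needed here is a standard one: the conjugate of the ℓ1-norm is the indicator of the ℓ∞ unit ball (a textbook fact; adapting the signs to f(y) = −∥y∥_ℓ1 is left to the text):

```latex
\|\cdot\|_{\ell_1}^*(p)
  = \sup_{y} \left( p^{T} y - \|y\|_{\ell_1} \right)
  = \begin{cases}
      0,       & \|p\|_{\ell_\infty} \le 1, \\
      +\infty, & \text{otherwise},
    \end{cases}
```

which is also why the constraint ∥p∥_ℓ∞ ≤ 1 appears in the variational characterization of the ℓ1-norm used for LP (67).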

A.3 Invexity
The KKT conditions are optimality conditions for non-linear optimization problems. Invexity is a generalization of convexity in the sense that the KKT conditions are necessary and sufficient for optimality [32]. By invexity we mean that the objective and constraint functions of the optimization problem are Type 1 invex functions. Let K be the scaling factor of the conjectured upper bound, i.e.

K := n for odd n, and K := n − 1 for even n. (79)

In the case of LP (67) we have S = {y ∈ R^(n choose 2) : V^T y ≤ 1} and want to show that y* = −1/⌊n/2⌋ is a global optimum. Furthermore, we have for all y ∈ S. It is quite challenging to find an η(y, y*) ∈ R^m satisfying both inequalities. In particular, ansätze motivated by the geometry in small dimensions eventually fail for larger n.

The number of encodings |E^(j)_n| scales as O(n^(j+1)).

Proof. Note that |C^(j)| = (n choose j), since there are j duplicate columns in H^(d×n)_r. The binomial coefficient can be bounded by (n choose j) ≤ n^j/j!. Since there are d = 2^⌈log2(n−j+1)⌉ < 2(n − j + 1) rows of H^(d×n)_r, we obtain the rough upper bound

|E^(j)_n| ≤ d (n choose j) < 2(n − j + 1) (n choose j) < 2n^(j+1)/j!

on the number of different encodings generating the restricted elliptope. The first inequality is due to possible redundant encodings in the definition of E^(j)_n. Similarly, we can upper bound E
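The chain of inequalities in this counting argument can be spot-checked numerically. In this sketch (our own) we take the bound d·(n choose j) with d = 2^⌈log2(n−j+1)⌉ exactly as in the proof:

```python
import math

def encoding_count_bound(n, j):
    """Bound d * C(n, j) on the number of encodings generating the
    restricted elliptope, with d = 2^ceil(log2(n - j + 1))."""
    d = 2 ** math.ceil(math.log2(n - j + 1))
    return d * math.comb(n, j)

# d*C(n,j) < 2(n-j+1)*C(n,j) < 2n^(j+1)/j!, as in the proof:
for n, j in [(10, 2), (20, 3), (50, 4)]:
    mid = 2 * (n - j + 1) * math.comb(n, j)
    assert encoding_count_bound(n, j) < mid < 2 * n ** (j + 1) / math.factorial(j)
```

For fixed j this is polynomial in n, which is the source of the polynomial size of the restricted LP_j.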

^(j)_n for any 2 ≤ j ≤ n, and cone(E^(2)_n) = Sym_0(R^n) (Theorem 14). The time and space complexity of Algorithm 3 scales polynomially in n, as shown in Proposition 13.

Theorem 21.

If A ⊘ J =: M = −C(mm^T) for any m ∈ {−1, +1}^n and C ≥ 0, then the optimal gate time of GZZ(A) is

1^T λ* = ∥M∥_ℓ∞ · n for odd n, and 1^T λ* = ∥M∥_ℓ∞ · (n − 1) for even n. (65)

Proof. The statement has been shown for the case M = −E_n by constructing an explicit solution. It is left to show that the cases M = −C(mm^T) for a constant C ≥ 0 yield the same objective function value. Since the objective function and the constraints are linear, we can w.l.o.g. assume C = 1. It is clear that sign(M_ij) = sign(−m_i m_j) = −(−1)^(c_i⊕c_j) for all i < j, with m = (−1)^c. Let ỹ = −y sign(v(M)) for a y ∈ R; then each constraint of V^T ỹ ≤ 1 of the dual LP (13) reads

y Σ_{i<j} (−1)^(b_i⊕b_j⊕c_i⊕c_j) = y Σ_{i<j} (−1)^(b̃_i⊕b̃_j) ≤ 1, (66)

with b̃ := b ⊕ c element-wise. Consider the ordered set of all b ∈ F_2^n with b_n = 0; then b̃ is just a permutation of that set. Due to the permutation symmetry of the qubits, the optimal value of the LP (12) for any M = −(mm^T) is the same as for the case M = −E_n. Setting ỹ = −y sign(v(M))
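The permutation step above rests on the fact that b ↦ b ⊕ c is a bijection of F_2^n for any fixed c, so changing the encoding m = (−1)^c merely permutes the dual constraints. A quick Python check (our own illustration):

```python
import itertools

# For any fixed c in F_2^n, the map b -> b XOR c (element-wise) is a
# bijection of F_2^n, so the set of dual constraints is only permuted.
n = 4
c = (1, 0, 1, 1)
all_b = set(itertools.product([0, 1], repeat=n))
shifted = {tuple(bi ^ ci for bi, ci in zip(b, c)) for b in all_b}
assert shifted == all_b  # same constraint set, different order
```

The map is its own inverse (b ⊕ c ⊕ c = b), which is exactly why no constraint is gained or lost.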

Figure 2: Comparing the performance of the original (optimal) LP (12) and the restricted LP_j (41) for j = 2, 3, 4. For each line we let the LPs run for a fixed time. The black dashed line is the upper bound for the original LP (12) from Conjecture 16. The reddish lines show the total GZZ times for the restricted LP_j (41) for j = 2, 3, 4. The blueish lines show the total GZZ times for the restricted LP_j (41) using Hadamard matrices of dimension d = 4^k to construct the restricted linear operator V^[j]. Left: Average-case scaling of the total GZZ times. The data points and error bars show the mean and the standard deviation over 100 uniformly sampled matrices A_ij = A_ji ∈ [−1, 1] for i < j. Right: Conjectured worst-case scaling of the total GZZ times. The data points and error bars show the mean and the standard deviation over 100 binary matrices A = −mm^T with uniformly sampled encodings m ∈ {−1, 1}^n.