Binary Control Pulse Optimization for Quantum Systems

Quantum control aims to manipulate quantum systems toward specific quantum states or desired operations. Designing highly accurate and effective control steps is vitally important to various quantum applications, including energy minimization and circuit compilation. In this paper we focus on discrete binary quantum control problems and apply different optimization algorithms and techniques to improve computational efficiency and solution quality. Specifically, we develop a generic model and extend it in several ways. We introduce a squared $L_2$-penalty function to handle additional side constraints that model requirements such as allowing at most one control to be active at a time. We introduce a total variation (TV) regularizer to reduce the number of switches in the control. We modify the popular gradient ascent pulse engineering (GRAPE) algorithm, develop a new alternating direction method of multipliers (ADMM) algorithm to solve the continuous relaxation of the penalized model, and then apply rounding techniques to obtain binary control solutions. We propose a modified trust-region method to further improve the solutions. Our algorithms can obtain high-quality control results, as demonstrated by numerical studies on diverse quantum control examples.


Introduction
Quantum control [1][2][3] is a rich field that started with applications in quantum chemistry [4][5][6][7][8] but has since expanded to other subfields such as atomic, molecular, and optical physics [9][10][11]. Quantum control theory has been used in the quantum information community for a long time, mostly for designing pulse controls and gates in quantum devices [12][13][14][15][16][17][18][19][20]; but more recently, researchers have considered using control theories and optimization for the high-level design of quantum algorithms [21][22][23][24][25][26][27]. In this paper, we focus primarily on the quantum information applications of control theory, but our methods are generalizable to other settings. We do not focus on the analytic aspects of quantum control but on the more mechanistic aspects of how to apply optimization techniques to the problem of designing and finding solutions to a variety of quantum control problems. Quantum computing architecture falls into different categories, such as quantum circuits, which utilize discrete unitary gates to control a system, and quantum annealing [28,29], which uses smooth Hamiltonian evolution instead. Quantum control is already familiar within quantum circuit architecture design as a method for designing quantum gates, and several of our examples described in Section 2 belong to this category. Within the framework of quantum algorithms, quantum control is often used in variational quantum algorithms, which can straddle the line between discrete gates and smooth evolution. This paper focuses mainly on discrete binary control, which is more applicable to a quantum gate architecture. Variational techniques have already been implemented experimentally (see, e.g., [30,31]), and our techniques can potentially enhance the performance of the classical variational loop on such near-term devices.
Most quantum optimal control algorithms are based on gradient descent for better convergence than using gradient-free algorithms [32]. Khaneja et al. [9] proposed the gradient ascent pulse engineering (GRAPE) algorithm for designing pulse sequences in nuclear magnetic resonance. The authors approximated the control function by a piecewise-constant function and evaluated the explicit derivative. Based on the GRAPE algorithm, Larocca and Wisniacki [33] proposed a K-GRAPE algorithm by using a Krylov subspace to estimate quantum states during the process of time evolution. Brady et al. [25] applied an analytical framework based on gradient descent to discuss the optimal control procedure of a special control problem that minimizes the energy of a quantum state with a combination of two Hamiltonians. Another class of quantum optimal control algorithms is based on the chopped random basis (CRAB) optimal control technique, which describes the control space by a series of basis functions and optimizes the corresponding coefficients [34,35]. Sørensen et al. [36] combined the GRAPE algorithm and CRAB algorithm to achieve better results and faster convergence. However, all these algorithms are designed for unconstrained continuous quantum control problems.
In this work, we focus on a quantum pulse optimization problem with binary control variables and a feasible region restricted by linear constraints. These constraints describe a so-called bang-bang control in a model that corresponds to a quantum circuit design similar to that in the quantum approximate optimization algorithm (QAOA) [37] and other variational quantum algorithms [38,39]. Some works formulate QAOA as a bang-bang control problem with a fixed evolution time and investigate the performance of multiple methods, including stochastic descent and Pontryagin's minimum principle, for optimizing the control models [40,41]. However, those works consider only quantum systems with two controllers, whereas we propose a more general solution framework for quantum systems with multiple controllers. In this paper, we introduce four quantum control examples: (i) an energy minimization problem, (ii) a controlled-NOT (CNOT) gate estimation problem, (iii) a NOT gate estimation problem with energy leakage, and (iv) a circuit compilation problem. The nonconvexity, binary variables, and restricted feasible sets in these examples make the related binary quantum control problems extremely challenging to solve.
Methods for binary control mainly include genetic algorithms [42], branch-and-bound [43,44], and local search [45]. To the best of our knowledge, genetic algorithms have not been applied to binary quantum control, but Zahedinejad et al. [46] showed that such algorithms fail to obtain high-quality solutions even for continuous quantum control problems. Branch-and-bound can find high-quality solutions for binary optimal control, but its long computational time makes it intractable for large-scale practical problems. Vogt and Petersson [45] introduced a trust-region method [32] to solve the single-flux-quantum control problem with binary controls. We extend their approach by allowing additional constraints, such as min-up-time and max-switching constraints that limit the number of switches.
The main contributions of this paper are as follows. First, we propose a generic model including continuous and discretized versions for binary quantum control problems with linear constraints indicating that at each time step there can be only one active control. Second, we introduce an exact penalty function for linear constraints and develop an algorithmic framework combining the GRAPE algorithm and rounding techniques to solve it. Third, to prevent chattering on controls, we propose a new model that includes a total variation (TV) regularizer, and then we propose a new approach based on the alternating direction method of multipliers (ADMM) to solve this model. Fourth, we introduce a modified trust-region method to improve the solutions obtained from these algorithms. Compared with other methods, our algorithmic framework can obtain high-quality binary control sequences and prevent frequent switches. We demonstrate the performance on multiple quantum control examples with various objective functions and controllers.
The paper is organized as follows. In Section 2 we construct generic models of quantum binary control problems using continuous and discretized formulations. We also introduce the corresponding specific formulations for four quantum control examples. In Section 3 we review the GRAPE algorithm, utilize a penalty function to relax certain linear constraints, and introduce our rounding techniques. In Section 4 we propose and solve the penalty model with the TV regularizer via the corresponding ADMM algorithm. In Section 5 we derive trust-region subproblems and propose an approximate local-branching method based on the trust-region algorithm. In Section 6 we present numerical simulations for multiple instances of the quantum control examples in Section 2 to demonstrate the computational efficiency and solution quality of our models and algorithms. In Section 7 we summarize our work and propose future research directions.

Formulations for Quantum Control
Consider a quantum system with q qubits and a time horizon [0, t_f], where t_f > 0 is the evolution time. Let H^{(0)} ∈ C^{2^q×2^q} be the intrinsic Hamiltonian. Let N be the number of control operators and H^{(j)} ∈ C^{2^q×2^q}, j = 1, ..., N, be the given control Hamiltonians. Let X_init, X_targ ∈ C^{2^q×2^q} be the initial and target unitary operators of the quantum system, respectively. Define variables u_j(t) ∈ {0, 1}, j = 1, ..., N, ∀t, as the binary control functions at time t, and use u to represent the corresponding vector form. Define variables H(t) ∈ C^{2^q×2^q} as the time-dependent control Hamiltonian function and X(t) ∈ C^{2^q×2^q} as the time-dependent unitary quantum operator function. A generic quantum control problem for optimizing a function of quantum operators is formulated as

    min_{u, H, X}  F(X(t_f))                                                (1a)
    s.t.  H(t) = H^{(0)} + Σ_{j=1}^N u_j(t) H^{(j)},  ∀t ∈ [0, t_f]          (1b)
          dX(t)/dt = −i H(t) X(t),  ∀t ∈ [0, t_f]                            (1c)
          X(0) = X_init                                                      (1d)
          Σ_{j=1}^N u_j(t) = 1,  ∀t ∈ [0, t_f]                               (1e)
          u_j(t) ∈ {0, 1},  j = 1, ..., N,  ∀t ∈ [0, t_f].                   (1f)

The objective function (1a) is a general function of the final unitary operator X(t_f). We assume that F is differentiable. In Sections 2.1-2.4 we provide the specific formulations of the objective function for four quantum control examples. Constraint (1b) describes the computation of the time-dependent Hamiltonian function based on the intrinsic Hamiltonian, control Hamiltonians, and control functions. Constraint (1c) is the ordinary differential equation describing the time evolution of the operator of the quantum system, with the same units for energy and frequency, setting ℏ = 1. Constraint (1d) is the initial condition of the operator. Constraint (1e) requires that the control functions sum to one at all times, which is the Special Ordered Set of Type 1 (SOS1) property in optimal control theory [47]. In binary control, this SOS1 property guarantees that exactly one control field is active at any time, which mimics the bang-bang nature of quantum applications such as the quantum approximate optimization algorithm and the variational quantum eigensolver (VQE).
Constraints (1f) require all time-based values of the control functions to be binary. Following the time discretization process in [9], we explicitly integrate constraint (1c). In particular, we divide the time horizon [0, t_f] into T time intervals [t_{k−1}, t_k), k = 1, ..., T, with t_0 = 0 and t_T = t_f. We consider time intervals with an equal length, represented by ∆t, but our work readily extends to nonuniform discretizations. For each controller j = 1, ..., N and each time step k = 1, ..., T, we define discretized binary control variables u_{jk} ∈ {0, 1}. For each time step k = 1, ..., T, we define the discretized time-dependent Hamiltonians H_k ∈ C^{2^q×2^q} and the quantum operators X_k ∈ C^{2^q×2^q}. The differential equation (1c) is thus approximated on each interval by

    dX(t)/dt = −i H_k X(t),  t ∈ [t_{k−1}, t_k),  k = 1, ..., T.             (2)

For each k = 1, ..., T, the linear differential equation (2) is a Schrödinger equation of a unitary operator; because H_k is time independent, it admits the explicit solution

    X_k = exp(−i H_k ∆t) X_{k−1}.

We then obtain a discretized quantum control problem (DQCP) as a discretization of the differential control model (1), using these explicit solutions on each interval [t_{k−1}, t_k) for the unitary operators:

    min   F(X_T)                                                             (3a)
    s.t.  H_k = H^{(0)} + Σ_{j=1}^N u_{jk} H^{(j)},  k = 1, ..., T           (3b)
          X_k = exp(−i H_k ∆t) X_{k−1},  k = 1, ..., T                       (3c)
          X_0 = X_init                                                       (3d)
          Σ_{j=1}^N u_{jk} = 1,  k = 1, ..., T                               (3e)
          u_{jk} ∈ {0, 1},  j = 1, ..., N,  k = 1, ..., T.                   (3f)

The objective function (3a) is the objective function (1a) evaluated at the approximated final operator X_T ≈ X(t_f). Constraints (3b)-(3f) are the discretized formulations of constraints (1b)-(1f). In the following sections we use this discretized formulation to develop our algorithms. In our paper we consider four examples with different objective functions and control Hamiltonians. The next four subsections introduce the continuous models of these example problems; each can be formulated as a discretized model (DQCP) following the above discretization process.
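As an illustration of the discretized dynamics (3c), each time step applies the matrix exponential of the current Hamiltonian to the accumulated operator. The following sketch (NumPy only; the single-qubit instance in the usage note is hypothetical, not one of our benchmark problems) propagates a control sequence through X_k = exp(−iH_k∆t)X_{k−1}:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)  # Pauli X

def expm_neg_i(H, dt):
    """exp(-i * H * dt) for a Hermitian matrix H, via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * dt)) @ V.conj().T

def evolve(u, H0, H_ctrls, X_init, dt):
    """Propagate X_k = exp(-i H_k dt) X_{k-1}, with H_k = H0 + sum_j u[j,k] H_ctrls[j].

    u has shape (N, T); returns the final operator X_T.
    """
    X = X_init.copy()
    for k in range(u.shape[1]):
        Hk = H0 + sum(u[j, k] * H_ctrls[j] for j in range(len(H_ctrls)))
        X = expm_neg_i(Hk, dt) @ X
    return X
```

For example, a single σ_x control held at 1 over total time T∆t = π/2 yields X_T = exp(−i(π/2)σ_x) = −iσ_x, a NOT gate up to a global phase.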

Energy Minimization Problem
Consider a quantum spin system consisting of q qubits, no intrinsic Hamiltonian, and two control Hamiltonians H^{(1)}, H^{(2)}. The initial operator X_init is the 2^q-dimensional identity matrix. Let |ψ_0⟩ be the ground state of the first control Hamiltonian H^{(1)}. The ground state energy of this quantum system is the minimum eigenvalue of the control Hamiltonian H^{(2)}, represented by E_min in our model. Since the ground state energy differs across problem instances, we compare performance consistently across instances by maximizing the approximation ratio, defined as the achieved energy divided by the true ground state energy. The problems we study are constructed so that the initial state has zero energy with respect to H^{(2)}; furthermore, the ground state energy is always a large negative number. The approximation ratio therefore measures how far our procedure moves the energy from its initial value of zero toward the true ground state energy. Because our procedure can only improve the state relative to its initial configuration, a negative approximation ratio is not possible.
To ensure consistency with the objective functions of our other examples, we minimize one minus the approximation ratio. Because the true ground state energy is negative, this is equivalent to minimizing the achieved energy. The continuous-time formulation is

    min   1 − ⟨ψ_0| X(t_f)^† H^{(2)} X(t_f) |ψ_0⟩ / E_min
    s.t.  constraints (1b)-(1f) with H^{(0)} = 0 and
          H^{(1)} = Σ_{i=1}^q σ^x_i,    H^{(2)} = Σ_{i,j=1}^q J_{ij} σ^z_i σ^z_j,

where ·^† represents the conjugate transpose of a matrix and ⟨·| represents the conjugate transpose of a quantum state vector |·⟩. The matrix [J_{ij}], i, j = 1, ..., q, is the adjacency matrix of a randomly generated graph with q nodes, and σ^x_i, σ^z_i are Pauli matrices acting on qubit i.
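A minimal numeric sketch of this objective, assuming the transverse-field/Ising Hamiltonian pair above on a hypothetical two-qubit instance with a single edge J_12 = 1 (not one of our benchmark instances): the ground state of H^{(1)} has zero energy with respect to H^{(2)}, so the initial objective value is one.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Hypothetical two-qubit instance, single edge with J_12 = 1.
H1 = np.kron(sx, I2) + np.kron(I2, sx)   # H^(1) = sum_i sigma_x^i
H2 = np.kron(sz, sz)                     # H^(2) = J_12 sigma_z^1 sigma_z^2

# |psi_0> is the ground state of H^(1); E_min is the ground energy of H^(2).
w1, V1 = np.linalg.eigh(H1)
psi0 = V1[:, 0]
E_min = np.linalg.eigh(H2)[0][0]

def objective(X_tf):
    """One minus the approximation ratio: 1 - <psi|H2|psi>/E_min, |psi> = X(t_f)|psi_0>."""
    psi = X_tf @ psi0
    return 1 - (psi.conj() @ H2 @ psi).real / E_min
```

Here E_min = −1 and the initial energy ⟨ψ_0|H^{(2)}|ψ_0⟩ = 0, so objective(I) = 1, consistent with the discussion above.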

CNOT Gate Estimation Problem
Our second example is defined on an isotropic Heisenberg spin-1/2 chain of length 2, following [48]. For this study we consider only unitary evolution with no energy leakage. The quantum system includes 2 qubits, an intrinsic Hamiltonian H^{(0)}, and two control Hamiltonians H^{(1)}, H^{(2)}. The initial operator X_init is the 4-dimensional identity matrix, and the target operator is the matrix form of the CNOT gate, X_CNOT ∈ C^{4×4}. The goal is to minimize the infidelity between the final operator and the target operator:

    min   1 − (1/4) |tr(X_CNOT^† X(t_f))|
    s.t.  constraints (1b)-(1f),

where the Hamiltonians are built from the Pauli matrices σ^x_i, σ^y_i, σ^z_i, i = 1, 2, of the two qubits, with the intrinsic Hamiltonian H^{(0)} given by the isotropic Heisenberg exchange coupling between them.

NOT Gate Estimation with Energy Leakage
We consider an anharmonic multilevel system that is widely used in quantum experiments to implement physical qubits. Following [49], we consider the lowest three levels of an energy spectrum with only nearest-level coupling for a single qubit q = 1. The first two levels comprise the qubit, and the third level models the energy leakage. Under this setting, the system contains an intrinsic Hamiltonian H^{(0)} and two control Hamiltonians H^{(1)}, H^{(2)}. The initial operator X_init is the 3-dimensional identity matrix, and the target operator is the matrix form of the NOT gate on the lowest three levels of the energy spectrum, X_NOT ∈ C^{3×3}, which is

    X_NOT = [ 0  1  0
              1  0  0
              0  0  1 ],

i.e., a NOT gate on the two qubit levels that leaves the leakage level unchanged. The continuous-time formulation minimizes the infidelity 1 − (1/3)|tr(X_NOT^† X(t_f))| subject to constraints of the form (1b)-(1f), where |j⟩ represents the basis vector with the j-th element equal to 1 and all other elements equal to 0. The parameters µ_1, µ_2 weigh the relative strengths of the level transitions, and the parameters ω_1, ω_2 correspond to the drive frequencies. We present the specific parameter values used in the numerical simulations in Section 6.2.

Circuit Compilation Problem
For a general quantum algorithm, each quantum circuit needs to be compiled to a representation imposed by specific controllers and constraints in order to execute the algorithm on specific quantum devices. As examples, we take quantum circuits generated by the unitary coupled-cluster single-double (UCCSD) method [50,51] for estimating the ground state energy of molecules in quantum chemistry. We consider the Hamiltonian controllers in the gmon-qubit system [52] with q qubits. Each qubit has a flux-drive controller and a charge-drive controller. In addition, there is a rectangular-grid topology with nearest-neighbor connectivity denoted by E, and each pair of connected qubits (j_1, j_2) ∈ E has a controller. Define w_{j_1 j_2}(t) as the control functions of the connected-qubit controllers, and use w(t) to represent the corresponding vector form. The initial operator X_init is the 2^q-dimensional identity matrix. The target operator is the matrix form of the quantum circuit generated by UCCSD for a specific molecule, X_UCCSD ∈ C^{2^q×2^q}. We refer interested readers to Gokhale et al. [53] for more details. The continuous-time formulation minimizes the infidelity 1 − (1/2^q)|tr(X_UCCSD^† X(t_f))| subject to constraints of the form (1b)-(1f), where J_c, J_f, J_e are parameters of the quantum system and σ^x_j is the Pauli matrix of qubit j. We finish this section by showing a general property of the objective function in optimal control problems.

Theorem 1. For a quantum system with q qubits, the infidelity satisfies 1 − (1/2^q)|tr(X_targ^† X(t_f))| ∈ [0, 1]. In particular, the objective functions of the optimal control models for the CNOT gate (6) and the circuit compilation problem (9) are bounded between zero and one.
Proof. By definition, the objective function value cannot be larger than 1. From the definition of our quantum control problem, the operators X_targ and X(t_f) are both unitary. Let x_i denote the i-th column of X(t_f) and e_i the i-th standard basis vector. Then

    |tr(X_targ^† X(t_f))| = |Σ_{i=1}^{2^q} e_i^† X_targ^† x_i| ≤ Σ_{i=1}^{2^q} ‖X_targ^† x_i‖_∞ ≤ 2^q.

The first inequality follows from the triangle inequality and |e_i^† v| ≤ ‖v‖_∞; the second follows from ‖X_targ^† x_i‖_∞ ≤ ‖X_targ^† x_i‖_2 = 1, because X_targ^† x_i is a unit vector. As a result, the objective function is no less than 0, and this completes the proof.

Model (1) belongs to a well-studied class [54] that consists of control problems on Lie groups. Specifically, this class consists of all control problems described by a Hamiltonian of the form H(t) = H^{(0)} + Σ_j u_j(t)H^{(j)}. Controllability results for this class of control problems are well known [54,55]. A test of controllability consists of examining the dynamical Lie algebra generated by the Hamiltonians H^{(j)} and their nested commutators and comparing that Lie algebra with the Lie algebra of the subgroup defined by the symmetries of a given problem. For the systems considered in this paper, there are indeed enough control Hamiltonians to reach the target state by using piecewise continuous control functions u_j(t).
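Theorem 1 can be sanity-checked numerically: for any pair of unitaries, the infidelity lands in [0, 1], and it vanishes when the operators coincide. A small sketch (random unitaries drawn via a QR decomposition; this is an illustration, not part of our experiments):

```python
import numpy as np

def infidelity(X_targ, X_tf):
    """Infidelity 1 - (1/d)|tr(X_targ^† X_tf)| for d x d unitaries, d = 2^q."""
    d = X_targ.shape[0]
    return 1.0 - abs(np.trace(X_targ.conj().T @ X_tf)) / d

def random_unitary(d, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    Q, R = np.linalg.qr(A)
    # Fix column phases so the distribution is well behaved.
    return Q * (np.diagonal(R) / np.abs(np.diagonal(R)))
```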

Binary Quantum Control Algorithms
In this section we propose models and algorithms to obtain binary quantum control results for the discretized model (DQCP). In Section 3.1 we review the adjoint approximation method of gradients used in the GRAPE algorithm for solving the continuous relaxation of the optimal control problem (DQCP) and derive the gradient update rule for the aforementioned four specific examples. In Section 3.2 we propose an exact penalty function for the SOS1 property that allows us to add constraints to GRAPE. In Section 3.3 we introduce rounding techniques to obtain binary controls. These algorithms form the basis of our discrete framework discussed in Sections 4-5.
The overall framework is outlined in Figure 1, which presents the methods in Sections 3-5 and the corresponding tables and results. We summarize the algorithms of our framework in Table 1. In the figure, the rectangles represent the three formulations of our model, and the ellipses represent the algorithms. The parallelograms show the result tables corresponding to algorithms, where the dashed parallelograms refer to the results of the continuous models and the solid parallelograms refer to the results of the discrete models.

GRAPE Approach for Solving Continuous Relaxations
The GRAPE algorithm [9] is a gradient descent algorithm for solving the unconstrained continuous control problem. We derive a gradient descent algorithm based on the adjoint approach used in the GRAPE algorithm to solve the continuous relaxation of our discretized model (DQCP) without the SOS1 property enforced by constraint (3e). We eliminate variables X and H by converting them into implicit functions of the control variables u using constraints (3b)-(3d). Then the minimization problem of F(X_T) over u, X, H is converted to the minimization problem of F(X_T(u)) over the variables u. The goal of the adjoint approach is to approximate the gradient of the objective function F with respect to the control variables u_{jk}, j = 1, ..., N, k = 1, ..., T, based on the discretized time horizon. For simplicity, we define propagators for each time step k = 1, ..., T as U_k = exp{−iH_k∆t}. For small ∆t, the gradient of U_k with respect to u_{jk} is estimated as

    ∂U_k/∂u_{jk} ≈ −i∆t H^{(j)} U_k.

The gradient of the objective function with respect to the propagators depends on its specific formulation. We discuss the gradients of two specific functions for the examples in Section 2.
Energy function. For generality, we use H̄ to represent the Hamiltonian in the energy objective function, meaning that the goal is to minimize the energy corresponding to H̄. Specifically, we have H̄ = H^{(2)} in the example of Section 2.1. The general energy function can be expressed as

    F(X_T) = ⟨ψ_0| X_T^† H̄ X_T |ψ_0⟩.

For each time step k = 1, ..., T − 1, define variables |κ_k⟩ = U_{k+1}^† · · · U_T^† H̄ X_T |ψ_0⟩. For the last time step T, define |κ_T⟩ = H̄ X_T |ψ_0⟩. Then the gradient with respect to u_{jk} is computed as

    ∂F/∂u_{jk} ≈ 2∆t · Im(⟨κ_k| H^{(j)} X_k |ψ_0⟩).

Infidelity function. The infidelity function can be expressed as

    F(X_T) = 1 − (1/2^q) |tr(X_targ^† X_T)|.

For each time step k = 1, ..., T − 1, define variables λ_k = U_{k+1}^† · · · U_T^† X_targ. For the last time step T, define λ_T = X_targ. Then the gradient of the trace with respect to u_{jk} is computed as

    ∂ tr(X_targ^† X_T)/∂u_{jk} ≈ −i∆t · tr(λ_k^† H^{(j)} X_k).

Using the definition of the objective function F, we compute the gradient as

    ∂F/∂u_{jk} ≈ −(∆t/2^q) · Im( e^{−i·arg(tr(X_targ^† X_T))} tr(λ_k^† H^{(j)} X_k) ),

where arg(·) represents the argument of a complex number. With these computed gradients, we minimize the reduced function F(X_T(u)) with respect to the control variables u ∈ [0, 1]^{N·T} by L-BFGS-B [56], a well-known quasi-Newton algorithm for solving bound-constrained optimization problems.
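The adjoint recursion for the infidelity gradient can be sketched as follows. The instance is a hypothetical single-qubit toy problem (one σ_x control, no drift, NOT-gate target); in that special case all H_k commute, so the first-order estimate ∂U_k/∂u_{jk} ≈ −i∆t H^{(j)} U_k is exact and the gradient can be validated against finite differences.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)

def expm_neg_i(H, dt):
    """exp(-i H dt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * dt)) @ V.conj().T

def infidelity_and_grad(u, H_ctrl, X_targ, dt):
    """F = 1 - (1/d)|tr(X_targ^† X_T)| and its adjoint gradient (single controller)."""
    T, d = len(u), X_targ.shape[0]
    U = [expm_neg_i(u[k] * H_ctrl, dt) for k in range(T)]   # propagators U_k
    X = [np.eye(d, dtype=complex)]                          # X[k] = U_k ... U_1
    for k in range(T):
        X.append(U[k] @ X[-1])
    z = np.trace(X_targ.conj().T @ X[T])
    F = 1 - abs(z) / d
    # Backward pass: lambda_T = X_targ, lambda_k = U_{k+1}^† lambda_{k+1}.
    lam = [None] * (T + 1)
    lam[T] = X_targ
    for k in range(T - 1, 0, -1):
        lam[k] = U[k].conj().T @ lam[k + 1]
    phase = np.exp(-1j * np.angle(z))
    grad = np.empty(T)
    for k in range(1, T + 1):
        tr = np.trace(lam[k].conj().T @ H_ctrl @ X[k])
        grad[k - 1] = -(dt / d) * np.imag(phase * tr)       # per the formula above
    return F, grad
```

With a non-commuting drift the estimate carries an O(∆t) error, which is the usual GRAPE trade-off for small ∆t.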

Remark 4.
Our implementation of GRAPE extends the classical gradient-ascent scheme (see, e.g., [9,33]) in two ways: we use a bound-constrained quasi-Newton method to solve problems with bounded control variables, and we introduce a penalized GRAPE (denoted pGRAPE) that adds a quadratic penalty for the SOS1 property in Section 3.2.
We take the circuit compilation problem (Section 2.4) on the molecule H_2 (dihydrogen) as an example to show the control results, and we present the continuous control results without the SOS1 property obtained by the GRAPE algorithm in Figure 2a. We define the absolute violation of the SOS1 property at each time step k = 1, ..., T as |Σ_{j=1}^N u_{jk} − 1| and present the value at each time step in Figure 2b. The figure shows that the results violate the SOS1 property at all time steps, with the maximum absolute violation being 2.404. Hence we introduce a penalty function in Section 3.2 to impose the SOS1 property.

Penalty Function for SOS1 Property
Our GRAPE algorithm is defined only for unconstrained or bound-constrained problems [9]. Hence, we convert the SOS1 property (3e) into a squared L_2-penalty term and add it to the objective function. Define the squared L_2 function of the constraint violation as

    l(u, T) = Σ_{k=1}^T ( Σ_{j=1}^N u_{jk} − 1 )^2.

We let a constant ρ > 0 be the penalty parameter. Then the model with the penalty function, denoted (DQCP-L_2), is

    min   F(X_T) + ρ · l(u, T)
    s.t.  constraints (3b)-(3d), (3f).

We use l(u*_ρ, T) to denote the optimized value of the squared L_2 function l(u, T) under the penalty parameter ρ. The following theorem discusses the exactness of the penalized objective function.
A more general version of Theorem 2 is proved in [57]. Next we discuss the value of the optimized squared L 2 -penalty term with respect to the penalty parameter ρ for the continuous relaxation of (DQCP-L 2 ).
Remark 5. The GRAPE algorithm is still able to solve the unconstrained continuous relaxation with the penalty function (DQCP-L_2). The penalized version (pGRAPE) introduced in Section 3.1 adds the term 2ρ(Σ_{j=1}^N u_{jk} − 1) to the approximated gradient of the objective function with respect to u_{jk}.
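The penalty value and the extra gradient term from Remark 5 are straightforward to compute; a minimal sketch (names are ours, not from an established library):

```python
import numpy as np

def sos1_penalty(u):
    """Squared L2 penalty l(u, T) = sum_k (sum_j u[j,k] - 1)^2, for u of shape (N, T)."""
    r = u.sum(axis=0) - 1.0
    return float(r @ r)

def sos1_penalty_grad(u, rho):
    """Gradient of rho * l(u, T): entry (j, k) is 2*rho*(sum_j' u[j',k] - 1)."""
    r = u.sum(axis=0) - 1.0
    return 2.0 * rho * np.broadcast_to(r, u.shape).copy()
```

Note that the gradient entry is identical for all controllers j at a fixed time step k, since each violates the same column-sum constraint.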

Remark 6.
For a quantum control problem with two controllers, the SOS1 property can be enforced directly by substituting u_{2k} = 1 − u_{1k} in the model instead of using the penalty function.
We show the control results obtained from the continuous relaxation with the squared L_2-penalty function and penalty parameter ρ = 1.0 in Figure 3. The value of the squared penalty term is 5.55 × 10^{−7}, leading to a small error between continuous results and binary results with the SOS1 property obtained by rounding techniques. However, we still need to provide rounding techniques to obtain binary results.

Rounding Techniques and Optimality Guarantees
In this section we introduce rounding techniques to obtain binary control results and investigate their optimality guarantees. In Section 3.3.1 we review the sum-up rounding technique designed for binary controls without constraints and discuss the difference compared with the continuous results. In Section 3.3.2 we introduce the integral minimization problem for rounding to obtain restricted binary controls.

Sum-Up Rounding
The sum-up rounding (SUR) strategy proposed by Sager et al. [47] is a well-known method to obtain integer controls from continuous, relaxed ones in optimal control theory; see, for example, [58,59]. One concern when applying SUR is that the solution of the relaxation may not satisfy the SOS1 constraints, because the exactness of the squared L_2-penalty holds only for binary controls. We show that as long as the violation of the SOS1 constraint is small, the solution constructed via SUR still satisfies the strong convergence properties of SUR. We define the discretized continuous control function u^c by u^c_{jk} ∈ [0, 1], j = 1, ..., N, k = 1, ..., T, and the discretized binary control function u^b by u^b_{jk} ∈ {0, 1}, j = 1, ..., N, k = 1, ..., T. For each time step k = 1, ..., T, we denote the vector forms of the control variables by u^c_k and u^b_k. We define the current cumulative deviation between continuous and binary controls as p̂_{jk}, j = 1, ..., N, k = 1, ..., T. The SUR approach to obtain binary controls is described in Algorithm 1.
The construction of u^b_{jk} ensures that the binary control satisfies the SOS1 property. Sager et al. [47] proved that if the continuous control u^c satisfies the SOS1 property, then the rounded control u^b converges to u^c when the length of a time interval ∆t converges to zero. This does not hold for arbitrary continuous controls without the SOS1 property, however, as the next proposition shows.
Algorithm 1 (Sum-up rounding). Input: continuous control u^c, interval length ∆t. For each time step k = 1, ..., T: (i) update the cumulative deviation p̂_{jk} = Σ_{k′=1}^k u^c_{jk′}∆t − Σ_{k′=1}^{k−1} u^b_{jk′}∆t for each controller j = 1, ..., N; (ii) choose the controller j* = arg max_{j=1,...,N} p̂_{jk}; if there is a tie, break the tie by choosing the smallest index; (iii) set the binary control u^b_{j*k} = 1 and u^b_{jk} = 0, ∀j ≠ j*. Output: binary control u^b.
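The steps of Algorithm 1 can be sketched compactly (NumPy; `np.argmax` returns the first maximizer, which matches the smallest-index tie-break):

```python
import numpy as np

def sum_up_rounding(uc, dt=1.0):
    """SUR: at each step, activate the controller with the largest cumulative deficit.

    uc has shape (N, T) with entries in [0, 1]; returns a binary (N, T) array
    satisfying the SOS1 property (exactly one active control per time step).
    """
    N, T = uc.shape
    ub = np.zeros((N, T))
    deficit = np.zeros(N)          # p_hat_jk, updated incrementally
    for k in range(T):
        deficit += uc[:, k] * dt   # add continuous mass of step k
        j_star = int(np.argmax(deficit))  # ties broken at smallest index
        ub[j_star, k] = 1.0
        deficit[j_star] -= dt      # subtract the binary mass just assigned
    return ub
```

For example, with uc = [[0.6, 0.6, 0.6], [0.4, 0.4, 0.4]] the rounded control alternates as [[1, 0, 1], [0, 1, 0]], tracking the 60/40 split in integral.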

Proposition 1. Let continuous control u c and binary control
The proof is provided in Appendix A.2. We have the following estimate of the difference between the continuous control without the SOS1 property and the rounded control.

Theorem 4.
With ε(∆t) defined as in Proposition 1, inequality (21) bounds the accumulated difference between the continuous control and the rounded control for every controller and every time step. The proof of inequality (21) extends the proof of Theorem 5 in [47]: we modify the right-hand side of the inequality by adding a term ((2N − 1)/N) · ε(∆t). We provide the detailed proof in Appendix A.2. In the following corollary we show that for our bounded discretized quantum control problem, ε(∆t) converges to zero as ∆t converges to zero, ensuring the convergence of the rounded control to the continuous control.
where t f is the evolution time. Furthermore, if the original objective function F is bounded, the rounded control will converge to the continuous control when ∆t is small enough.
The proof is provided in Appendix A.2. Based on the convergence of binary results to continuous results, we present the following proposition to guarantee the optimality of binary results.

Proposition 2.
Under the discretized setting of the quantum control problem, let u^c be the continuous control and u^b the binary control obtained by Algorithm 1. Then the operator obtained by time evolution under the binary control, X^b_k, converges to the operator under the continuous control, X^c_k, at each time step k = 1, ..., T, as ∆t → 0. The proof is provided in Appendix A.2.

Combinatorial Integral Approximation
To obtain discretized binary controls, Sager [60] proposed a more general rounding technique, called combinatorial integral approximation (CIA), that minimizes the integral difference between continuous and binary controls subject to additional constraints on the binary controls. Let U_B ⊆ {0, 1}^{N·T} be the feasible region for binary controls. For each controller j = 1, ..., N and each time step k = 1, ..., T, define u^c_{jk} and u^b_{jk} as the discretized continuous and binary controls, with corresponding vector forms u^c and u^b. The rounding problem is formulated as the mixed-integer program

    min_{u^b ∈ U_B}  max_{j=1,...,N; k=1,...,T}  | Σ_{k′=1}^k (u^c_{jk′} − u^b_{jk′}) ∆t |.    (24)

The rounding problem (24) can be solved by diverse algorithms for integer programs; in this paper we choose a branch-and-cut algorithm [61]. Furthermore, the rounding result obtained from SUR is an optimal solution of model (24) with U_B = {0, 1}^{N·T} [60]. In practice, researchers and engineers have proposed diverse restrictions on binary controls to avoid frequent switches. We formulate the two main types of restrictions as linear constraints.
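Evaluating the CIA objective in (24) for a candidate rounding is a one-line cumulative sum; a small sketch (illustrative helper, not a solver for (24) itself):

```python
import numpy as np

def cia_objective(uc, ub, dt=1.0):
    """max over (j, k) of | sum_{k'<=k} (uc[j,k'] - ub[j,k']) * dt |, as in model (24)."""
    dev = np.cumsum((uc - ub) * dt, axis=1)   # running integral mismatch per controller
    return float(np.max(np.abs(dev)))
```

This is useful for comparing roundings: SUR minimizes exactly this quantity when U_B carries no extra constraints.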

Min-Up-Time Constraints.
Let T_minup be the minimum number of time steps ∆t for which a controller must remain active once switched on. The min-up-time constraints enforce that each controller, once activated, stays active for at least T_minup time steps. The restricted feasible region is described by the linear constraints

    u^b_{jk} − u^b_{j,k−1} ≤ u^b_{jk′},  ∀k′ = k, ..., min{k + T_minup − 1, T},  ∀j = 1, ..., N,  ∀k = 2, ..., T.

Max-Switching Constraints. Let S be the maximum number of switches allowed during the evolution time horizon [0, t_f]. The max-switching constraints enforce an upper bound S on the total number of switches for each controller. With auxiliary variables s_{jk} ≥ 0 modeling |u^b_{jk} − u^b_{j,k−1}|, the restricted feasible region is described by the linear constraints

    s_{jk} ≥ u^b_{jk} − u^b_{j,k−1},    s_{jk} ≥ u^b_{j,k−1} − u^b_{jk},    Σ_{k=2}^T s_{jk} ≤ S,  ∀j = 1, ..., N.

We present the binary results obtained from SUR and from CIA with min-up-time constraints in Figure 4. SUR leads to chattering on singular arcs. Although CIA can prevent frequent switches by imposing hard constraints on the controls, it leads to a serious increase in the objective value. In Section 4 and Section 5 we propose models and algorithms that reduce switches while taking the original objective function into account.
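The two restrictions are easy to verify on a given binary schedule; the following helpers (our own illustrative names) count switches and check the min-up-time property for one controller's schedule:

```python
import numpy as np

def count_switches(ub_j):
    """Number of value changes along one controller's binary schedule."""
    return int(np.sum(ub_j[1:] != ub_j[:-1]))

def satisfies_min_up_time(ub_j, T_minup):
    """True if every maximal run of consecutive 1's has length >= T_minup."""
    run = 0
    for x in list(ub_j) + [0]:      # trailing 0 acts as a sentinel closing the last run
        if x == 1:
            run += 1
        elif run > 0:
            if run < T_minup:
                return False
            run = 0
    return True
```

A max-switching constraint with bound S then amounts to requiring count_switches(ub_j) ≤ S for each controller j.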

Model with the TV Regularizer
Binary controls that avoid chattering on singular arcs can be obtained by solving the rounding model (24). However, the objective function of the rounding model is not related to the original objective function F, leading to a significant increase in the value of F, as illustrated in Figure 4. Here we investigate a TV regularizer as an alternative strategy to obtain solutions with fewer switches. The TV of a function is defined as the integral of the absolute change of the function over the entire space. Since Rudin et al. [62] first introduced the TV regularizer for functions describing images, it has become a popular method in image noise reduction; see, for example, [63,64]. We refer interested readers to [65] for a detailed review of applying a TV regularizer to denoise images. Stella et al. [66] proposed an algorithm to solve the continuous control problem with the TV regularizer based on the forward-backward envelope [67]. Sager and Zeile [68] considered the TV of integer control functions to reduce the absolute change of controls. The authors added a constraint imposing that the TV regularizer should be no more than a threshold. However, they considered the TV regularizer only when rounding the control results of the continuous relaxation, which leads to an increase in the original objective function. Leyffer and Manns [69] extended the trust-region method to solve the integer optimal control problem with the TV regularizer in the objective function, but that approach is still highly dependent on the initial values.
We use the TV regularizer to penalize the absolute change between control variables at two consecutive time steps for a control variable vector u. Let α > 0 be the weight of the TV regularizer. The continuous relaxation model with the TV regularizer then has three terms: the first is the original objective function, the second is the squared L2-penalty function for the SOS1 property, and the third is the TV regularizer. Because of the L1 term, we cannot use GRAPE; instead, we propose ADMM to solve the continuous relaxation. For each controller j = 1, . . . , N and time step k = 1, . . . , T − 1, we introduce auxiliary variables v_jk = u_jk − u_{j,k+1} to describe the change between control variables at two consecutive time steps and reformulate the model accordingly. Define μ_jk, j = 1, . . . , N, k = 1, . . . , T − 1, as the dual variables corresponding to constraints (28b), and let u, v, μ denote the corresponding vector forms. Let the fixed parameters β > 0 and δ > 0 be the Lagrangian penalty parameter and the stopping-criterion threshold, respectively. Each ADMM iteration consists of three steps: (i) updating the variables u, (ii) updating the variables v, and (iii) updating the dual variables μ. We solve the minimization problem for updating u by the modified GRAPE algorithm with a squared L2-penalty function and a Lagrangian penalty function; we derive a closed form for the update of v; and we update the dual variables μ by a gradient step. The procedure is presented in Algorithm 2. We present the continuous control results obtained from the model (DQCP-TV) in Figure 5 for the circuit compilation example on the molecule H2. We test the TV regularizer parameter α = 10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1} and the Lagrangian parameter β = 0.1, 0.5, 1.0, and we choose α = 10^{-3} and β = 0.5, which yields the smallest objective value after rounding with min-up-time constraints.
The number of switches in control results decreases significantly as compared with the results in Figure 3, showing the benefits of the TV regularizer.
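Under the standard ADMM splitting for an L1 term (the sign conventions below are our assumption, not taken from the paper), the closed-form v-update is a soft-thresholding step and the dual update is a scaled residual step; a minimal sketch:

```python
import numpy as np

def soft_threshold(x, tau):
    # shrink toward zero: sign(x) * max(|x| - tau, 0)
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def admm_v_update(u, mu, alpha, beta):
    # v_{jk} minimizes alpha*|v| + mu*(v - d) + (beta/2)*(v - d)^2
    # with d_{jk} = u_{jk} - u_{j,k+1}; solution is soft-thresholding
    d = u[:, :-1] - u[:, 1:]
    return soft_threshold(d - mu / beta, alpha / beta)

def admm_mu_update(u, v, mu, beta):
    # dual ascent on the consensus residual v - d
    d = u[:, :-1] - u[:, 1:]
    return mu + beta * (v - d)
```

With α = 0 the thresholding vanishes and the v-update simply copies the consecutive differences, which is a useful sanity check.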

Algorithm 2: ADMM Algorithm for Solving Continuous Relaxation of (DQCP-TV).
Input: Initial values of the variables u^0, v^0, μ^0, and the number of ADMM iterations L. Initialize the iteration counter l = 1. While l ≤ L and the algorithm has not converged, update the variables u, the variables v, and the dual variables μ in turn.

Improvement Heuristic: Approximate Local-Branching Method
Because the ADMM algorithm does not guarantee global optimality for nonlinear objective functions, and because the objective value increases after rounding, the control results can be further improved by an approximate local-branching (ALB) heuristic. The trust-region method [32] is a widely used local search method in optimization but depends strongly on the initial values. In this section we formulate a trust-region subproblem for the quantum control problem and modify the trust-region method to solve it, starting from the solutions obtained in Sections 3-4, in order to improve the quantum controls.
We introduce our improvement heuristic, based on the approach of [69], to solve the binary model with the TV regularizer. Given a feasible point û, we use the first-order gradient to approximate the objective value around û and define the trust-region subproblem with radius R. We note that (29a) approximates the objective value F(u) but uses the exact form of the TV regularizer. Constraint (29b) restricts attention to points whose L1 distance to û is no more than R. For each controller j = 1, . . . , N and time step k = 1, . . . , T − 1, we define variables v_jk as upper bounds on the absolute change between the control values at two consecutive time steps, and we reformulate the trust-region subproblem as a mixed-integer linear program. The objective function (30a) is a reformulation of (29a) in terms of the variables v. Constraint (30b) is the trust-region constraint. Constraints (30c) ensure that for each controller j = 1, . . . , N, v_jk ≥ |u_jk − u_{j,k+1}|, k = 1, . . . , T − 1. Let ū be an optimal solution of model (30). Define ∆F_p(û, ū) and ∆F_a(û, ū) as the predicted and actual decreases of the objective function, respectively. The trust-region algorithm consists of an inner loop and an outer loop. In the inner loop we solve a sequence of trust-region subproblems with monotonically decreasing radius until the ratio between the actual decrease and the predicted decrease is large enough. To balance the computational cost against the size of the searched area, we set a threshold on the radius: when the radius exceeds the threshold, we decrease it geometrically [69]; otherwise, we decrease it arithmetically. In the outer loop we repeat the inner loop from each updated point until the predicted decrease is nonpositive. The procedure is described in Algorithm 3.
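The two-loop scheme just described can be sketched as follows. This is an illustrative skeleton under our own assumptions: `solve_subproblem`, `f`, and `grad_model` are hypothetical callables (the trust-region MILP solver, the true objective, and its first-order model), and the radius schedule and ratio threshold are placeholder values, not the paper's settings:

```python
def trust_region_search(u0, solve_subproblem, f, grad_model,
                        r0=8, r_min=2, ratio_tol=0.25, max_outer=50):
    """Outer loop: restart from each accepted point until the predicted
    decrease is nonpositive. Inner loop: shrink the radius (geometrically
    above r_min, arithmetically below) until the actual/predicted
    decrease ratio is large enough."""
    u = u0
    for _ in range(max_outer):
        r, accepted = r0, False
        while r >= 1 and not accepted:
            u_new = solve_subproblem(u, r)
            pred = grad_model(u) - grad_model(u_new)  # predicted decrease
            if pred <= 0:
                return u                              # outer-loop stopping rule
            actual = f(u) - f(u_new)                  # actual decrease
            if actual / pred >= ratio_tol:
                u, accepted = u_new, True             # accept the step
            elif r > r_min:
                r //= 2                               # geometric decrease
            else:
                r -= 1                                # arithmetic decrease
        if not accepted:
            return u
    return u
```

On a toy one-dimensional problem where the subproblem returns the best point within radius r, the loop converges to the minimizer in a couple of outer iterations.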
Set l ← l + 1 and update the central point u^l ← ū. Output: Control results u^l.
The trust-region approach can also improve solutions obtained by the CIA approaches of Section 3.3.2. Based on the restricted feasible region U_B instead of the TV regularizer, we propose a trust-region subproblem with additional linear constraints. The objective function (32a) is a first-order gradient approximation of the original objective function F(u), and constraint (32b) restricts the feasible region for the controls. The computations of the predicted and actual decreases are modified as in (33). In the trust-region algorithm (Algorithm 3), we replace solving model (30) with solving model (32) and compute the decreases by (33); the other parts remain the same.
In theory, all time-evolution processes in our algorithms can be executed on quantum computers. The inputs to the quantum computers are the control sequences, and we require as output the objective value and its (approximate) gradient. The variable updates are still computed on classical computers.

Numerical Results
We apply our algorithmic framework proposed in Sections 3-5 to diverse instances of the four examples introduced in Section 2. In Section 6.1 we show the results from state-of-the-art optimization solvers as the baseline. In Section 6.2 we introduce the design of the numerical instances and the parameter settings. In Sections 6.3-6.5 we present the numerical results, including the continuous relaxation results, the binary results from combinatorial integral approximation, and the binary results after the improvement heuristic.
As a baseline, we first apply optimization solvers from the NEOS server to the energy minimization example of Section 2.1. All the experiments are conducted on the online server. We set the number of qubits to q = 2 and q = 6, the evolution time to t_f = 2, and the number of time steps to T = 40. Note that for all the evolution times in Section 6 we use dimensionless units, with ℏ = 1. We apply the solver MUSCOD-II (6.0) [82] to the continuous formulation with the differential equations (1), setting the maximum number of iterations to 100 for q = 2 and 700 for q = 6. To apply state-of-the-art nonlinear optimization solvers to a discretized model, we derive a discretized formulation of the ordinary differential equation by the implicit Euler method (see, e.g., [83]). We apply the solvers SNOPT (7.6.1) and IPOPT (3.13.4) to the continuous relaxation model and the solvers MINLP, Bonmin (1.8.8), Couenne (0.5.8), and SCIP (7.0.3) to the binary model. We set the time limit to 5 minutes for q = 2 and 80 minutes for q = 6. In Table 2 we present the objective values, TV-norm values, CPU times, and explored nodes (for binary solvers only). Table 2 shows that MUSCOD-II solves the continuous relaxation well for the instance with q = 2 but takes a long time on the larger instance with q = 6. SNOPT and IPOPT perform worse than MUSCOD-II, especially when the quantum system includes more qubits. For the binary control problem, all the methods reach the time limit. Bonmin obtains the best solution, but the gaps between the solvers' results and the true energy are all large. For the instance with q = 6, some optimization solvers, such as MINLP and Couenne, run out of memory. The CPU time increases and the number of explored nodes decreases significantly as the size of the quantum system increases. Table 2: Results of solvers on energy minimization example.
The results are marked "OOM" if a solver runs out of memory and "LIMIT" if a solver reaches the time or iteration limit. The explored nodes of continuous solvers are marked "-" because node exploration is conducted only by binary solvers. This experiment shows that existing standard nonlinear programming and mixed-integer nonlinear programming solvers cannot efficiently solve discretized optimal control problems of this kind.
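For concreteness, the implicit Euler discretization of a Schrödinger-type equation ẋ = −iH(t)x solves the linear system (I + i∆t H_k) x_{k+1} = x_k at each step. The sketch below is a generic illustration of that rule, not the paper's code:

```python
import numpy as np

def implicit_euler_evolve(x0, hamiltonians, dt):
    """Propagate a state under x' = -i H(t) x with implicit Euler:
    solve (I + 1j*dt*H_k) x_{k+1} = x_k at each time step."""
    x = np.asarray(x0, dtype=complex)
    eye = np.eye(len(x))
    for H in hamiltonians:
        x = np.linalg.solve(eye + 1j * dt * H, x)
    return x
```

Unlike the exact unitary evolution, implicit Euler slightly damps the state norm, so a small ∆t is needed for accuracy; with H = diag(1, −1) and small ∆t the result closely tracks the exact phase e^{−it}.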
In the following sections we present the numerical results of our proposed algorithms on multiple instances. Our pGRAPE+SUR method obtains binary solutions with an objective value of 4.22E−04 in 0.14 seconds for q = 2 and binary solutions with an objective value of 0.157 in 27.95 seconds for q = 6. We show that our methods obtain better results in shorter computational time than do the state-of-the-art solvers.

Experimental Design and Parameter Settings
When a problem has no SOS1 property, the squared L2 term in the penalized model (DQCP-L2) vanishes, so solving (DQCP-L2) by pGRAPE is equivalent to solving the original model (DQCP) by GRAPE. Because the energy minimization problem has only two controllers, we eliminate the SOS1 property directly by substituting u_2(t) = 1 − u_1(t). We remove the SOS1 property in the CNOT gate estimation problem to show the generality of our models. We solve the model with a squared L2-penalty function for the circuit compilation problem.
For all the examples, we apply the pGRAPE algorithm to solve the continuous relaxation. We also employ the TR and ADMM algorithms to solve the continuous relaxation with the TV regularizer. Then we obtain the binary results without constraints by SUR, and the binary results with min-up-time constraints and with max-switching constraints by integral minimization. Furthermore, we apply the approximate local-branching methods to improve the binary solutions with the TV regularizer and hard control constraints.
For the energy minimization problem, when q = 2, the matrix J is 2 × 2 and, being symmetric with zero diagonal, contains only one independent element; hence we set the diagonal of J to 0 and the other elements to 1. When q = 4, 6, we generate 5 instances, each with a random symmetric matrix J with zero diagonal and elements in the range [−1, 1], and we report objective values averaged over these instances. The instances are denoted Energy2, Energy4, and Energy6, corresponding to the number of qubits q = 2, 4, 6, in the following results. For the CNOT problem, we conduct experiments with evolution times t_f = 5, 10, 15, 20 and denote the corresponding instances CNOT5, CNOT10, CNOT15, and CNOT20, respectively. For the NOT gate estimation problem, we follow the parameter settings in [49] and set μ_1 = 0, μ_2 = 2π, ω_1 = 1, ω_2 = √2. We conduct numerical simulations with evolution times t_f = 2, 6, 10 and denote the instances NOT2, NOT6, and NOT10, respectively. For the circuit compilation problem, we test two molecules, H2 (dihydrogen) and LiH (lithium hydride), and generate the UCCSD circuits with minimum energy by the VQE algorithm in the Python package Qiskit [84]. These instances are denoted CircuitH2 and CircuitLiH. We set the quantum-system parameters to J_c = 0.2π, J_f = 3π, J = 0.1π. We test values of the penalty parameter for the squared L2-penalty function ρ = 10^{-6}, 10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10 and choose the one with the smallest objective value among the parameters whose squared penalty term is less than 10^{-6}; this yields ρ = 1.0 and ρ = 0.1 for the two instances, respectively. We test values of the TV parameter α = 10^{-5}, 10^{-4}, 10^{-3}, 10^{-2} and choose the α with the smallest rounding objective value with min-up-time constraints. The settings of α and the other parameters are presented in Table 3.
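The selection rule for ρ described above, smallest objective among candidates whose squared penalty term stays below 10^{-6}, amounts to a simple grid search. In this sketch, `solve` is a hypothetical routine (standing in for a full pGRAPE run) that returns the objective value and the squared-penalty value for a given ρ:

```python
import numpy as np

def select_penalty_parameter(rhos, solve, penalty_tol=1e-6):
    # solve(rho) -> (objective_value, squared_penalty_value); hypothetical
    best_rho, best_obj = None, np.inf
    for rho in rhos:
        obj, pen = solve(rho)
        # keep only near-feasible runs, then take the smallest objective
        if pen < penalty_tol and obj < best_obj:
            best_rho, best_obj = rho, obj
    return best_rho, best_obj
```

The same pattern applies to choosing α by the rounding objective with min-up-time constraints.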
All the numerical simulations were conducted in Python 3.8 on a macOS computer with 8 cores, 16 GB of RAM, and a 3.20 GHz processor. All the reported computational times are times on classical computers.
In Sections 6.3-6.5, for brevity we select four instances, Energy6, CNOT20, NOT10, and CircuitLiH, to analyze the objective values and CPU times. Detailed results for all the methods and instances are presented in Tables 4-9.

Results of Continuous Relaxations
In Figure 6 we show how the common logarithm of the squared L2-norm value varies with the penalty parameter, represented by the lines, and we also present an asymptotic bound. The results show that pGRAPE always obtains the lowest objective value but the highest TV regularizer values because it solves the model without the TV regularizer. ADMM performs best at reducing the TV regularizer values, showing the benefits of introducing the TV regularizer in Section 4. The detailed results are presented in Appendix B.
We measure the problem size of each instance by 2^q × N × T, where q, N, and T represent the number of qubits, the number of controllers, and the number of time steps, respectively. For the four groups of instances, Energy, CNOT, NOT, and Circuit, we present in Figure 7 log-log plots (common logarithm) showing how the CPU time and the number of iterations vary with the problem size. The CPU time increases significantly with the number of qubits because the dimension of the simulated Hamiltonian matrices increases exponentially. The CPU time also increases with the number of controllers and time steps. Compared with pGRAPE, TR and ADMM spend more time solving the model because it contains the TV regularizer. The CPU time of ADMM exceeds that of TR for the Energy instances, while TR spends more time than ADMM on the CNOT, NOT, and Circuit instances.

Results of Rounding Techniques
The results for Sections 6.4 and 6.5 are summarized in Figure 12, which shows the objective values and TV regularizer values of the binary results obtained by SUR and by CIA with min-up-time and max-switching constraints for the selected instances. Compared with pGRAPE+MT/MS, TR+MT/MS and ADMM+MT/MS obtain lower objective values and TV regularizer values for the binary results with min-up-time and max-switching constraints because they solve the model with the TV regularizer.
From the results of pGRAPE we show that the squared L2-penalty function ensures that the difference between the binary controls of SUR and the continuous results is small. To further demonstrate the convergence of pGRAPE+SUR in Section 3.3.1, we present in Figure 8 how the maximum absolute integral error, the left-hand side of equation (21), varies with the number of time steps, represented by the lines. The figures show that the error converges to zero as the number of time steps increases, as claimed in Proposition 2. We also present the theoretical upper bound on the integral error, the right-hand side of (21), represented by the dashed lines. We observe that the integral error is always less than the upper bound, demonstrating the conclusion in Theorem 4. In addition, in Figure 9 we present how ϵ(∆t) and the upper bound t_f l(u^c, T)∆t vary with the number of time steps for the instances CircuitH2 and CircuitLiH. We show that ϵ(∆t) is always smaller than the upper bound, demonstrating conclusion (22) in Corollary 1. Figure 9: Binary logarithm of the maximum SOS1 difference ϵ(∆t) of the continuous control and its upper bound in (22), for (a) instance CircuitH2 and (b) instance CircuitLiH. Blue lines represent ϵ(∆t), and orange dashed lines represent the upper bound; ϵ(∆t) converges to 0 as the number of time steps increases and is always smaller than the upper bound, demonstrating Corollary 1.
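The sum-up rounding step and the integral-error quantity discussed above can be sketched as follows. This is a generic SOS1 variant of SUR (pick, at each step, the controller with the largest accumulated deficit); the paper's exact implementation may differ:

```python
import numpy as np

def sum_up_rounding_sos1(u_c, dt):
    """Round continuous controls u_c (shape N x T) to binary controls
    satisfying SOS1: exactly one controller active per time step."""
    N, T = u_c.shape
    u_b = np.zeros((N, T))
    acc = np.zeros(N)  # accumulated integral deficit per controller
    for k in range(T):
        acc += u_c[:, k] * dt
        j = np.argmax(acc)      # controller most "owed" activation
        u_b[j, k] = 1.0
        acc[j] -= dt
    return u_b

def max_integral_error(u_c, u_b, dt):
    # max over k of | sum_{i<=k} (u_c - u_b) * dt |, the quantity in (21)
    return np.max(np.abs(np.cumsum((u_c - u_b) * dt, axis=1)))
```

On a constant two-controller profile, the rounded schedule alternates controllers so that the accumulated error stays bounded by a constant multiple of ∆t.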
We set the time limit to 60 seconds for solving the combinatorial integral approximation with min-up-time and max-switching constraints. In Figure 10 we present the CPU times and iterations for all the instances. The CPU times of SUR and of the Energy instances are all less than 0.01 seconds, so we do not show them in the figure. The CPU times increase significantly with the number of variables. For the instance CircuitLiH, all the methods exceed the computational time limit. In most cases, the iterations and CPU times of MS exceed those of MT because the problem with MS has a larger integrality gap.

Results of Improvement Heuristic
The objective values and TV regularizer values after the improvement heuristic, using the control results of the combinatorial integral approximation as initial points, are represented for the selected instances by opaque small dots in Figure 12. Because these quantum control problems have multiple local minima, each method may perform differently after the improvement heuristic. For systems with two controllers, all the methods perform at a similar level. The performance of the different methods varies more on the circuit compilation problem because it contains more local optima.
We show in Figure 11 the CPU time and iterations for the improvement heuristic of selected instances. For each group of instances Energy, CNOT, NOT, and Circuit, the CPU time increases with the problem size. The number of iterations highly depends on the initial points.
In addition, we observe that for both the continuous relaxation and ALB, the CPU time increases exponentially while the number of iterations increases only slightly. The main increase in computational time comes from the time-evolution process used to simulate the quantum system, because the dimension of the Hamiltonian matrices increases exponentially with the number of qubits q. In our algorithm, we could instead conduct the time evolution directly on quantum computers with a control pulse and then use only the final state to compute the objective value, approximating the gradient by the finite difference method. Hence, we would not need to evaluate the state at each time step, indicating the scalability of our approach to large-scale instances. In Figure 12 we show the objective values and TV regularizer values of the combinatorial integral approximation and the improvement heuristic. The objective value of SUR stays the same or increases slightly, demonstrating the theorems and propositions in Section 3.3.1. Adding min-up-time and max-switching constraints leads to an increase in objective values because the additional constraints of Section 3.3.2 shrink the feasible set; the increase from min-up-time constraints is higher because the restricted feasible region contains fewer feasible solutions. In terms of chattering, measured by the TV regularizer value, SUR chatters more than the continuous solutions because of the rounding process. As discussed in Section 3.3.2, imposing min-up-time and max-switching constraints reduces the chattering, with min-up-time constraints reducing it more significantly because they are stricter. The improvement heuristic reduces the objective values and chattering remarkably for the results obtained from all the methods with rounding techniques, especially pGRAPE, which shows the benefits of our improvement algorithms in Section 5.
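The finite-difference gradient approximation mentioned above needs only objective evaluations (which could in principle come from measurements on a quantum device). A minimal central-difference sketch, with `objective` a hypothetical callable:

```python
import numpy as np

def finite_difference_gradient(objective, u, eps=1e-6):
    """Central-difference gradient of objective(u) over every entry
    of the control array u; objective returns a scalar."""
    grad = np.zeros_like(u, dtype=float)
    it = np.nditer(u, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        up, dn = u.copy(), u.copy()
        up[idx] += eps
        dn[idx] -= eps
        grad[idx] = (objective(up) - objective(dn)) / (2 * eps)
    return grad
```

Each gradient entry costs two objective evaluations, i.e., two time-evolution runs, which is why offloading the evolution to quantum hardware would help most here.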
In Figure 12, the dot closest to the lower-left corner indicates the method that achieves the best balance between improving the objective value and reducing chattering. If the original objective function matters more, one can choose methods consisting of SUR and the improvement heuristic. If one focuses more on reducing switches, the methods with min-up-time or max-switching constraints followed by the improvement heuristic are better.

Conclusion
In this paper we built a generic control model with both continuous and discretized formulations for the quantum control problem. We introduced a squared L2-penalty function into the model to ensure that only one control is active at all times. In addition, we proposed a model with the TV regularizer to reduce the absolute change of the controls.
We developed an algorithmic framework combining the GRAPE approach, combinatorial integral approximation, and a local-branching improvement heuristic. With numerical simulations on multiple examples with various objective functions and controllers, we demonstrated the feasibility of the discrete optimal quantum control problem and showed that our algorithms obtain controls that trade off high quality against low absolute change within a short computational time. Specifically, the performance of the different relaxation models varies among instances, and it is therefore helpful to test all three methods (pGRAPE, TR, and ADMM) to select the best control. If one is more interested in obtaining controls with lower energy or infidelity, we recommend methods using the sum-up-rounding technique. If one aims instead to reduce switches, it is better to choose methods with min-up-time constraints added. Both cases benefit from the approximate local-branching algorithm to improve the final solutions. In practice, the performance can be improved further by fine-tuning the corresponding parameters.
All the numerical simulations in our paper were conducted on classical computers, and the CPU time mainly comes from the time evolution of the quantum states. Because our algorithm requires only the final state to compute the objective value and approximate the gradient, running the time-evolution process on quantum computers is practical and could eliminate much of the exponential slowdown we observe in our simulations. Furthermore, given continuous control results, one can extract appropriate controller sequences and optimize the locations of the switching points to obtain high-quality controls; such switching-time optimization from continuous control results is a direction for future research.
Proof of Theorem 4. For simplicity, we use ϵ to denote ϵ(∆t). By construction, for any time step k = 1, . . . , T, exactly one control is set to 1 by u^b; hence u^b satisfies the SOS1 property. For each time step k = 1, . . . , T, define the index i_k as in the statement. Assume for contradiction that there exists a time step r at which the claim does not hold; for simplicity, we write i for the corresponding i_r. We consider the assumption in two cases.

Case 1:
We assume that the first case holds. Let k̄ be the highest-index time step at which control i is rounded up. Then (39) follows from (37). Because u^b_{ik̄} = 1, control i attains the maximum value of (39a) among j = 1, . . . , N. Hence, from (39) and summing over all the controls j, we obtain the stated inequality, where the first inequality comes from the definition of ϵ. This leads to 0 < −N(N − 1)∆t − (N − 1)ϵ, which is a contradiction because ϵ ≥ 0 and N ≥ 2.

Case 2:
We assume that (42) holds. From the definition of ϵ we have (43), and substituting (42) into (43) yields an inequality whose left-hand side can be written as the sum of N − 1 terms; at least one of these terms must be negative, so there exists ĵ such that (46) holds. Let k̄ be the highest-index time step at which control ĵ is rounded up. Then we obtain a chain of inequalities in which the first inequality follows from u^b_{ĵk̄} = 1 and u^b_{ĵk} = 0 for k ≥ k̄, and the second follows from (46). Because ĵ is the control rounded up at time step k̄, the corresponding bound holds for every j = 1, . . . , N. Summing over all the controls, we obtain a contradiction with the definition of the parameter ϵ.
Proof of Corollary 1. Based on the formulation of the discretized control in the continuous relaxation of the model (DQCP-L2), statement (22) follows directly from Theorem 4. From Theorem 3, if the original function F is bounded, then the optimized squared L2 term l(u^c, T) is uniformly bounded over the time steps T, and the absolute integral error between the continuous and discretized controls converges at rate O(√∆t).
Proof of Proposition 2. For any time step k = 1, . . . , T, the last equality follows from the Trotter expansion e^{∆t(A+B)} = e^{∆tA}e^{∆tB} + O(∆t²) and taking the limit as ∆t goes to zero. Similarly, we obtain the analogous expression for the binary control. From Theorem 4, the corresponding bound holds for any time step k = 1, . . . , T and controller j = 1, . . . , N. Combining this with Corollary 1 and taking the limit as ∆t goes to zero, we obtain (55). Substituting (55) into the formulations of the states (52) and (53), and then substituting the states into the objective function, proves the claim.
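The O(∆t²) Trotter error invoked in the proof can be checked numerically. The sketch below (our own illustration) uses a truncated Taylor series for the matrix exponential and verifies that halving ∆t shrinks the splitting error by roughly a factor of four:

```python
import numpy as np

def expm_series(M, terms=40):
    # truncated Taylor series of the matrix exponential (fine for small norms)
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

def trotter_error(A, B, dt):
    # || e^{dt(A+B)} - e^{dtA} e^{dtB} ||, which is O(dt^2) when [A,B] != 0
    exact = expm_series(dt * (A + B))
    split = expm_series(dt * A) @ expm_series(dt * B)
    return np.linalg.norm(exact - split)
```

With A = −iX and B = −iZ (Pauli matrices, which do not commute), the error ratio between step sizes ∆t and ∆t/2 is close to 4, confirming the second-order local error.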

Appendix B Detailed Numerical Results
In this section we present the results of the continuous relaxation, the combinatorial integral approximation, and the improvement heuristic for all the methods and instances in Tables 4-9.
We also present the objective values and TV regularizer values for all the instances and annotate the best method in Figure 13.

Appendix C Control Results of NOT Estimation Problem
For the NOT gate estimation problem, we present the controls achieving the best trade-off between objective values and TV regularizer values, as annotated in Figure 14. All three instances show that controller 1 has a more significant impact on the final infidelity. When the evolution time is short (t_f = 2), the optimal control keeps controller 1 active all the time. When the evolution time is longer (t_f = 6, 10), the active time of controller 2 is shorter. The main reason is that controller 1 alone can achieve low infidelity given a sufficiently long evolution time, although a single controller performs worse than two controllers for the same evolution time [49,86].
Figure 14 panels: (a) t_f = 2, (b) t_f = 6, (c) t_f = 10.

The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory ("Argonne"). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan http://energy.gov/downloads/doe-public-access-plan.