Model predictive control for robust quantum state preparation

A critical engineering challenge in quantum technology is the accurate control of quantum dynamics. Model-based methods for optimal control have been shown to be highly effective when theory and experiment closely match. Consequently, realizing high-fidelity quantum processes with model-based control requires careful device characterization. In quantum processors based on cold atoms, the Hamiltonian can be well-characterized. For superconducting qubits operating at milli-Kelvin temperatures, the Hamiltonian is not as well-characterized. Unaccounted for physics (i.e., mode discrepancy), coherent disturbances, and increased noise compromise traditional model-based control. This work introduces model predictive control (MPC) for quantum control applications. MPC is a closed-loop optimization framework that (i) inherits a natural degree of disturbance rejection by incorporating measurement feedback, (ii) utilizes finite-horizon model-based optimizations to control complex multi-input, multi-output dynamical systems under state and input constraints, and (iii) is flexible enough to develop synergistically alongside other modern control strategies. We show how MPC can be used to generate practical optimized control sequences in representative examples of quantum state preparation. Specifically, we demonstrate for a qubit, a weakly-anharmonic qubit, and a system undergoing crosstalk, that MPC can realize successful model-based control even when the model is inadequate. These examples showcase why MPC is an important addition to the quantum engineering control suite.


Introduction
Quantum computation can be viewed as an assembly of analogue control pulses driving quantum states toward desired targets [5,43,49,67,75]. An accurate dynamical model is important for designing optimal control pulses sufficient for each application [13]. In particular, open-loop control strategies using model-based numerical methods have proven to be highly effective in some practical settings, such as for processors based on cold atoms, where models are well-known [22]. Sufficiently accurate models are obtained through careful device characterization [16]. However, descriptions of superconducting qubits [41] can fall short due to unknown Hamiltonian terms [46], instrument noise, control transfer functions [29], and/or unwanted coupling to unmodelled modes [73] or other systems [64]. To address these challenges, new frameworks must incorporate measurement feedback in addition to device characterization to optimize pulses for the desired targets. For example, an insufficient pulse from an inaccurate model can be updated retroactively using a modelfree optimization derived from experiment feedback [15,35]. In this work, we introduce the model predictive control (MPC) [47,63] framework since it possess a number of structural advantages for closed-loop control of systems with imperfectly known dynamics: (i) MPC inherits a natural degree of disturbance rejection due to measurement feedback, (ii) receding finite-horizon model predictions allow MPC to control complex multi-input multioutput systems under state and control constraints, and (iii) MPC is flexible enough to develop synergistically alongside control innovations like reinforcement learning [26,77]. We detail the success of MPC through a number of practical state preparations motivated by superconducting qubit applications. MPC [47,63] (see Figure 1) is a closed-loop optimization strategy that compliments existing approaches -that is, MPC is not a new type of controller on its own. Instead, standard MPC implements a sequence of open-loop controllers online as a function of the current state. First, a model prediction is made out to some finite horizon and used to inform the present control decision. Second, the decision is implemented, and the prediction begins again from the new measured state. For this reason, MPC is often referred to as a receding-horizon strategy. MPC has a history of practical successes ranging from the chemical process industry [62] to autonomous vehicles [18] and reusable rocket landings [17]. Notably, MPC for quantum dynamics is closely connected to MPC for classical control using Koopman-von Neumann theory, a description of classical mechanics using a Hilbert space of observables defined on phase-space. Koopman-von Neumann theory shares strong historical connections with quantum mechanics and has wide applicability to modern datadriven analysis [11,38,39,50,51,70]. Connections between Koopman-von Neumann theory and quantum control dynamics were studied in Reference [25]. Recent successes using MPC within the classical Koopman-von Neumann theory [2,10,20,40,57,58] can motivate synergistic development of MPC for quantum dynamics. In this work, we consider planning for control using fixed model discrepancies within MPC. In practice, model discrepancies can be improved by integrating data-driven modelling together with MPC. The integration of planning over a model and learning of a model requires making choices about how to best utilize the available data. There are many trade-offs to consider such as accuracy, interpretability, and data efficiency: for a review, see Reference [52]. One approach to the integration of planning and learning is data-driven model synthesis based on Koopman theory, using streaming data [21,59,76] to build or improve operatortheoretic models of the dynamics in an online way as the MPC horizon recedes. In the language of Reference [52], this approach is known as planning over a learned model, and is exemplified by approaches like Embed to Control (E2C) [72]. Error bounds for Koopman models of control systems were recently studied in [56,65], enabling the rigorous analysis of data-driven MPC schemes.
There is a distinction to make between the classical and quantum settings. When MPC is used in robotics, the implementation of a control decision results in the actual motion of the robot to a new position. Unlike the robot, a quantum state is a statistical outcome from the measurement of many identically-prepared quantum experiments. After running successive or parallel experiments to realize quantum state tomography [16], the next quantum experiment is reset at the initial (or perhaps projected) state. There is a possibility of using the state information to look backward and improve upon the previous model or pulse. Backward-looking, model-free algorithms iteratively search for parameter changes that improve upon desired control objectives. Such methods have been highly successful in calibrating controls to quantum experiments [15,35,74]. However, many iterations may be necessary to successfully search the parameter space; hence, many evaluations of the control objective may be required. Moreover, free parameters are often limited in number to maintain the feasibility of the optimization. The alternative MPC perspective is to solve the quantum problem as if it is an online optimization. The forward-looking MPC trades iterative improvements for extra online control time by accepting and proceeding from the current quantum state tomography outcome (we return to this comparison in detail in Appendix D after establishing an explicit context via our numerical examples). The resulting MPC control scheme is robust, as MPC inherits a natural degree of disturbance rejection due to the feedback. Moreover, MPC can handle the many parameters of multi-input, multi-output systems. Finally, MPC has an expansive literature base from which to build. In this paper, we demonstrate the utility of the MPC perspective in the context of quantum control engineering. First, we introduce our MPC implementation. Then, we proceed with examples of robust quantum state preparation for ideal qubits, weakly-anharmonic qubits, and qubits coupled by undesired crosstalk.

MPC for Quantum Control
In what follows, Section 2.1 describes the theoretical framework for standard MPC, restricted for simplicity to linear models with quadratic cost functions. Section 2.2 describes the nonlinear equations of motion for quantum control dynamics, and Section 2.3 shows how to modify standard MPC for the nonlinear quantum control dynamics by way of nonlinear MPC.

Background: MPC
The standard linear-quadratic MPC involves a discrete-time linear model, a quadratic cost function, and possibly linear constraints on the state and control. The optimization problem generated under these assumptions is a quadratic program (QP). With an initial state x 0 and reference trajectories , the QP is given by arg min The quadratic costs Q, R, Q f , dynamics A(t), B(t), and prediction horizon T are parameters. The QP returns an optimal state and control, X opt and U opt , for the entire prediction horizon. If no state or input constraints are introduced, then this problem could be solved using familiar linear quadratic methods [4]. Alternatively, more general constraints may be considered if the resulting optimization can be reformulated as a second-order cone program (SOCP), of which a QP is a special case [9,31]. Typical MPC applications involve real-time control of complex systems. This motivates work on faster and more efficient MPC algorithms [71]. Many powerful open-source and commercial MPC solvers exist today for a variety of purposes. In our work, we used CVXPY [14] (a research-friendly Python-embedded convex-programming library) together with the QP solver OSQP [68].

Quantum control dynamics
Markovian quantum control dynamics [3] are bilinear with respect to the state vector |ψ(t) ∈ C N and coherent control amplitudes u j (t) such that For a qubit, N = 2, while for a qudit, N = d. A sequence of n qudits forms a quantum register which can be used for quantum computations; in this case, the state grows exponentially with respect to the length of the register, N = d n . An ensemble of pure quantum states or registers can be completely characterized, in the sense of their measurement statistics, by a density matrix ρ(t); that is, a non-negative self-adjoint operator in C N ×N with trace one. The density matrix enables a simple characterization of Markovian interactions by the environment on the state in terms of dissipative dynamical operators; this is the Markovian master equation description of an open quantum system, where H(t) is a trace-zero Hermitian operator (corresponding with the system Hamiltonian), {D j } N 2 −1 j=1 is an orthonormal set of complex matrices with trace zero, and C := (c jk ) is positive semi-definite. For a closed quantum system C = 0, the equation simplifies to the quantum Liouville equation. In this paper, we use the density matrix to represent the quantum state for the purpose of describing control experiments for quantum state preparation. In order to continue to represent our density matrix as a state vector in line with our MPC intentions, we apply the vectorization operation (see Appendix A) In addition, where it is necessary to work with completely real state vectors we rely on the isomorphism between complex numbers and 2 × 2 matrices for the operators, To apply MPC for quantum control dynamics, the constraint in Equation (1c) is replaced by the discretization of the bilinear dynamics in Equation (3) [25]. To first order it is given by Our intended task in this paper is quantum state preparation, in which a quantum state represented by ρ is driven to realize a reference state ρ ref .
returns the vector of singular values σ j of a given matrix. In Equation (1b), we have formulated the MPC objective in the standard way as a least-squares cost, which penalizes deviations from a reference trajectory. In the case of the vectorized density matrix, this choice is consistent with overlap fidelity in the following way: if x is the corresponding vectorization of the density matrices, their norms are connected under the identity ρ 2 = x 2 (Appendix B). Quantum state preparation can be understood as a subroutine to design the control pulses necessary to realize a reference quantum process. In Appendix C, we discuss how MPC for quantum state preparation can be naturally extended to the pursuit of quantum gate synthesis, where the goal is to realize a finite set of unitary processes which can be combined to yield arbitrary quantum computations.

Nonlinear MPC
Continuous-time quantum control dynamics are bilinear. By introducing a general nonlinear dynamics constraint x(t + 1) = f (x(t), u(t), t) in place of the linear dynamics found in Equation (1c), we enter the realm of nonlinear MPC. Here, the necessary optimization problems are typically non-convex. Nonlinear MPC permits the use of more complex models, so applications are abundant. In recent years, practical realizations have improved alongside advances in numerical methods and non-convex optimization [63]. In this paper,

INPUT:
Initial state x 0 , initial guesses (X guess , U guess ), and references (X ref , while Not converged do 3: Compute the linearizations A guess (t), B guess (t) Eq. (7) 4: Compute α ∈ [0, 1] using line search 6: end while 8: (X opt , U opt ) ← (X guess , U guess ) 9: end function we treat the nonlinear MPC by following the sequential quadratic program (SQP) approach to solve the open-loop optimization between MPC steps. The SQP approach is outlined in Algorithm 1 [27]. SQP is a type of direct optimization: the state and control are explicitly treated as optimization variables which are constrained by the dynamics [61]. This is in contrast to indirect approaches (in quantum control, familiar examples include GRAPE [36], GOAT [45], or Krotov's method [23]), where the dynamics are used to eliminate the optimization over the state variables. SQP iteratively solves local QP approximations of the nonlinear optimization problem until convergence is achieved. To account for the approximate character of the problem, steps are taken by following a line search [55]. Each local QP approximation is taken about guess trajectories To run this modified optimization, we first linearize the dynamics f (x(t), u(t), t) such that Then, we replace Equation The discretization of the continuous time quantum control dynamics can be taken before or after the linearization in Equation (7) [27]. We pursue the case where discretization is taken first. For quantum control dynamics, we saw in Equation (6) that the discretized model contains a bilinear nonlinearity in the state and control. A bilinear nonlinearity is relatively simple, and it has been shown that successful and efficient nonlinear MPC controllers can be implemented even under guesses that account only for the initial state (i.e. x guess (t) = x 0 for t = 0, 1, . . . T − 1) [10,12,20]. The known bilinear structure can also be used to increase optimization efficiency [58]. Additional accuracy can be sought by using better discretizations or even data-driven numerical integrators [25,27]. Automatic differentiation tools can also prove impactful in these cases.
SQP can be warm-started after the implementation of the first control decision. At the initial timestep t = 0, suppose SQP was run in full to find an optimal state and control over the prediction horizon. As a by-product, good guess trajectories X guess 0 and U guess 0 for this prediction horizon have been obtained through SQP. All future guess trajectories can then be found via a shifting procedure applied to these good guess trajectories. A common practical implementation is as follows: obtain the warm-start guess trajectories for the initial timestep t from the previous X guess t−1 and U guess t−1 by eliminating the first value and duplicating the final value so that each sequence retains its length. Heuristically, the shifted guesses are close to the optimal value of the nonlinear program. Therefore, for all initial timesteps t > 0 the SQP can be terminated after just one iteration with α = 1 (see Algorithm 1) [27].

MPC for robust quantum state preparation
MPC can be utilized for many control strategies, such as setpoint stabilization (realizing a static reference state), tracking (following a time-dependent reference trajectory), and path following (staying close to any part of a time-independent reference trajectory at all times) [63]. Each strategy has different objectives and criteria for convergence and stability. Selecting a suitable control strategy is important for obtaining the best performance. In our examples, we consider MPC for setpoint stabilization to perform robust quantum state preparation in the presence of coherent noise and modelling inaccuracies. Section 3.1 introduces MPC for robust state preparation of a qubit, and discusses the implications of the measurement feedback period. Section 3.2 uses MPC to control a weakly anharmonic qubit assuming no model knowledge of the anharmonicity and compares the MPC treatment with an analytic pulse design. Finally, Section 3.3 looks at simultaneous state preparation of coupled qubits in the presence of an unmodelled coupling (i.e. crosstalk). Working in the individual (reduced) qubit spaces, MPC is used to overcome the crosstalk effect. For each of our ground truth simulations, we use the QuTiP Python package [32,33]. The following examples report the Hamiltonians of each toy system, with the understanding that the relevant modifications are made following Section 2.2 to obtain the dynamical model used to constrain the optimal control problem.

Qubit
Our first example is a qubit treated in a rotating frame within the rotating wave approximation. The Hamiltonian is where σ z and σ x are the usual Pauli matrices [41]. The prefactor ∆ = ω Q − ω R is the discrepancy between the qubit resonance frequency ω Q and the chosen rotating frame ω R . The control u(t) is the envelope of a drive pulse with a carrier frequency of ω R . Our goal is to apply an area-π pulse or swap of the occupation probability of the ground and excited states. To do this, we set a fixed ρ ref such that ρ ref 11 = 1 else zero.
In all examples, we only enforce on-axis terms (e.g. ρ 00 , ρ 11 ) in our cost function by way of the appropriate Q. We set Q f = Q, and R = 10 −2 1.
Mischaracterization of the qubit frequency ω Q results in an offset rotating frame and a nonzero drift term ∆ in the qubit Hamiltonian (visualized in Figure 2(a)). Open-loop controllers rely on highly-accurate models to find good pulse designs. In contrast, we will demonstrate that MPC does not require optimal open-loop solutions over its prediction horizon to realize good pulse designs; instead, MPC iterations are best understood as planning exercises. In Figure 2, MPC is used to design a control pulse swapping the occupation probability of the ground and excited states. The simulated qubit is assumed to be mischaracterized. Specifically, we force MPC to rely on an inaccurate model with ∆ = 0 while the simulation actually has been set to ∆/2π = −0.2/2π ≈ −30 MHz. For a base comparison, in Figure 2(b) we report the effect of an analytic area-π pulse. This is the naive design that would be realized by an open-loop controller if the ∆ = 0 model accurately reflected the simulation; we see that the presence of the nonzero ∆ in the simulation renders this analytic pulse insufficient.
We apply MPC in Figure 2(c). We set our prediction horizon to T = 10 ns. Feedback is provided via the simulated density matrix every 1.4 ns. In experiment, this feedback would necessitate the use of quantum state tomography [16]. We explore the effect of decreasing or increasing the feedback period in Appendix E. Recall that the MPC controller naturally accommodates constraints like those emerging from hardware limitations. To demonstrate, in Figure 2 notice that we have set constraints on both the maximum control amplitude (|u|/2π = 0.1) and the allowed initial change in control relative to the previous value (∆u/2π = 0.04). We enforce the same constraints for the analytic pulse in Figure 2(b). In general, these hard constraints can be turned off or softened (by allowing for some mild infractions of the desired inequalities) to increase the space of feasible controls and state trajectories within MPC. Upon comparing Figure 2(c) with Figures 2(b), we see how the MPC solution uses feedback to adjust and robustly counteract the model deficiencies observed during its receding-horizon iterations. In the introduction, a conceptual comparison between the forward-looking MPC and backward-looking, model-free algorithms (e.g. Reference [15,35,74]) was offered. To further elucidate this point, Appendix D illustrates this comparison by way of the toy qubit system introduced here in this section.

Weakly-anharmonic qubit
A qubit is implemented by restricting a system to a pair of accessible states, for example the ground eigenstate |0 and the first excited eigenstate |1 of a bare system Hamiltonian. If there existed higher level spacing equal to the |0 ↔ |1 transition, then driving the qubit at the |0 ↔ |1 frequency would lead to unwanted dynamics involving higher energy eigenstates. For this reason, the energy level spacing of a candidate qubit Hamiltonian must be nonlinear. In transmon qubits, the anharmonicity is the defined to be the difference between the |1 ↔ |2 level spacing and the |0 ↔ |1 level spacing of the system's bare Hamiltonian. The limit of infinite absolute anharmonicity results in a perfect qubit; in practice, transmon anharmonicities cannot be made arbitrarily large and are usually between −100 to −300 MHz [41]. Naively, the transmon anharmonicity isolates the qubit transition from exchange with higher energy levels. In practice, fast changes of the drive-pulse envelope widen the spectrum of the qubit control pulse in Fourier space. The wide spectrum can lead to a nontrivial overlap with the frequency of the |1 ↔ |2 transition and results in an undesired interaction involving the |2 state. The DRAG procedure (Derivative Reduction by Adiabatic Gate) is an analytic suppression of such leakage errors by coordinating control on both the σ x and σ y axes [41,53]. In DRAG, if the first control envelope is fixed to a Gaussian, then the second pulse is proportional to a Gaussian derivative term that eliminates the spectral weight at the |1 ↔ |2 transition. Numerical optimal control experiments have been shown to reproduce pulse designs similar to the DRAG scheme [53,73]. Figure 3: (a) i. In a weakly-anharmonic oscillator with unequal level spacing quantified by an anharmonicity of −100 MHz, the intended outcome is a swap of the ground and excited state probabilities ρ 00 ↔ ρ 11 . Here ρ jj is the occupation probability of the j-th state, |j .The DRAG procedure is an analytic suppression of leakage out of the qubit subspace defined by levels |0 and |1 . It is implemented by a Gaussian π pulse on the first control axis in (a) ii. and a Gaussian derivative on the second control axis in (a) iii. in order to cancel the spectral overlap of the qubit π-pulse at the energy splitting of the |1 ↔ |2 transition. In (a) iv., the ρ 22 population is suppressed and the ρ 11 is enhanced by the DRAG scheme. In (b) i., it is assumed that the anharmonicity is unmodelled and only a qubit model is available . In (b) ii-iii., MPC is used to design a pair of robust control pulses by relying on measurement feedback to reduce the leakage from the qubit subspace. In Figure 3, we consider a weakly-anharmonic transmon qubit. Our simulation Hamiltonian is defined within the |0 , |1 , |2 space. We use the rotating frame of the |0 ↔ |1 transition and take the rotating wave approximation to realize where u x (t) and u y (t) are the envelopes of pulses driven at the |0 ↔ |1 transition. In the |0 , |1 , |2 space, the transmon control operations are expressed in terms of truncated raising and lowering operators as σ x → (a + a † )/2 and σ y → i(a − a † )/2. The transmon anharmonicity is α = −0.6 or α/2π ≈ −100 MHz. To demonstrate the robustness of MPC, we suppose that the transmon anharmonicity is unmodelled. That is, our MPC framework is forced to use a model with α = 0. This is similar to the situation demonstrated in Section 3.1, but now there are two control pulses that must operate in tandem. With only a qubit model available to our controller, we rely on MPC to design robust pulses for swapping the ground and excited state probabilities. The prediction horizon is set to 10 timesteps (4 ns) and the feedback timestep is 0.4 ns. Like in Section 3.1, we limit the maximum control amplitude (|u| ≤ 0.75) and constrain the allowed initial change in control relative to its previous value (|∆u| ≤ 0.2). Our constraints are chosen so that the MPC pulses are similar to the 10 ns analytic DRAG pulses shown in Figure 3(a). To implement our DRAG scheme, we set u y (t) = 0.6u x (t)/|α|. The constant of proportionality is a dimensionless scaling parameter, which we set to the value in [0, 1] that maximizes the fidelity of the state preparation [41]. In comparing Figure 3(a) to Figure 3(b), we see that MPC is able to coordinate the two control pulses and design a trajectory for the occupation probabilities similar to that of the analytic scheme; significantly, MPC accomplishes this without knowledge of the simulation anharmonicity. In fact, we observe the MPC pulse outperforms our simple version of the analytic DRAG scheme despite this hindrance. In Appendix F we add additional context by successfully applying MPC to two more cases: (i) further reducing and (ii) increasing model awareness of the underlying simulation.

Crosstalk
When calibrating gates and algorithms for quantum processors at scale, additional terms in the subspace dynamics may appear due to unintended crosstalk between otherwise independent parts of the full quantum system [64]. We suggest simultaneously applying MPC to the parts that would, without crosstalk, have known models and independent control objectives. For this purpose, we recall the reduced density matrix Tr E {ρ SE } = ρ S for a quantum state ρ ∈ H S ⊗ H E in the joint Hilbert space of a system S and environment E. Here, Tr E is the partial trace over H E . In the following discussion, the feedback state supplied to MPC is the concatenation of the reduced density matrices for each of the otherwise independent parts of the full quantum system (i.e. for two parts A and B of a joint system, the feedback state is the concatenation of ρ A = Tr B {ρ AB } and ρ B = Tr A {ρ AB } as will be described below). As the MPC control horizon recedes, control decisions are made using just the information in these parts to correct for the unintended effects of the crosstalk. At a given time step in the control design, we assume that any observed crosstalk effects were coherent and fixed by the past control decisions. That is, repeat experiments experience the same crosstalk. Corrections based on MPC are made by way of the future evolution. This feature is important because any changes made to past controls modify the form of the past crosstalk in new and uncertain ways.
In Figure 4, we use MPC to implement state preparation for two qubits coupled by unintended crosstalk. The qubit spaces are indexed by A and B, and the desired state preparations are denoted redundantly as A and B. The total system Hamiltonian (each qubit in the rotating frame after the rotating wave approximation) is with u A (t) and u B (t) the control envelopes of the on-resonance carrier pulses. Here, we choose ξ = 0.5 (ξ/2π ≈ 80 MHz). We use separate axes to implement the A and B operations: respectively, H A (t) = u A (t)σ A x /2 and H B (t) = u B (t)σ B y /2. This allows the trajectories of the reduced states to be somewhat distinct, while simultaneously retaining for each operation the intuition developed during Section 3.1. We assume that the model used by MPC is unaware of the crosstalk in the simulation Hamiltonian, and that the feedback state for MPC is the concatenation of the reduced density matrices ρ A and ρ B with otherwise independent objectives A and B. The MPC prediction horizon is set to 10 timesteps (6 ns) with a feedback timestep of 0.6 ns and constraints are added identical to Section 3.1. In Figure 4(a), we show the initial density matrix which is in the ground state of the joint system, ρ AB (0) = |00 ≡ |0 A ⊗ |0 B . Expressed in the joint system, the target or reference state is ρ ref = |11 ≡ |1 A ⊗ |1 B . In Figures 4(b)-(c) we report the result of different state preparation experiments after 25 ns. For context, recall that the single qubit state preparations performed in Sections 3.1-3.2 were realized in < 10 ns. In Figure 4(b), we apply MPC to design a pair of control pulses for the case of a simulation where no crosstalk is present. In this situation, the model used by MPC matches the simulation. The MPC solution is comparable to two analytic π pulses applied to both qubits. This is because The initial density matrix is in the ground state ρ(0) = |00 00| where |00 ≡ |0 A ⊗ |0 B . The height of the bar indicates the magnitude of that entry in the density matrix, and the color is the phase. The goal is to send the ground state to the excited state for both qubits. This means the target or reference state is set to ρ ref = |11 11|. (b) Suppose the crosstalk term is not present so u A , u B are analytically two π-pulses realizing the independent qubit state preparations. (b) i. In a simulation without crosstalk, the density matrix the coupled system (shown at 25 ns) is successfully prepared in the reference state. (b) ii. In a simulation with crosstalk, the same pulses fail to correctly prepare the density matrix of the coupled system (shown at 25 ns). (c) We force model predictive control (MPC) to use a model with zero crosstalk to design the controls for a simulation where crosstalk is present. The state of the coupled system (shown at 25 ns) is successfully prepared by MPC. (d) The MPC framework is able to robustly prepare the state using only feedback from the concatenation of the reduced density matrices ρ A , ρ B in order to overcome the crosstalk modelling discrepancy. (e) The settling time of the successful MPC control is > 20 ns (shaded region); without crosstalk, the settling time was < 10 ns.
the two systems were uncoupled, so the feedback state defined by the concatenation of the reduced density matrices ρ A and ρ B provides complete information about the independent system dynamics. Indeed, we see in Figure 4(b) i. that this pair of π-pulse proxies prepares the appropriate joint reference state. For comparison, Figure 4(b) ii. shows the result of using the same pulses to control a simulation where a crosstalk term (Equation (11)) couples the two systems. Observe that this case (which corresponds to using an analytic or open-loop solution based on an incorrect model) results in a failed state preparation.
Contrast Figure 4(b) with Figure 4(c)-in the latter, we apply MPC to realize our reference state in a simulation with crosstalk using a model that has no crosstalk term. The control pulses designed by MPC are able to use simulation feedback to overcome the model discrepancy from the unknown crosstalk. Figure 4(c) shows that the joint density matrix ρ AB of the coupled system after 25 ns is successfully prepared in the reference state. Figure 4(d) is a cartoon to emphasize that the control pulses are designed using a feedback state defined by the concatenation of just the reduced density matrices ρ A and ρ B . In Figure 4(e) we see the control pulses designed by MPC. Observe that the forward-looking MPC addresses the unknown crosstalk by using feedback to compensate for unplanned dynamics. This results in longer settling times > 20 ns when compared with the single qubit state preparations of the uncoupled system which were realized in < 10 ns.

Conclusion
Model predictive control (MPC) allows for robust quantum state preparation in the presence of model uncertainty. MPC gains a natural degree of disturbance rejection by relying on feedback to update its receding horizon control plan. In Sections 3.1-3.3 we demonstrated this using prototypical superconducting qubit control simulations. In our examples, we showed MPC is able to design successful control pulses when forced to rely on a model that is mismatched from the underlying simulation. Our examples focused on setpoint stabilization (realizing a static reference state), but MPC can be utilized for other control strategies like tracking (following a time-dependent reference trajectory) and path following (staying close to any part of a time-independent reference trajectory at all times) [63].
MPC is able to integrate state-of-the-art trajectory-based optimization approaches within its receding horizon architecture to improve outcomes and computational tractability at scale. Well-designed optimization constraints can increase robustness and can be used to seek time-optimal controls [61]. Tube MPC can be used in the presence of bounded external disturbances to keep the actual state within an invariant tube around the nominal trajectory [19,44,48]. A state observer included in trajectory-based optimization can help limit the number of measurements needed to estimate the quantum state [42]. For many-qubit systems, trajectory-based controllers have been developed to assist with the exponential increase of the quantum state space [1,28,66]. These ideas will be incorporated in future work.
MPC is also simple and flexible enough to complement other existing control strategies. It is often useful to rely on data-driven models in MPC [7,8,34]; for quantum dynamics, one direction is through bilinear dynamic mode decomposition [25]. MPC and reinforcement learning (RL) have different benefits and costs, and the two can benefit from each other [26]. MPC can provide the expert demonstration needed to initialize the learning process in RL approaches [60]. RL approaches are successful at improving quantum optimal control [6,54] but there are significant data costs when training this global model-free approach. The MPC architecture introduced here is robust, interpretable, and requires less data than RLbased strategies. Furthermore, model-based strategies like MPC provide greater potential for extrapolation and generalization during control planning.

Code Availability
Code supporting the findings of this study is openly available at the following URL: https://github.com/andgoldschmidt/MPC4quantum. defined in terms of the 2 -norm. The state is x ≡ vec(ρ) as discussed in Section 2.2 and Appendix A. Commensurate with this trajectory-based optimization, we noted in Section 2.2 that state preparation is often described by quantum information theorists in terms of the state fidelity F (ρ, Fidelity approximates (by the Fuchs-van de Graaf inequalities) the distance measures of quantum states given by the Schatten p-norms, ρ − ρ ref returns the vector of singular values σ j of a given matrix [37]. If both ρ and ρ ref are pure states, then these two perspectives are equivalent, and

C Gate synthesis
In this paper, we have focused on quantum state preparation to illustrate the application of MPC for quantum dynamics. In the case of quantum control for the purposes of quantum computation, the more desirable comparison is to quantum gate synthesis. It is known that quantum state preparation can be thought of as a subroutine for gate synthesis. Indeed, gate synthesis can also be defined as the realization of a reference state under the action of control, as we have done for quantum state preparation. In gate synthesis, the reference state is no longer a density matrix, but is instead a unitary quantum process. For a closed quantum system, a unitary quantum process is equivalent to time evolution, ρ(t) = U (t)ρ(0)U † (t). From this, observe that the reference state which is prepared by a fixed gate depends on the initial state ρ(0). This means that the synthesized control pulses for a gate must simultaneously realize the state-dependent reference process for all initial states. It has been shown that for gate synthesis, it is sufficient to consider the simultaneous state preparation of just three specific initial states [24]. Therefore, to realize gate synthesis, we can directly apply our MPC algorithm to these concatenated states under dynamics given by the appropriate direct sum of the individual models. Alternatively, consider the analogy that connects the density matrix associated with a pure state to the quantum process associated with a unitary matrix. First, vectorize the unitary matrix (assume row-major vectorization), vec(U ) ≡ |U . Now the equivalence can be used to directly map our previous work to the new problem. For example, the reference is now a process matrix, P ref . Note that there are different fidelity measures to score the success of quantum gate synthesis, so the 2 -norm used in the trajectory-based optimization must be interpreted accordingly [37].

D Comparison with model-free descent
In this appendix, we compare MPC for pulse synthesis to model-free descent in the parameter space of the control pulse. This latter perspective is a simplification of the approach suggested in, e.g. References [15,35,74]. We compare the two approaches by tracking the iterations of quantum state tomography (QST) required to implement a control pulse that satisfies the state preparation objective from Section 3.1 ( Figure 2). In MPC, each QST instance is used to evaluate the current state for the purpose of determining the forward-looking open-loop control optimization. Each MPC step requires one QST iteration to assess the current state. In model-free descent, QST is used to determine the final pulse infidelity. Each QST iteration is an evaluation of this infidelity, which occurs after the application of a complete control sequence. It should be noted that one advantage of model-free descent, as advocated in [15,35,74], is the use of methods like randomized benchmarking in lieu of QST [16] for the purpose of evaluating a proxy for the direct infidelity.
We use the Nelder-Mead algorithm [15] to calibrate the control pulses via model free descent. The Nelder-Mead algorithm starts with a simplex of one dimension larger than the number of parameters, and chooses new search directions based on the objective values of this simplex. In this appendix, we slightly modify the example from Section 3.1: we take 1 ns timesteps and consider a total control window of 10 ns, meaning that we have 10 independent control parameters determining our desired pulse. Notice that this means a direct application of the Nelder-Mead algorithm will require an 11-point simplex. We allow for discrepancies, ∆ ∈ [−0.36, 0.36] GHz, with the true model given by ∆ = 0 (for more on these choices, see Figure E.1 in Appendix E). In particular, we consider 11 uniformly-sampled ∆ 1 , ∆ 2 , . . . , ∆ 11 . In Figure D.1 (b) we show the infidelity that results from applying MPC with horizon 5 ns from step 1 to step 10 of the pulse synthesis. The reported infidelity has been averaged over the 11 uniformly-sampled values ∆, with error given by the standard deviation. For the full range of models, MPC is able to prepare the state after 10 iterations of QST (equivalently, 10 steps of the MPC algorithm). In Figure D.1 (c) and (d) we instead consider model-free descent for two different initial simplexes. The first, in Figure D.1 (c), we refer to as model-aware. We take as the coordinates of our simplex the 10 ns pulses determined by solving the open-loop optimal control problem for each ∆ 1 , ∆ 2 , . . . , ∆ 11 . Notice that the infidelity of the open-loop control varies depending on how closely the model ∆ reflects the true ∆ = 0 of the simulation. We call this approach model-aware because we are in effect reducing the model-free calibration to a trial-and-error search over the discrepancy value. Again, each iteration is an evaluation of the infidelity and therefore treated as a QST instance equivalent to an MPC step: just setting up the simplex demands 11 QST iterations. In Figure D.1 (d), we seed the construction of the initial simplex using a single random 10 ns pulse within the allowed range of control amplitudes. The remaining coordinates of the simplex are chosen by separately incrementing each component by the maximum ∆u (a default approach for implementations of Nelder-Mead like in SciPy [69]). In the example in this appendix, we have set the control saturation at |u|/2π = 0.1 and |∆u|/2π = 0.05, to match the example in the main text. A total of 10 random initial simplexes are considered, providing the mean and variance of the sample mean reported in Figure D.1 (d).
Observe that in the fully model-free application of the Nelder-Mead descent seen in Figure D.1 (d), relatively many iterations were necessary to successfully search the parameter space. Indeed, if we extrapolate to a case where there are more than 10 free parameters, even the data required for the initial simplex becomes costly-this without even beginning the descent. We see in Figure D.1 (b) that the alternative MPC perspective solved the quantum problem as if it was an online optimization. The forward-looking MPC was able to trade iterative improvements for the possibility of extra online control time by accepting and proceeding from the current quantum state tomography outcome. We see that the resulting MPC control scheme was robust across a range of modeling inaccuracies, as MPC inherits a natural degree of disturbance rejection due to the feedback. Moreover, MPC can handle the many parameters of multi-input, multi-output systems because it is only solving a forward-looking open-loop optimization at each step.

E MPC limits: Feedback period and discrepancy
A control strategy based on MPC will eventually break down if the model discrepancy is too great or the feedback is too infrequent. In this appendix, we explore these limits within the context of the example in Section 3.1. We consider fixed model discrepancies within MPC. In practice, model discrepancies can be improved by integrating data-driven modelling (e.g., Reference [25]) together with MPC: indeed, the error bounds for Koopman models of control systems were recently studied in [56,65], allowing for a more careful study of data-driven MPC schemes.
In addition to the fixed model discrepancy, the other source of error is the frequency with which the MPC controller incorporates measurement feedback. In applications of MPC for robotics, it is possible that there is a timing mismatch between feedback frequency and MPC runtime. A common challenge is the case where the robot dynamics  Figure 2. The vertical axis, ∆, is the discrepancy between the simulation and model qubit frequencies. The horizontal axis measures the number of model predictive control (MPC) iterations undertaken before receiving the next state feedback from the qubit simulation. For any intermediate iterations, MPC was closed using state feedback from the possibly-inaccurate model predictions. For example, at ∆ = 0 MHz, the model coincided with the simulation so an optimal infidelity (with respect to our numerics) was obtained for all feedback periods. Each dot represents one MPC computation using full sequential quadratic programming (SQP); the coloring was performed via linear interpolation between dots. The results in Figure 2(c) were obtained from the parameters labelled by the star. and corresponding feedback occur faster than the MPC algorithm can return a control decision. Fortunately, the ensemble nature of the quantum experiments used to compute the quantum state tomography limits this concern. On the other hand, in robotics there is also the case where measurements occur slower than the timescales demanded by the model dynamics-a similar situation emerges when trying to limit the number of times the quantum state tomography is performed. In this latter case, the MPC algorithm can be closed using the model of the dynamics until the feedback time is reached, at which point a new quantum state tomography can be performed. Reliance on model predictions can be effective for limiting the number of measurements, but introduces additional dependence on the quality of the model.
In Figure E.1, we looked at the infidelity of the control as a function of the discrepancy ∆ and how often we rely on the model to close the MPC loop. The infidelity of the state preparation was measured after 15 ns (see Figure 2) and is colored by order of magnitude. The horizontal axis reports measurement feedback period in steps (the MPC timestep is 0.2 ns). This period is the number of MPC iterations for which a model prediction was used to close the loop before a sample from the qubit simulation was used. That is, a long measurement feedback period relies more heavily on a possibly-inaccurate model. Observe that at ∆ = 0 MHz in Figure E.1 the model prediction coincided with the qubit simulation, so for any measurement feedback period we attain an optimal infidelity (for our numerics).

F DRAG
Additional exploration of the weakly anharmonic qubit is pursued in this section. Figure F.1 is an expanded version of Figure 3 in the main text. New examples appear as new rows, while columns i.-iv. remain the same as the corresponding columns in Figure 3 of the main text. An additional analytic example in Figure F.1(a) shows the result of failing to apply the DRAG correction to the weakly anharmonic qubit. Figure F.1(b) is the same as This was also observed in, e.g. References [53,73]. Figure F.1(d) is the same as the main text Figure 3(b) showing MPC in the case where the model does not include the term quantifying the anharmonicity used for the simulation. In Figure F.1(e), we further reduce the model knowledge by forcing MPC to rely entirely on the qubit subspace. That is, unlike Figure F.1(d), the feedback state is given only in the reduced qubit subspace and not in the full |0 , |1 , |2 space. Observe in Figure F.1(e) that a significant enhancement of the ρ 11 population is still achieved. In such a case as this, where the qubit model is presumed to be the best estimate of the simulation, the resulting analytic pulse is something like the Gaussian π pulse shown in Figure F Figure 3 in the main text. In column iv., recall ρ jj is the occupation probability of the j-th state. In (a), we show the result of failing to include the analytic DRAG correction on the second control term. (b) is the same as Figure 3(a). Parts (c)-(e) rely on MPC under differing assumptions. In the MPC section, column iv. has open circles on the trajectories to indicate the quantum state feedback used by MPC. In (c), the model is assumed to match the simulation. Notice that in this case the control design closely resembles the analytic DRAG design in (b). Part (d) is the same as Figure 3(b). In (e), the model is restricted to the qubit subspace and MPC is implemented by taking only measurements in this subspace. This is illustrated in (e) iv. by the absence of open circles over the ρ 22 trajectory. Even in this limited case, MPC is able to enhance the population of the ρ 11 state significantly, in contrast to the analytic Gaussian π-pulse in (a).