Mitiq: A software package for error mitigation on noisy quantum computers

We introduce Mitiq, a Python package for error mitigation on noisy quantum computers. Error mitigation techniques can reduce the impact of noise on near-term quantum computers with minimal overhead in quantum resources by relying on a mixture of quantum sampling and classical post-processing techniques. Mitiq is an extensible toolkit of different error mitigation methods, including zero-noise extrapolation, probabilistic error cancellation, and Clifford data regression. The library is designed to be compatible with generic backends and interfaces with different quantum software frameworks. We describe Mitiq using code snippets to demonstrate usage and discuss features and contribution guidelines. We present several examples demonstrating error mitigation on IBM and Rigetti superconducting quantum processors as well as on noisy simulators.


Introduction
Methods to counteract noise are critical for realizing practical quantum computation. While fault-tolerant quantum computers that use error-correcting codes are an ideal goal, they require physical resources beyond current experimental capabilities. It is therefore interesting and important to develop other methods for dealing with noise on near-term quantum computers.
To these ends, we introduce Mitiq: a software package for error mitigation on noisy quantum computers. Mitiq is an open-source Python library that interfaces with multiple quantum programming front-ends to implement error mitigation techniques on various real and simulated quantum processors. Mitiq supports Cirq [19], Qiskit [20], pyQuil [21], and Braket [22] circuit types and any back-ends, real or simulated, that can execute them. The library is extensible in that new front-ends and back-ends can be easily supported as they become available. Mitiq currently implements zero-noise extrapolation (ZNE), probabilistic error cancellation (PEC), and Clifford data regression (CDR), and its modular design allows support for additional techniques, as shown in fig. 1. Error mitigation methods can be applied in a few additional lines of code, but the library is still flexible enough for advanced usage.
In section 2, we show how to get started with Mitiq and illustrate its main usage. We then show experimental and numerical examples in section 3 that demonstrate how error mitigation with Mitiq improves the performance of noisy quantum computations. In section 4, we describe in detail the zero-noise extrapolation module. In section 5, we give an overview of the probabilistic error cancellation module. In section 6, we present the Clifford data regression module. We discuss further software details and library information in section 7 including future development, contribution guidelines, and planned maintenance and support. Finally, in section 8 we discuss the relationship between Mitiq and other techniques for dealing with errors in quantum computers.

Software Framework    Circuit Type
Cirq                  cirq.Circuit
Qiskit                qiskit.QuantumCircuit
PyQuil                pyquil.Program
Braket                braket.circuits.Circuit

Table 1: The quantum software frameworks compatible with Mitiq. Since Mitiq interacts with circuits but is not directly responsible for their execution, supporting a new circuit type only requires defining a few conversion functions. We therefore expect the list in this table to grow in the future.
package installed, the QPROGRAM type will be the union of a Cirq Circuit and a Qiskit QuantumCircuit. If pyQuil is also installed, QPROGRAM will also include the pyQuil Program type. The source code for Mitiq is hosted on GitHub at https://github.com/unitaryfund/mitiq and is distributed with an open-source software license: GNU GPL v. 3.0. More details about the software, packaging information, and guidelines for contributing to Mitiq are included in section 7.

Main usage
To implement error mitigation techniques in Mitiq, we assume that the user has a function which inputs a quantum circuit and returns the expectation value of an observable. Mitiq uses this function as an abstract interface to a generic noisy backend, and we refer to it as an executor because it executes a quantum circuit. The executor should accept a single circuit of a supported type (QPROGRAM) and return a floating-point number. Mitiq treats the executor as a black box to mitigate the expectation value of the observable returned by this function. The user is responsible for defining the body of the executor, which generally involves:
1. Running the circuit on a real or simulated QPU.
2. Post-processing to compute an expectation value.
3. Returning the expectation value as a floating-point number.
Example executor functions are shown in appendix A. Since Mitiq treats the executor as a black box, circuits can be run on any quantum processor available to the user. For example, we present benchmarks run on IBM and Rigetti quantum processors as well as on noisy simulators in section 3.
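The three steps of an executor body can be sketched without any quantum library. The following is a toy illustration only: the name toy_executor, the representation of a circuit as a list of gate labels, and the noise model (each gate independently spoils the outcome with a fixed probability) are all assumptions made for this example and are not part of Mitiq.

```python
import random


def toy_executor(circuit, shots=2000, error_per_gate=0.01, seed=0):
    """Hypothetical executor: estimate the probability of measuring '00'.

    `circuit` is modelled as a plain list of gate labels. The noise model
    is an assumption made only for illustration.
    """
    rng = random.Random(seed)
    # 1. "Run" the circuit: draw one measurement outcome per shot.
    survival = (1.0 - error_per_gate) ** len(circuit)
    outcomes = ["00" if rng.random() < survival else "11" for _ in range(shots)]
    # 2. Post-process the raw outcomes into an expectation value.
    expectation = outcomes.count("00") / shots
    # 3. Return the expectation value as a floating-point number.
    return float(expectation)


value = toy_executor(["H", "CNOT", "CNOT", "H"])
print(f"Estimated <00|rho|00>: {value:.3f}")
```

A real executor would replace step 1 with a call to a hardware or simulator backend, while keeping the same input and output types.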
Once the executor is defined, implementing a standard error mitigation technique such as zero-noise extrapolation (ZNE) needs only a single line of code:

    from mitiq.zne import execute_with_zne

    zne_value = execute_with_zne(circuit, executor)

Codeblock 5: Using Mitiq to perform zero-noise extrapolation. The circuit is a supported quantum program type, and the executor is a function which executes the circuit and returns an expectation value.
The execute_with_zne function uses the executor to evaluate the input circuit at different noise levels, extrapolates back to the zero-noise limit and then returns this value as an estimate of the noiseless observable. Figure 2 shows a high-level workflow.
As described in section 4, there are multiple techniques to scale the noise in a quantum circuit and infer (extrapolate back to) the zero-noise limit. The default noise scaling method used by execute_with_zne is random local unitary folding [13] (see section 4.1), and the default inference technique is Richardson extrapolation (see section 4.2). Different techniques can be specified as optional arguments to execute_with_zne (codeblock 6).

Figure 2: Overview of the zero-noise extrapolation workflow in Mitiq. An input quantum program is converted into a set of noise-scaled circuits defined by a noise scaling method and a set of noise scale factors. These auxiliary circuits are executed on the back-end according to a user-defined executor function (see appendix A for examples), producing a set of noise-scaled expectation values. A classical inference technique is used to fit a model to these noise-scaled expectation values. Once the best-fit model is established, the zero-noise limit is returned to give an error-mitigated expectation value.

Codeblock 6: Providing arguments to execute_with_zne to use different noise scaling methods and inference techniques.
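The workflow of fig. 2 (scale the noise, execute, infer) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Mitiq's implementation: circuits are lists of gate labels, the executor is a toy model in which the ideal value 1 is damped exponentially with circuit length, noise is scaled by folding every gate (so only odd scale factors are used here, unlike Mitiq's defaults), and all function names are hypothetical.

```python
import math


def fold_all_gates(circuit, scale_factor):
    """Replace every gate G by G (G^dag G)^n for an odd scale factor 1 + 2n."""
    if scale_factor % 2 != 1:
        raise ValueError("this sketch only supports odd integer scale factors")
    n = int((scale_factor - 1) // 2)
    folded = []
    for gate in circuit:
        folded.append(gate)
        for _ in range(n):
            folded += [gate + "^dag", gate]
    return folded


def toy_executor(circuit, error_per_gate=0.02):
    """Assumed noise model: the ideal value 1 is damped once per gate."""
    return math.exp(-error_per_gate * len(circuit))


def richardson_extrapolate(scale_factors, expectation_values):
    """Evaluate at zero the unique polynomial through all data points."""
    zero_noise_value = 0.0
    for i, (x_i, y_i) in enumerate(zip(scale_factors, expectation_values)):
        weight = 1.0
        for j, x_j in enumerate(scale_factors):
            if j != i:
                weight *= x_j / (x_j - x_i)  # Lagrange basis evaluated at 0
        zero_noise_value += weight * y_i
    return zero_noise_value


def toy_zne(circuit, executor, scale_factors=(1, 3, 5), scale_noise=fold_all_gates):
    """Scale the noise, execute each scaled circuit, then extrapolate."""
    scaled_circuits = [scale_noise(circuit, s) for s in scale_factors]
    noisy_values = [executor(c) for c in scaled_circuits]
    return richardson_extrapolate(scale_factors, noisy_values)


circuit = ["H", "CNOT", "X", "CNOT", "H"]
raw = toy_executor(circuit)
mitigated = toy_zne(circuit, toy_executor)
print(f"unmitigated: {raw:.4f}, mitigated: {mitigated:.4f}")
```

For scale factors 1, 2 and 3, the same Lagrange weights reduce to the familiar combination 3E(1) − 3E(2) + E(3).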
In addition to zero-noise extrapolation, one might be interested in applying a different error mitigation technique. For example, probabilistic error cancellation (PEC) [2,4] is a method which promises to reduce the noise of a quantum computer with the only additional resource requirement being a higher sampling overhead.
Assuming the user has defined an executor as described above, PEC can be applied as follows:

    from mitiq.pec import execute_with_pec

    pec_value = execute_with_pec(circuit, executor, representations=representations)

Codeblock 7: Using Mitiq to perform probabilistic error cancellation. The circuit is a supported quantum program type, the executor is a function which executes the circuit and returns an expectation value, and the representations argument contains information about the quasi-probability representations of the ideal gates in terms of the noisy hardware gates. This codeblock is a template; a complete, executable example can be found in the Mitiq documentation (see section 7.2).
The execute_with_pec function internally samples from a quasi-probability representation of the input circuit that depends on the input representations of individual gates (see section 5.2 for more details on gate representations). The user-defined executor is used to run the sampled circuits. Eventually, execute_with_pec combines the results and returns an unbiased estimate of the ideal observable. As schematically represented in fig. 6, the workflow is very similar to the previous case of ZNE (shown in fig. 2) but, in this case, the noisy circuits are sampled probabilistically and executed at the base noise level of the underlying hardware (noise scaling is not used).
The code examples shown in codeblocks 4 to 7 demonstrate the main usage of Mitiq. Alternatives to the execute_with_zne and execute_with_pec functions are described in section 7.1. These alternatives implement the same methods but offer different ways to call them, which may be more convenient depending on context.
In the following section, we show results of benchmarks using Mitiq on IBM and Rigetti quantum processors as well as noisy simulators. We then explain the structure of the library in more detail.

Randomized benchmarking circuits
[Figure 3 panel titles: IBM Q London, Rigetti Aspen-8; axis label: Noise scaling]

Figure 3: Zero-noise extrapolation on two-qubit randomized benchmarking circuits run on (a) the IBMQ "London" quantum processor and (b) the Rigetti Aspen-8 quantum processor. Results are obtained from 50 randomized benchmarking circuits which contain, on average, 97 single-qubit gates and 17 two-qubit gates for (a) and 19 single-qubit gates and 7 two-qubit gates for (b). Noise is increased via random local unitary folding (see section 4.1), and markers show zero-noise values obtained by different extrapolation techniques (see section 4.2). For example, the red circle is obtained by fitting a quadratic polynomial to the data points (blue), whereas the purple square is obtained by fitting an exponential decay to the same data points. (Note that some markers are staggered for visualization, but all are extrapolated to the zero-noise limit.) In this example, the true zero-noise value is ⟨00|ρ|00⟩ = 1. For (b), qubits 32 and 33 are used on the Aspen-8 processor, while for (a) the same two qubits are not necessarily used for each run. For linear, quadratic and exponential extrapolations, all data points are used to fit the corresponding extrapolation functions. For Richardson extrapolation, we use only three data points (first, middle, and last), corresponding to a quadratic interpolation of the three points.

Figure 3 shows the effect of zero-noise extrapolation on two-qubit randomized benchmarking circuits run on both IBM and Rigetti quantum computers. The blue curve shows the expectation value ⟨00|ρ|00⟩ (which should be 1 for a noiseless circuit where ρ = |00⟩⟨00|) at different noise levels, and markers show mitigated observable values obtained from different inference techniques. Error bars show the standard deviation over fifty independent runs.
Depending on the noise model as well as base noise level, different inference techniques can provide better zero-noise estimates. The aim of the experiments shown in fig. 3 is to demonstrate how Mitiq can be used to easily apply different extrapolation techniques on different backends. In this work, we are not interested in a rigorous comparison of the performances of different extrapolation methods, since this would require a much more detailed experimental and statistical analysis.
We discuss inference techniques more in section 4.2 and the limitations of zero-noise extrapolation more in section 8.1.

Potential energy surface of H2
We now consider a canonical example of computing the potential energy surface of molecular hydrogen using the variational quantum eigensolver. We follow Ref. [24] and use the minimal STO-6G basis and Bravyi-Kitaev transformation to write the Hamiltonian for H2 as a linear combination of Pauli operators, Eq. (1). Here, g_i are numerical coefficients which depend on the atomic separation and I, X, Y and Z are Pauli operators. We use the same single-parameter variational circuit shown in fig. 1 of Ref. [24] and we minimize the expectation value of the Hamiltonian given in Eq. (1) via independent brute-force optimizations evaluated for different values of the atomic distance (bond length). We simulate the experiment with and without error mitigation, assuming the presence of single-qubit depolarizing noise with error probability p = 0.05 (acting after each layer of gates).
Fig. 4(b) shows the mitigated energy surfaces. To compute the mitigated curves, we use zero-noise extrapolation with random local unitary folding (see section 4.1) and second-order polynomial inference (see section 4.2). As can be seen, the mitigated curves overlap with the true noiseless curve much more closely than the unmitigated curves. The error is quantified in fig. 4(c).

Probabilistic error cancellation example
We finally consider a toy example where Mitiq is used to apply probabilistic error cancellation. Consider the simple two-qubit circuit shown in the inset of fig. 5, corresponding to U = CNOT_{1,2} · X_1 · H_2 (where H_2 is the Hadamard gate applied on the second qubit). Assume that we want to measure the expectation value of O = |00⟩⟨00|, whose exact theoretical value is zero. We also assume that each gate of the (simulated) backend is followed by local (single-qubit) depolarizing noise with error probability p = 0.1. Because of such noise, the unmitigated expectation value is nonzero (0.0622). However, after using Mitiq to implement PEC, one can improve the estimate by almost an order of magnitude (0.0071). The results are reported in fig. 5, where the histogram of the raw PEC samples is also visible.

Zero-noise extrapolation module
We now describe the Mitiq library in more detail. The module structure is shown in fig. 1 and includes a module to interface with supported quantum programming frameworks, several modules associated to different error mitigation techniques, and a module for benchmarking such techniques.
In this section, we focus on the zero-noise extrapolation module mitiq.zne, while other error mitigation modules are considered in the next sections.
Zero-noise extrapolation was first introduced in [2,3] and works by intentionally increasing (scaling) the noise of a quantum computation to then extrapolate back to the zero-noise limit. More specifically, let ρ be a state prepared by a quantum circuit and E† = E be an observable. We wish to estimate Tr[ρE] ≡ ⟨E⟩ as though we had an ideal (noiseless) quantum computer, but there is a base noise level γ_0 which prevents us from doing so. For example, γ_0 could be the strength of a depolarizing channel in the circuit. The idea of zero-noise extrapolation is to compute the noisy expectation values
$$\langle E(\gamma_i) \rangle \quad \text{for} \quad \gamma_i = \lambda_i \gamma_0, \tag{2}$$
where the (real) coefficients λ_i ≥ 1 scale the base noise γ_0 of the quantum computer. After this, a curve is fit to the data collected via Eq. (2), which is then extrapolated to the zero-noise limit. This produces an estimate of the noiseless expectation value ⟨E⟩.
To implement zero-noise extrapolation, we thus need two subroutines:
1. A means of scaling the noise γ_i = λ_i γ_0 for different scale factors λ_i, and
2. A means of fitting a curve to the noisy expectation values and extrapolating to the zero-noise limit.
In the remainder of this section, we describe how these subroutines are implemented in Mitiq, showing several methods for both noise scaling as well as fitting/extrapolation, which we also refer to as inference.

Noise scaling
In one of the first formulations of zero-noise extrapolation [2], noise is scaled in superconducting processors by implementing pulses at lower amplitudes for longer time intervals.
Considering that most quantum programming languages support gate-model circuits and not pulse-level access, it can be convenient to scale noise in a manner which acts on unitary gates instead of underlying pulses. For this reason, Mitiq implements unitary folding, introduced in [13], as a noise scaling method.

Unitary Folding
Unitary folding works by mapping gates (or groups of gates) G to
$$G \mapsto G G^\dagger G. \tag{3}$$
This leaves the ideal effect of the circuit invariant but increases its depth. If G is a gate of the circuit, we refer to the process as local folding. If G is the entire circuit, we call it global folding. In Mitiq, folding functions input a circuit and a scale factor, i.e., a number by which to increase the depth of the circuit. (In Eq. (2), each coefficient λ_i is a scale factor.) The minimum scale factor is one (which corresponds to folding no gates), a scale factor of three corresponds to folding all gates, and scale factors beyond three fold some or all gates more than once.
For local folding, there is a degree of freedom for which gates to fold first. The order in which gates are folded can affect how the noise is scaled and thus the overall effectiveness of zero-noise extrapolation. Because of this, Mitiq defines several local folding functions in mitiq.zne.scaling, including:
1. fold_gates_from_left
2. fold_gates_from_right
3. fold_gates_at_random
We explain how these functions work with the following example. We first define a circuit, here in Cirq, which for simplicity creates a Bell state. We can now use a local folding function, e.g. fold_gates_from_left, to fold this circuit.

    import cirq
    from mitiq.zne import scaling

    # Bell-state circuit: a Hadamard gate followed by a CNOT gate.
    qreg = cirq.LineQubit.range(2)
    circ = cirq.Circuit(cirq.H(qreg[0]), cirq.CNOT(qreg[0], qreg[1]))

    folded = scaling.fold_gates_from_left(circ, scale_factor=2)
    print("Folded circuit:", folded, sep="\n")

We see that the first Hadamard gate H has been transformed as H → H H† H, to scale the depth of the circuit by a factor of two.
In Mitiq, folding functions do not modify the input circuit. Because of this, we can input the same circuit to fold_gates_from_right to see the effect of this method.

    folded = scaling.fold_gates_from_right(circ, scale_factor=2)
    print("Folded circuit:", folded, sep="\n")

Here, we see that the second (CNOT) gate is folded instead of the first (Hadamard) gate, as expected when we start folding from the right (or end) of the circuit instead of the left (or start) of the circuit.
The previous functions fold gates according to the following rules:
1. If the scale factor is an odd integer 1 + 2n, all gates are folded n times.
2. A generic real scale factor can always be written as λ = 1 + 2(n + δ), where n is a non-negative integer and 0 ≤ δ < 1. In this case, all gates are folded n times and, moreover, a subset of gates is folded one more time to better approximate the scale factor. The choice of this subset of gates can be random (in fold_gates_at_random) or deterministic (in fold_gates_from_left and fold_gates_from_right).
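The two folding rules can be illustrated with a short, library-free sketch. The function fold_local and the representation of a circuit as a list of gate labels are hypothetical and only mimic the behaviour described above. In particular, the rounding rule used to pick the number of extra folds (the number of single-gate folds k is chosen so that 1 + 2k/d best matches the scale factor for a circuit of d gates) is an assumption of this sketch.

```python
import random


def dagger(gate):
    """Label of the inverse gate; the inverse of 'G^dag' is 'G' again."""
    return gate[:-4] if gate.endswith("^dag") else gate + "^dag"


def fold_local(circuit, scale_factor, order="left", seed=None):
    """Fold gates of `circuit` (a list of labels) as G -> G G^dag G."""
    d = len(circuit)
    if d == 0:
        return []
    # Assumption: choose the number of single-gate folds k such that
    # 1 + 2k/d is as close as possible to the requested scale factor.
    k = round(d * (scale_factor - 1) / 2)
    n, extra = divmod(k, d)  # all gates folded n times, `extra` gates once more
    indices = list(range(d))
    if order == "right":
        indices.reverse()
    elif order == "random":
        random.Random(seed).shuffle(indices)
    chosen = set(indices[:extra])
    folded = []
    for i, gate in enumerate(circuit):
        folded.append(gate)
        for _ in range(n + (1 if i in chosen else 0)):
            folded += [dagger(gate), gate]
    return folded


bell = ["H", "CNOT"]
print(fold_local(bell, 2, order="left"))   # ['H', 'H^dag', 'H', 'CNOT']
print(fold_local(bell, 2, order="right"))  # ['H', 'CNOT', 'CNOT^dag', 'CNOT']
print(fold_local(bell, 3))                 # every gate folded once
```

With order="random", the subset of gates that receives the extra fold depends on the seed, which mirrors the non-deterministic behaviour of random local folding.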
We emphasize that, although these examples used a Cirq Circuit, circuits can be defined in any supported quantum programming language and the interface is the same as above. In addition to Cirq, Mitiq supports other quantum libraries as listed in table 1. By default, all folding functions return a circuit with the same type as the input circuit.
In the previous examples, each folded gate counts equally in the folded circuit depth. However, this may not be a reasonable assumption for realistic hardware as different gates have different noise levels. Because of this, each folding function in Mitiq supports "folding by fidelity." This works by passing an input dictionary of gate fidelities (either known or estimated) as an optional argument to a folding function. More details on folding by fidelity can be found in Mitiq's documentation.
Finally, we mention global folding. In contrast to local folding, which folds subsets of gates, global folding folds the entire circuit until the input scale factor is reached. Below we show an example of global folding using the same Bell state circuit circ defined above. Here, we see that the entire Bell state circuit has been folded once to reach the input scale factor of three. If the input scale factor is not reached by an integer number of global folds, fold_global will fold a group of gates from the end of the circuit such that the scale factor is reached.
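Global folding can be sketched in the same library-free style. The function fold_global_sketch below is hypothetical and only follows the description in the text: the whole circuit C is mapped to C (C† C)^n and, if needed, a group of gates at the end of the circuit is folded once more. The rounding rule for the size of that group is an assumption of this sketch.

```python
def dagger(gate):
    """Label of the inverse gate; the inverse of 'G^dag' is 'G' again."""
    return gate[:-4] if gate.endswith("^dag") else gate + "^dag"


def inverse(circuit):
    """Inverse of a circuit: reversed order, each gate replaced by its dagger."""
    return [dagger(gate) for gate in reversed(circuit)]


def fold_global_sketch(circuit, scale_factor):
    """Fold the whole circuit, then a tail of gates, to reach the scale factor."""
    d = len(circuit)
    if d == 0:
        return []
    # Assumption: same rounding rule as for local folding, 1 + 2k/d ~ scale factor.
    k = round(d * (scale_factor - 1) / 2)
    n, extra = divmod(k, d)
    folded = list(circuit)
    for _ in range(n):
        folded += inverse(circuit) + list(circuit)  # one full global fold
    if extra:
        tail = circuit[-extra:]
        folded += inverse(tail) + list(tail)        # partial fold from the end
    return folded


bell = ["H", "CNOT"]
print(fold_global_sketch(bell, 3))
# ['H', 'CNOT', 'CNOT^dag', 'H^dag', 'H', 'CNOT']
print(fold_global_sketch(bell, 2))
# ['H', 'CNOT', 'CNOT^dag', 'CNOT']
```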

Parameter-noise scaling
A gate is an abstract elementary operation which, however, is physically implemented as a continuous dynamical evolution. This evolution is generated by a suitable time-dependent control of a Hamiltonian that depends on the details of the hardware. Errors in the calibration of control pulses (e.g. pulse-area errors) or the classical noise affecting their implementation (e.g. electronic noise) can generate a dynamical channel which is different from the desired ideal gate.
In order to mitigate these types of errors, we need a practical way of scaling them. In principle this would require detailed knowledge of the platform-dependent pulses and Hamiltonians; in Mitiq, however, a simplified noise model is used instead. The simplified model is based on the fact that any unitary gate G can always be expressed as G = exp(−iH), for some constant Hamiltonian H = H† (which may be different from the physical one). Therefore, each unitary gate admits a natural parametrization with respect to a real exponent θ:
$$G(\theta) = \exp(-i \theta H), \qquad G(1) = G. \tag{4}$$
A multi-parameter version of Eq. (4) was considered in [13], but is currently not used in Mitiq. It is also worth mentioning that gates are often directly defined in the parametric form of Eq. (4) as, for example, in the case of Pauli rotations. In this setting, a noise model approximately modeling calibration and control errors can be expressed with respect to the classical parameter θ. We can assume that the actual gate is generated by a noisy parameter θ̂ that we can model as a random variable with mean θ and with some variance σ². Noise scaling can be achieved by artificially injecting additional classical noise:
$$\hat{\theta} \to \hat{\theta}^{(\lambda)} = \hat{\theta} + \hat{\delta}, \tag{5}$$
where δ̂ is a random variable, independent of θ̂, with zero mean and variance equal to (λ − 1)σ², such that the resulting noise-scaled parameter θ̂^(λ) has mean θ and variance σ² + (λ − 1)σ² = λσ².
In practice, if σ² is known for each noisy gate, parameter scaling can be obtained by randomly over-rotating or under-rotating each gate according to the stochastic angles defined in Eq. (5). This noise scaling technique can be applied with Mitiq as shown in codeblock 12.

Codeblock 12: Applying parameter-noise scaling to a quantum circuit. The same base level of noise (base_variance) is assumed for each gate of the circuit.

If the value of the base noise σ² is unknown, it needs to be estimated in order to apply this noise scaling method. The function compute_parameter_variance in the sub-module mitiq.zne.scaling can be used for this task. Alternatively, the user may independently perform a custom estimation of σ² and only use Mitiq for the noise scaling step described in codeblock 12.
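The variance bookkeeping behind Eq. (5) can be checked numerically with a small sketch. The function scale_parameter_noise is hypothetical, and both the hardware noise and the injected noise are assumed to be Gaussian purely for illustration; the text only fixes their means and variances.

```python
import math
import random
import statistics


def scale_parameter_noise(theta_noisy, scale_factor, base_variance, rng):
    """Return theta_noisy + delta with Var[delta] = (scale_factor - 1) * base_variance."""
    if scale_factor < 1:
        raise ValueError("scale_factor must be at least 1")
    extra_std = math.sqrt((scale_factor - 1) * base_variance)
    return theta_noisy + rng.gauss(0.0, extra_std)


rng = random.Random(1234)
theta, sigma2, lam = math.pi / 2, 0.01, 3.0

samples = []
for _ in range(20000):
    # Hardware noise on the gate angle (assumed Gaussian with variance sigma2).
    theta_hw = rng.gauss(theta, math.sqrt(sigma2))
    # Artificially injected noise that scales the total variance by lam.
    samples.append(scale_parameter_noise(theta_hw, lam, sigma2, rng))

print("mean:", statistics.mean(samples))          # close to theta
print("variance:", statistics.variance(samples))  # close to lam * sigma2
```

With a scale factor of one, no noise is injected and the input angle is returned unchanged.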
The full application of ZNE obtained via the parameter-noise scaling method is shown in the next Codeblock.

Using noise scaling methods in execute_with_zne
As mentioned in section 2.2, the default noise scaling method in execute_with_zne is fold_gates_at_random. Different methods can be used by passing an optional argument to execute_with_zne. For example, to use global folding, one can do the following.

    from mitiq.zne import execute_with_zne
    from mitiq.zne.scaling import fold_global

    zne_value = execute_with_zne(circuit, executor, scale_noise=fold_global)

Depending on the noise model of the quantum processor, using a different folding method may better scale the noise and lead to better results.
To end the discussion on noise scaling, we note that some scaling methods are deterministic while some are non-deterministic. In particular, global folding and local folding from left/right return the same folded circuit if the scale factor is the same, but fold_gates_at_random can return different circuits for the same scale factor. Because of this, the function execute_with_zne has another optional argument num_to_average which corresponds to the number of times to compute expectation values at the same scale factor. For example, if num_to_average = 3, the noise scaling method is called three times at each scale factor, and the expectation value at this scale factor is the average over the three runs. Such averaging can smooth out effects due to non-deterministic noise scaling and lead to better results in zero-noise extrapolation. fig. 4(b) uses fold_gates_at_random with num_to_average = 5.

Classical inference: Factory objects
In Mitiq, a Factory object is a self-contained representation of a classical inference technique. In effect, it performs the "extrapolation" part of zero-noise extrapolation. This representation is hardware-agnostic and even quantum-agnostic since it only deals with classical data, namely the input and output of a noisy computation. The main tasks of a factory are as follows:
1. Compute the expectation value by running an executor function at a given noise level, and record the result;
2. Determine the next noise level at which the expectation value should be computed;
3. Perform classical inference using the history of noise levels and expectation values to compute the zero-noise extrapolated value.
The structure of a Factory is designed to account for adaptive fitting techniques in which the next noise level depends on the history of previous noise levels and expectation values. In Mitiq, (adaptive) fitting techniques in zero-noise extrapolation are represented by specific factory objects. All built-in factories, summarized in table 2, can be imported from the mitiq.zne.inference module.
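The three tasks of a factory can be illustrated with a minimal, purely classical sketch. The class ToyLinearFactory and its method names are hypothetical and do not reproduce the interface of Mitiq's Factory classes; the sketch is non-adaptive and uses an ordinary least-squares line as its inference step.

```python
class ToyLinearFactory:
    """Hypothetical non-adaptive factory: records data, then fits a line."""

    def __init__(self, scale_factors):
        if len(set(scale_factors)) < 2:
            raise ValueError("a linear fit needs at least two distinct scale factors")
        self.scale_factors = list(scale_factors)
        self.history = []  # recorded (scale factor, expectation value) pairs

    def next_scale_factor(self):
        """Task 2: the next noise level to measure, or None when finished."""
        done = len(self.history)
        if done < len(self.scale_factors):
            return self.scale_factors[done]
        return None

    def collect(self, noisy_expectation):
        """Task 1: evaluate the noisy expectation value and record each result."""
        scale_factor = self.next_scale_factor()
        while scale_factor is not None:
            self.history.append((scale_factor, noisy_expectation(scale_factor)))
            scale_factor = self.next_scale_factor()
        return self

    def extrapolate(self):
        """Task 3: least-squares line through the history, evaluated at zero."""
        xs = [x for x, _ in self.history]
        ys = [y for _, y in self.history]
        x_mean = sum(xs) / len(xs)
        y_mean = sum(ys) / len(ys)
        slope = sum((x - x_mean) * (y - y_mean) for x, y in self.history) / sum(
            (x - x_mean) ** 2 for x in xs
        )
        return y_mean - slope * x_mean


# Toy data: an expectation value that decreases linearly with the noise level.
factory = ToyLinearFactory([1.0, 1.5, 2.0, 2.5, 3.0])
zero_noise_value = factory.collect(lambda s: 1.0 - 0.1 * s).extrapolate()
print(zero_noise_value)
```

An adaptive factory would differ only in next_scale_factor, which would inspect the recorded history before choosing the next noise level.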

Using factories in execute_with_zne to perform different extrapolation methods
We now show examples of performing zero-noise extrapolation with fitting techniques defined by factories in table 2. As mentioned in section 2.2, this is done by providing a factory as an optional argument to execute_with_zne.

Class                Extrapolation Method
LinearFactory        Extrapolation with a linear fit.
RichardsonFactory    Richardson extrapolation.
PolyFactory          Extrapolation with a polynomial fit.
ExpFactory           Extrapolation with an exponential fit.
PolyExpFactory       Similar to ExpFactory but the exponent can be a nonlinear polynomial.
AdaExpFactory        Similar to ExpFactory but the noise scale factors are adaptively chosen.

Table 2: Built-in factories available in the mitiq.zne.inference module.

To instantiate a non-adaptive factory, we input the noise scale factors we want to compute the expectation values at, as shown below for the LinearFactory.

    from mitiq.zne.inference import LinearFactory

Here the scale_factors define the noise levels at which to compute expectation values during zero-noise extrapolation. This factory can now be used as an argument in execute_with_zne as follows. As in section 2.2, the circuit is the quantum program which prepares a state of interest and the executor is a function which executes the circuit and returns the expectation value of an observable.

    from mitiq.zne import execute_with_zne

Instead of the default Richardson extrapolation at noise scale factors 1, 2 and 3, this call to execute_with_zne will perform linear extrapolation at the specified noise scale factors. As mentioned in section 4.1, different noise scaling methods can also be used with the optional argument scale_noise.
Most extrapolation techniques implemented in Mitiq are static (i.e., non-adaptive) and can be instantiated in a similar manner as the LinearFactory. For example, to use a second-order polynomial fit, we use a PolyFactory object as follows.

    from mitiq.zne import execute_with_zne
    from mitiq.zne.inference import PolyFactory

Other static factories follow similar patterns but may have additional arguments in their constructors. For example, ExpFactory can take in a value for the horizontal asymptote of the exponential fit. For full details, see the Mitiq documentation.
Last, we show an example of an adaptive fitting technique defined by the AdaExpFactory. To use this method (introduced and described in Ref. [13]), we can do the following:

    from mitiq.zne import execute_with_zne
    from mitiq.zne.inference import AdaExpFactory

Instead of a list of scale factors, here we provide the initial scale factor and the rest are determined adaptively. The number of scale factors determined is equal to the argument steps. Additional arguments which can be passed into the AdaExpFactory are described in the Mitiq documentation. This factory can now be used as an argument in execute_with_zne to use the custom fitting technique. Other fitting techniques can be defined in a similar manner as the code block above.

Probabilistic error cancellation module
Probabilistic error cancellation (PEC) [2,4] is another error mitigation technique which is available in Mitiq. Its workflow is schematically represented in fig. 6: a set of auxiliary circuits are probabilistically sampled, executed on a noisy backend and, eventually, the noisy results are post-processed to infer an error-mitigated expectation value. In principle, this method can probabilistically remove the noise of a quantum computer without additional resources apart from a higher sampling overhead. More information about the advantages and the limitations of PEC is given in section 8.2.
A key step of PEC is to represent each ideal unitary gate $\mathcal{G}$ in a circuit as an average over a set of noisy gates which are physically implementable $\{\mathcal{O}_\alpha\}$, weighted by a real quasi-probability distribution η(α):
$$\mathcal{G} = \sum_\alpha \eta(\alpha)\, \mathcal{O}_\alpha, \tag{6}$$
where $\sum_\alpha \eta(\alpha) = 1$ (trace-preserving condition). The calligraphic operators $\mathcal{G}$ and $\{\mathcal{O}_\alpha\}$ should be considered as linear super-operators acting on density matrices and not on state vectors [2,4]. If a representation like Eq. (6) is known for each ideal gate of a circuit, then any ideal expectation value can be estimated as a Monte Carlo average over different noisy circuits, each one sampled according to the quasi-probability distributions associated with the ideal gates [2,4]. The real coefficients η(α) which appear in Eq. (6) can be negative for some values of α and, because of this negativity, the required number of Monte Carlo samples can be large [2,4]. In principle, assuming a perfect tomographic knowledge of the noisy gates $\mathcal{O}_\alpha$, this method allows for a perfect cancellation of the hardware noise (for a sufficiently large number of samples).
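For a single gate, the Monte Carlo average mentioned above can be written out explicitly. The following is a standard rewriting of Eq. (6), given here as a sketch; the symbols γ, p(α), N and ⟨A⟩ are introduced only for this illustration.

```latex
\begin{aligned}
\gamma &= \sum_\alpha |\eta(\alpha)|, \qquad
p(\alpha) = \frac{|\eta(\alpha)|}{\gamma}, \\
\mathcal{G} &= \gamma \sum_\alpha p(\alpha)\, \mathrm{sgn}[\eta(\alpha)]\, \mathcal{O}_\alpha, \\
\langle A \rangle_{\mathrm{ideal}}
  &= \gamma \sum_\alpha p(\alpha)\, \mathrm{sgn}[\eta(\alpha)]\, \langle A \rangle_\alpha
  \approx \frac{\gamma}{N} \sum_{k=1}^{N} \mathrm{sgn}[\eta(\alpha_k)]\, \langle A \rangle_{\alpha_k},
  \qquad \alpha_k \sim p .
\end{aligned}
```

Since the coefficients sum to one, γ ≥ 1, with equality only if no coefficient is negative. Each sampled term is rescaled by γ, so the statistical error of the estimate scales as γ/√N, and reaching a fixed precision requires a number of samples proportional to γ². For a circuit with several gates, the factors γ of the individual gates multiply, which is the origin of the sampling overhead of PEC.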
In the remainder of this section, we describe how one can define gate representations and how one can probabilistically sample from them using Mitiq.

Noisy Operations
The r.h.s. of Eq. (6) is a sum over noisy operations O_α. A noisy operation is an elementary gate (or a small sequence of gates) acting on specific qubits which can be physically implemented on hardware. To each noisy operation we can associate a (small) QPROGRAM describing the gates to be applied on the physical qubits. Moreover, from a quantum tomography analysis, one can also associate to a noisy operation a numerical matrix representing the completely-positive and trace-preserving channel induced by the operation. In Mitiq, this concept is captured by the NoisyOperation class, which can be initialized as follows:

    from mitiq.pec.types import NoisyOperation

Once the set of all noisy operations {O_α} has been defined, we can associate to each operation the corresponding quasi-probability η(α) via a simple Python dictionary:

    basis_expansion = {
        <1st noisy operation>: <1st real coefficient>,
        <2nd noisy operation>: <2nd real coefficient>,
        ...
    }

Codeblock 21: Defining a basis expansion as a Python dictionary which associates a real coefficient to each noisy operation.

OperationRepresentation Objects
The dictionary in codeblock 21 completely defines the linear combination in the r.h.s. of Eq. (6) but it contains no information about the l.h.s. of Eq. (6). This motivates the use of an OperationRepresentation class which can be used to store and manipulate all the information which is contained in Eq. (6).

Codeblock 22: Initializing an OperationRepresentation object. The first argument is the ideal operation that we want to express as a linear combination of noisy operations. The second argument is the associated basis_expansion which can be defined as shown in codeblock 21.
Given a list of OperationRepresentation objects, associated to all the gates of a circuit of interest, the user can easily apply PEC via the function execute_with_pec as shown in codeblock 7 of section 2.

How to determine the quasi-probability representations?
In practice, depending on how detailed the knowledge of the hardware noise model is, there are two main ways of deriving quasi-probability representations for PEC.

Method 1:
If the hardware noise model is well approximated by a simplified theoretical quantum channel (e.g. depolarizing or amplitude damping), one can typically apply known analytical expressions to compute the quasi-probability representations of arbitrary gates [2].

Method 2:
Assuming an over-simplified noise model may be a bad approximation. In this case, the suggested approach is to perform the complete process tomography of a basis set of implementable noisy operations (e.g. the native gate set of the backend). Given the superoperators of the noisy implementable operations, one can obtain the quasi-probability representations as solutions of numerical optimization problems [2]. In Mitiq, this is possible through the find_optimal_representation() function that can be imported from mitiq.pec.representations. An example showing how to use this function is given in the section called "What additional options are available in PEC?" of the Mitiq documentation (see section 7.2).
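As an illustration of Method 1, the sketch below works out a quasi-probability representation for single-qubit depolarizing noise. It is not Mitiq code. It assumes that the noise channel $\rho \to (1-\epsilon)\rho + \epsilon I/2$ acts after the gate and that the Pauli corrections themselves are noiseless, and it checks the resulting expansion in the Pauli-transfer-matrix (PTM) picture, where all the channels involved are diagonal. The function names and the parameter `epsilon` are hypothetical.

```python
# Diagonal PTM entries, ordered (I, X, Y, Z), of the channels rho -> P rho P
# for each single-qubit Pauli P.
PAULI_PTM = {
    "I": (1.0, 1.0, 1.0, 1.0),
    "X": (1.0, 1.0, -1.0, -1.0),
    "Y": (1.0, -1.0, 1.0, -1.0),
    "Z": (1.0, -1.0, -1.0, 1.0),
}


def depolarizing_ptm(epsilon):
    """PTM diagonal of rho -> (1 - epsilon) * rho + epsilon * I / 2."""
    return (1.0, 1.0 - epsilon, 1.0 - epsilon, 1.0 - epsilon)


def depolarizing_basis_expansion(epsilon):
    """Coefficients eta(P) with sum_P eta(P) * (Pauli P after the noise) = identity.

    Assumption: ideal Pauli corrections, noise strength epsilon < 1.
    """
    ratio = epsilon / (1.0 - epsilon)
    eta_identity = 1.0 + 0.75 * ratio
    eta_pauli = -0.25 * ratio
    return {"I": eta_identity, "X": eta_pauli, "Y": eta_pauli, "Z": eta_pauli}


def reconstructed_ptm(epsilon):
    """Evaluate the r.h.s. of the expansion, entry by entry, in the PTM picture."""
    noise = depolarizing_ptm(epsilon)
    expansion = depolarizing_basis_expansion(epsilon)
    return tuple(
        sum(eta * PAULI_PTM[label][k] * noise[k] for label, eta in expansion.items())
        for k in range(4)
    )


epsilon = 0.05
expansion = depolarizing_basis_expansion(epsilon)
one_norm = sum(abs(eta) for eta in expansion.values())
print("coefficients:", expansion)
print("sum of coefficients:", sum(expansion.values()))
print("one-norm:", one_norm)
print("reconstructed PTM diagonal:", reconstructed_ptm(epsilon))
```

The reconstructed diagonal is that of the identity channel, so composing the expansion with any ideal gate recovers that gate. The one-norm of the coefficients exceeds one by an amount that grows with `epsilon`, which is the source of the sampling overhead of PEC.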

Sampling Functions
The function execute_with_pec internally performs the Monte Carlo sampling process which is necessary to estimate an expectation value with PEC. However, the user may be interested in manually sampling gates and circuits for a variety of reasons, e.g., for research purposes, for intermediate manipulations, for efficiency optimizations, etc.
In particular, to sample an implementable NoisyOperation from the quasi-probability distribution of an ideal operation, one can proceed as follows:

    noisy_operation, sign, eta = operation_rep.sample()

Codeblock 23: Sampling an implementable NoisyOperation from the quasi-probability representation of an ideal operation. The quasi-probability representation is given by the OperationRepresentation object defined in codeblock 22. In addition to the sampled noisy_operation, the method sample() returns the associated coefficient (eta) that appears in Eq. (6) and its sign (sign).
Typically, one is interested in sampling an entire implementable circuit from the quasi-probability representation of an ideal circuit. This can be easily achieved via the sample_circuit function, which internally performs repeated calls to the sample_sequence function:

    from mitiq.pec.sampling import sample_circuit

Codeblock 24: Sampling an implementable circuit from the quasi-probability representation of an ideal_circuit. This quasi-probability distribution is implicitly deduced from the input list of OperationRepresentation objects associated with the gates of the input ideal_circuit.
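The sampling logic described in this section can be sketched in plain Python, independently of Mitiq. In the toy example below, operations are represented by string labels, the coefficients in `REPRESENTATIONS` and the numbers in `OP_VALUES` are made up, and `toy_executor` is a hypothetical stand-in for a noisy backend whose output is multiplicative over operations, chosen only so that the exact answer (here 1.0) is known. Unlike the sample() method described above, the helper returns the one-norm of the coefficients rather than the sampled coefficient.

```python
import random

# Hypothetical quasi-probability representations of two ideal gates.
# Keys label implementable noisy operations, values are the coefficients eta.
REPRESENTATIONS = {
    "H": {"H_noisy": 1.2, "X H_noisy": -0.1, "Z H_noisy": -0.1},
    "CNOT": {"CNOT_noisy": 1.3, "ZI CNOT_noisy": -0.3},
}

# Made-up contribution of each noisy operation to the toy expectation value.
OP_VALUES = {
    "H_noisy": 0.9, "X H_noisy": 0.5, "Z H_noisy": 0.3,
    "CNOT_noisy": 0.85, "ZI CNOT_noisy": 0.35,
}


def sample_operation(basis_expansion, rng):
    """Sample one noisy operation with probability |eta| / gamma.

    Returns the label, the sign of its coefficient and gamma (the one-norm).
    """
    labels = list(basis_expansion)
    weights = [abs(basis_expansion[label]) for label in labels]
    gamma = sum(weights)
    label = rng.choices(labels, weights=weights, k=1)[0]
    sign = 1 if basis_expansion[label] > 0 else -1
    return label, sign, gamma


def sample_toy_circuit(ideal_circuit, representations, rng):
    """Sample an implementable circuit gate by gate; signs and norms multiply."""
    sampled, total_sign, total_norm = [], 1, 1.0
    for gate in ideal_circuit:
        label, sign, gamma = sample_operation(representations[gate], rng)
        sampled.append(label)
        total_sign *= sign
        total_norm *= gamma
    return sampled, total_sign, total_norm


def toy_executor(sampled_circuit):
    """Toy noisy expectation value: product of the per-operation values."""
    value = 1.0
    for label in sampled_circuit:
        value *= OP_VALUES[label]
    return value


def pec_estimate(ideal_circuit, num_samples, seed=0):
    """Average norm * sign * (noisy value) over sampled circuits."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        sampled, sign, norm = sample_toy_circuit(ideal_circuit, REPRESENTATIONS, rng)
        total += norm * sign * toy_executor(sampled)
    return total / num_samples


estimate = pec_estimate(["H", "CNOT"], num_samples=20000)
print("PEC estimate:", estimate)
```

The estimate fluctuates around the ideal value with a statistical error that shrinks as the number of samples increases. The product of the one-norms of all the gates sets the scale of these fluctuations.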

Clifford data regression module
In this section, we present the mitiq.cdr module, which implements two recent error mitigation approaches known as Clifford data regression (CDR) and variable noise Clifford data regression (vnCDR) [5,6]. In both techniques, a trained regression model mapping noisy to exact expectation values is used to mitigate the effect of noise on some observable of interest. The model is trained using data produced by the execution of near-Clifford circuits performed on a noisy quantum computer and on a classical simulator.

[Figure: schematic of CDR and vnCDR. Training data of noisy and exact expectation values is generated using near-Clifford circuits, which are classically simulable. This data is used to fit a linear ansatz, which is then used to estimate the noise-free value for some observable of interest E. We can visualize vnCDR as adding another axis to the training data, along which noise is increased. Diagram modified from [5].]

Clifford data regression (CDR)
The Clifford data regression [5] technique uses near-Clifford quantum circuit data to learn a model approximating the effects of the noise on an expectation value of an observable, $E = \mathrm{Tr}\,\rho E$, for a quantum state $\rho$ given by a quantum circuit of interest. The learned model is used to mitigate the noisy expectation value $E(\gamma_0)$ obtained with a quantum computer with the base noise level $\gamma_0$. The mitigated expectation value $E_{\text{mitigated}}$ is obtained using the following procedure:

Variable noise Clifford data regression (vnCDR)
CDR can be generalized to enable learning the noise effects from near-Clifford training circuits simulated at different noise levels $\lambda_l$. This approach is called variable noise Clifford data regression [6] and can be used to learn a zero-noise extrapolation model for an observable E and a quantum circuit preparing the state $\rho$. The vnCDR procedure to obtain $E_{\text{mitigated}}$ includes the evaluation of the training circuits on a quantum computer at different noise rates $\lambda_l \gamma_0$ and the fitting of an extrapolation model of the form

$$\sum_l a_l x_l + b,$$

where $x_l$ is the noisy expectation value at noise level $\lambda_l$, and $a_l$ and $b$ are the fit parameters.
The final step (step 4) of the procedure is to use the fitted ansatz to correct the noisy expectation values of E. The default linear ansatz used within Mitiq includes a constant term. Recently this was shown to lead to better mitigated results on real quantum hardware [25].
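The fit of a linear ansatz with a constant term can be illustrated with a small, self-contained sketch. This is not the Mitiq implementation: the synthetic noise model `toy_noisy_value`, the circuit-dependent `drift` parameter and the training values are assumptions made for illustration, and the least-squares problem is solved with a hand-written routine that does not guard against singular systems.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_linear_ansatz(noisy_rows, exact_values):
    """Least-squares fit of exact ~ sum_l a_l * x_l + b via the normal equations."""
    design = [list(row) + [1.0] for row in noisy_rows]  # last column -> constant b
    size = len(design[0])
    gram = [[sum(r[i] * r[j] for r in design) for j in range(size)] for i in range(size)]
    moment = [sum(r[i] * y for r, y in zip(design, exact_values)) for i in range(size)]
    return solve_linear_system(gram, moment)


def toy_noisy_value(exact, drift, scale_factor):
    """Synthetic noise model (an assumption): linear in the noise scale factor."""
    return exact + scale_factor * (-0.2 * exact + 0.05 * drift)


scale_factors = [1.0, 3.0]
# (exact value, circuit-dependent drift) for hypothetical training circuits.
training = [(1.0, 0.2), (-1.0, 0.5), (0.0, -0.3), (0.5, 0.9), (-0.5, 0.1)]
noisy_rows = [[toy_noisy_value(y, s, lam) for lam in scale_factors] for y, s in training]
exact_values = [y for y, _ in training]

*coefficients, constant = fit_linear_ansatz(noisy_rows, exact_values)

# Apply the fitted ansatz to the circuit of interest. Its exact value (0.9)
# is used only to generate the synthetic noisy data.
target_noisy = [toy_noisy_value(0.9, 0.3, lam) for lam in scale_factors]
mitigated = sum(a * x for a, x in zip(coefficients, target_noisy)) + constant
print("a_l:", coefficients, "b:", constant, "mitigated:", mitigated)
```

Because the synthetic noise is exactly linear in the scale factor, the fitted coefficients coincide with those of a linear extrapolation to zero noise from the scale factors 1 and 3, which shows in what sense vnCDR learns an extrapolation model from training data. With real hardware data the fit is only approximate.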

Applying CDR and vnCDR with Mitiq
Clifford data regression is implemented in Mitiq according to the workflow schematically represented in fig. 8. This error mitigation technique can be applied with the following codeblock.

Codeblock 25: Applying CDR with Mitiq. The function execute_with_cdr can be used to mitigate errors in the expectation values of the input observables. The input executor is a user-defined function for running the input circuit and the associated training circuits on a quantum backend. The input simulator is the ideal counterpart of the noisy executor and is necessary to obtain exact classical simulations of the (near-Clifford) training circuits.
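The respective roles of the executor and of the simulator can be illustrated with a toy sketch that does not use Mitiq. Here a "circuit" is just a list of rotation angles, `simulator` and `executor` are hypothetical functions returning an ideal and a noisy expectation value, and the training circuits are obtained by rounding all but one angle to the nearest multiple of $\pi/2$. All of these choices are assumptions made for illustration and do not reflect how Mitiq builds its training set.

```python
import math


def simulator(angles):
    """Toy ideal expectation value (an assumption, not a circuit simulation)."""
    return sum(math.cos(theta) for theta in angles) / len(angles)


def executor(angles):
    """Toy noisy backend: damps the ideal value and adds a small offset."""
    return 0.7 * simulator(angles) + 0.05


def snap_to_clifford(theta):
    """Round a rotation angle to the nearest multiple of pi / 2."""
    return round(theta / (math.pi / 2)) * (math.pi / 2)


def training_circuits(angles):
    """Near-Clifford variants: keep one angle unchanged, snap all the others."""
    return [
        [theta if j == keep else snap_to_clifford(theta) for j, theta in enumerate(angles)]
        for keep in range(len(angles))
    ]


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


circuit = [0.3, 1.1, 2.0, 2.9, 4.0, 5.2]
train = training_circuits(circuit)
noisy_train = [executor(c) for c in train]    # would run on the quantum backend
exact_train = [simulator(c) for c in train]   # would run on a classical simulator
a, b = fit_line(noisy_train, exact_train)

unmitigated = executor(circuit)
mitigated = a * unmitigated + b
print("unmitigated:", unmitigated, "mitigated:", mitigated, "ideal:", simulator(circuit))
```

In this toy model the noise is an exactly linear map, so the fitted line inverts it and the mitigated value matches the ideal one. On real hardware the noisy and exact training values are only approximately linearly related, and the quality of the correction depends on how similar the training circuits are to the circuit of interest.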
Similarly, variable-noise Clifford data regression can be applied by specifying the optional list of noise scale factors in the function execute_with_cdr.

Codeblock 26: Applying vnCDR with Mitiq by calling execute_with_cdr and passing a list of noise scale_factors. Optionally, a noise scaling method can be specified via the argument scale_noise, whose default value is fold_gates_at_random.
One of the key features of both CDR and vnCDR is the construction of a set of classically simulable near-Clifford circuits. At the time of this writing, CDR implemented within Mitiq assumes that the input circuit is pre-compiled into the gate set $\{R_Z, \sqrt{X}, \mathrm{CNOT}\}$. This ensures that all the non-Clifford gates are contained in the $R_Z$ gates. This is particularly suitable for IBM processors but may be less appropriate for other backends. Different gate sets may be supported in the future.

Additional library information
In this section, we provide technical details and meta-information about the Mitiq library.

Alternative ways of using Mitiq
As we have already shown, errors affecting the estimation of expectation values can be reduced with appropriate functions returning the mitigated expectation value as a real number, e.g. execute_with_zne or execute_with_pec. Here, we show two alternative methods for applying the same error mitigation process. Depending on context, these alternative but equivalent methods may be simpler to use.
The first method is provided by the function mitigate_executor, which takes the same arguments as execute_with_* except the quantum circuit. This function returns a new executor which implements error mitigation when it is called with a quantum program, as shown below. The new mitigated_executor performs zero-noise extrapolation when called on a quantum circuit.
The mitigate_executor function can also be imported from other modules in order to apply different techniques. For example, probabilistic error cancellation can be applied after importing mitigate_executor from mitiq.pec.
The second method is to directly decorate the executor function such that it automatically performs error mitigation when called. Also in this case, one should use the decorator corresponding to the desired error mitigation technique, e.g. zne_decorator or pec_decorator.

    from mitiq import QPROGRAM
    from mitiq.zne import zne_decorator

Codeblock 28: Decorating an executor with zne_decorator so that zero-noise extrapolation is implemented when the executor is called on a quantum program.

In the above codeblock, the zne_decorator takes the same optional arguments as execute_with_zne. If no optional arguments are used, the decorator should still be called with parentheses, e.g. @zne_decorator().
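The two usage patterns described in this section, wrapping an executor and decorating it, can be sketched with plain Python closures. Everything in the example below is hypothetical and independent of Mitiq: circuits are lists of gate labels, `fold` mimics noise scaling by repeating the gate list, and the toy executor returns a value that decays linearly with the number of gates. The sketch also shows why a decorator that accepts options has to be called with parentheses.

```python
import functools


def fold(circuit, scale_factor):
    """Toy noise scaling for odd integer scale factors: repeat the gate list.

    Stands in for unitary folding; only the gate count matters for the toy
    executor used below.
    """
    return list(circuit) * scale_factor


def toy_execute_with_zne(circuit, executor, scale_factors=(1, 3)):
    """Linear two-point extrapolation to the zero-noise limit."""
    low, high = scale_factors
    e_low = executor(fold(circuit, low))
    e_high = executor(fold(circuit, high))
    return (high * e_low - low * e_high) / (high - low)


def toy_mitigate_executor(executor, **options):
    """Return a new executor that applies the toy mitigation when called."""
    @functools.wraps(executor)
    def mitigated(circuit):
        return toy_execute_with_zne(circuit, executor, **options)
    return mitigated


def toy_zne_decorator(**options):
    """Decorator factory: it receives the options first, then the executor."""
    def decorator(executor):
        return toy_mitigate_executor(executor, **options)
    return decorator


def noisy_executor(circuit):
    """Toy backend: the expectation value decays linearly with the gate count."""
    return 1.0 - 0.02 * len(circuit)


@toy_zne_decorator()  # parentheses are needed even when no options are passed
def decorated_executor(circuit):
    return 1.0 - 0.02 * len(circuit)


circuit = ["H", "CNOT", "RZ", "H"]
mitigated_executor = toy_mitigate_executor(noisy_executor, scale_factors=(1, 3))
print(noisy_executor(circuit), mitigated_executor(circuit), decorated_executor(circuit))
```

In this sketch the factory `toy_zne_decorator` must be called to obtain the actual decorator. Writing `@toy_zne_decorator` without parentheses would pass the executor where the options are expected and raise a `TypeError`.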

Mitiq documentation
Mitiq's documentation is hosted online at https://mitiq.readthedocs.io and includes a User's Guide and an API glossary. The User's Guide contains more information on topics covered in this manuscript and additional information on topics not covered here; for example, more examples of executor functions and an advanced usage guide for factory objects. The API glossary is auto-generated from the docstrings (formatted comments to code objects) and contains information about public functions and classes defined in Mitiq.

Contribution guidelines
We welcome contributions to Mitiq from the larger community of quantum software developers. Contributions can come in the form of feedback about the library, feature requests, bug fixes, or pull requests. Feedback and feature requests can be submitted by opening an issue on the Mitiq GitHub repository. Bug fixes and other pull requests can be made by forking the Mitiq source code, making changes, then opening a pull request to the Mitiq GitHub repository. Pull requests are peer-reviewed by core developers to provide feedback and/or request changes. Contributors are expected to uphold Mitiq development practices including style guidelines and unit tests. More details can be found in the Contribution guidelines documentation.

Discussion
Now that we have described error mitigation techniques in Mitiq and how to use them, we discuss limitations of these techniques as well as the relationship between zero-noise extrapolation, probabilistic error cancellation, and other strategies.

Limitations of zero-noise extrapolation
Zero-noise extrapolation [2,3] is a useful error mitigation technique, but it is not without limitations. First and foremost, the zero-noise estimate is extrapolated, meaning that performance guarantees are difficult to provide in general. If a reasonable estimate of how increasing the noise affects the observable (e.g., the blue curves in fig. 3) is known, then ZNE can produce good zero-noise estimates. This is the case for simple noise models such as depolarizing noise, but noise in actual quantum systems is more complicated and can produce different behavior than expected, e.g. fig. 3(b). In this case the performance of ZNE is tied to the performance of the underlying hardware. If expectation values do not produce a smooth curve as noise is increased, the zero-noise estimate may be poor and certain inference techniques may fail. In particular, one has to take into account that any initial error in the measured expectation values will propagate to the zero-noise extrapolation value. This can significantly amplify the final estimation uncertainty. In practice, this implies that the evaluation of a mitigated expectation value requires more measurement shots than the evaluation of the unmitigated one.
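The amplification of statistical uncertainty can be quantified in a simple case. The sketch below assumes a Richardson-type polynomial extrapolation, in which the zero-noise estimate is a fixed linear combination of the expectation values measured at each scale factor, and it further assumes that these values are independent and have the same variance. The function names are hypothetical and the code is independent of Mitiq.

```python
import math


def richardson_coefficients(scale_factors):
    """Weights c_k such that sum_k c_k * E(lambda_k) extrapolates a polynomial
    of degree len(scale_factors) - 1 to lambda = 0 (Lagrange form)."""
    coefficients = []
    for k, lam_k in enumerate(scale_factors):
        c = 1.0
        for j, lam_j in enumerate(scale_factors):
            if j != k:
                c *= lam_j / (lam_j - lam_k)
        coefficients.append(c)
    return coefficients


def error_amplification(scale_factors):
    """Ratio between the standard deviation of the extrapolated value and that
    of a single expectation value (independent, equal-variance inputs)."""
    return math.sqrt(sum(c ** 2 for c in richardson_coefficients(scale_factors)))


coeffs = richardson_coefficients([1.0, 2.0, 3.0])
print("weights for E(1), E(2), E(3):", coeffs)
print("error amplification:", error_amplification([1.0, 2.0, 3.0]))
```

For the scale factors 1, 2 and 3 the weights are 3, -3 and 1, so under these assumptions the standard deviation of the zero-noise estimate is larger than that of a single expectation value by a factor of $\sqrt{19} \approx 4.4$. Keeping the same statistical precision as in the unmitigated case therefore requires correspondingly more measurement shots.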
Additionally, zero-noise extrapolation cannot increase the performance of arbitrary circuits. If the circuit is large enough such that the expectation of the observable is almost constant as noise is increased (e.g., if the state is maximally mixed), then extrapolation will of course not help the zero-noise estimate. The regime in which ZNE is applicable thus depends on the performance of the underlying hardware as well as the circuit. A detailed description of when zero-noise extrapolation is effective, and how effective it is, is the subject of ongoing research.

Limitations of probabilistic error cancellation
The limitations of probabilistic error cancellation [2,4] are similar to those of other error mitigation methods: more circuit executions are necessary compared to the unmitigated case and the method is not appropriate in the asymptotic regime of many gates or large noise. Compared to ZNE, PEC has the important advantage of producing an unbiased estimate. This means that, if the quasi-probability representations of all the gates are known with sufficiently high accuracy, in the limit of many samples, the PEC estimate converges to the ideal expectation value. Unfortunately, PEC has some practical disadvantages too. The number of samples grows exponentially with respect to the circuit size and to the amount of noise. Moreover, the full tomography of the noisy gates is typically necessary in order to build the quasi-probability representations for the ideal gates. One should also take into account that tomographic errors in the characterization of the hardware gates can propagate through the PEC process, inducing a significant error in the final estimate.
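The exponential growth of the number of samples can be made explicit with a back-of-the-envelope calculation. The sketch below assumes that every gate has a quasi-probability representation with the same one-norm (the sum of the absolute values of its coefficients, called `gamma_per_gate` here) and uses the common rule of thumb that the number of samples needed for a fixed statistical error scales with the square of the total one-norm. Both the function and the numerical value 1.01 are illustrative assumptions.

```python
def pec_sampling_overhead(gamma_per_gate, num_gates):
    """Approximate factor by which the number of samples must grow to keep the
    statistical error fixed, assuming the same one-norm for every gate."""
    total_gamma = gamma_per_gate ** num_gates  # one-norms multiply across gates
    return total_gamma ** 2


for num_gates in (10, 100, 1000):
    print(num_gates, "gates ->", pec_sampling_overhead(1.01, num_gates))
```

Even when the one-norm of each gate is only slightly above one, the overhead grows exponentially with the number of gates, which is why PEC is best suited to circuits of moderate size and to moderate noise levels.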

Limitations of Clifford data regression
Clifford data regression [5,6] has the promising advantage of being a self-tuning technique since the inference model is not assumed a priori but learned during the training phase. However, this technique presents some limitations as well. The training phase typically introduces a significant overhead (many training circuits must be executed with both quantum and classical hardware). Moreover, the training data is extracted from near-Clifford circuits which may have a different response to the hardware noise compared to the true circuit of interest. It is also worth noting that this technique requires an efficient classical simulator of near-Clifford circuits in addition to a quantum backend.

Overview of error mitigation techniques
Zero-noise extrapolation was first proposed in [2,3] and first demonstrated experimentally in [12]. References [13,26] have extended the noise scaling and extrapolation techniques. Additionally, these references and this paper show experimental demonstrations of zero-noise extrapolation and how it can improve the results of noisy quantum computations.
The purposeful randomization of gates is another approach to quantum error mitigation. Specific techniques include compiling the quantum circuit with twirling gates [10], expressing noiseless gates in a basis of noisy gates as in probabilistic error cancellation [2], and a hybrid proposal improving the scaling of the technique with circuit depth and other resources [4]. Such techniques have been investigated experimentally in trapped ions [18] and superconducting qubits [27] (implementing gate set tomography).
Subspace expansion refers to another set of error mitigation techniques. In Ref. [28], a hybrid quantum-classical hierarchy was introduced, while in Ref. [29], symmetry verification was introduced. It has been demonstrated with a stabilizer-like method [30], exploiting molecular symmetries [11], and with an experiment on a superconducting circuit device [31]. Other symmetry-based protocols have since been proposed [32,33,34]. Other error mitigation techniques include approximating error-correcting codes in quantum channels [35]. Error mitigation techniques have also been extended to improve quantum sensing [36] and metrology [37], and to reduce errors in analog quantum simulation [27].

Differences and relations to neighbouring fields
Quantum error mitigation is deeply connected to quantum error correction and quantum optimal control, two fields of study that also aim at reducing the impact of errors in quantum information processing in quantum computers. More generally, quantum error mitigation is also related to the general theory of open quantum systems. While these are fluid boundaries, it can be useful to point out some differences among these more established fields and the emerging niche of quantum error mitigation.

Quantum error correction
Quantum error correction creates logical qubits out of multiple error-prone physical qubits. After applying logical operations which correspond to the physical operations we want to perform in our circuit, ancilla qubits are measured to diagnose which (if any) errors occurred. Depending on the outcome of these "syndrome measurements", correction operations are performed to remove the errors (if any) that occurred. If the error rate lies below a certain threshold, errors can be actively removed. We can thus say that the goal of error correction is to detect and exactly correct errors, while the goal of error mitigation is to lessen the effect of errors.
The drawback of quantum error correction techniques is that they require a large overhead in terms of additional physical qubits needed to create logical qubits. Current quantum computing devices have been able to demonstrate some components of quantum error correction with a very small number of qubits [38,39]. Indeed, some techniques for quantum error mitigation emerged as more practical quantum error correction solutions [40].

Quantum optimal control
Optimal control theory encompasses a versatile set of techniques that can be applied to many scenarios in quantum technology [41]. It is generally based on a feedback loop between an agent and a target system. A key difference between some quantum error mitigation techniques and quantum optimal control is that the former can be implemented in some instances with post-processing techniques, while the latter relies on an active feedback loop. An example of a specific application of optimal control to quantum dynamics that can be seen as a quantum error mitigation technique is dynamical decoupling [7,8,9]. This technique employs fast control pulses to effectively decouple a system from its environment, with techniques pioneered in the nuclear magnetic resonance community [42]. Quantum optimal control techniques are being integrated into quantum computing software as a means for noise characterization and error mitigation [43].

Environment-induced error protection
More generally, quantum computing devices can be studied in the framework of open quantum systems [44,45,46], that is, systems that exchange energy and information with the surrounding environment in controlled and unwanted ways.
Since errors occur for several reasons in quantum computers, the microscopic description at the physical level can vary broadly, depending on the quantum computing platform that is used as well as the computing architecture, and error mitigation strategies can be employed with an awareness of this variability. For example, superconducting-circuit-based quantum computers have chips that are prone to cross-talk noise [47], while qubits encoded in trapped ions need to be shuttled with electromagnetic pulses, and solid-state artificial atoms, including quantum dots, are heavily affected by inhomogeneous broadening [48]. Considering the physical layer of the actual device [49,50], as well as modeling and adapting the control pulses, can in practice result in more effective error mitigation strategies.
One approach to reduce the impact of noise and errors is to tailor a larger computational space to protect the system from exiting the computational basis. This approach has been particularly fruitful in the context of bosonic quantum codes [51,52,53,54].
Moreover, autonomous error correction approaches, which exploit the environment to induce error-robust processes, have recently been proposed and experimentally verified [55]. More generally, decoherence-free subspaces have been proposed within the study of Liouvillian dynamics [56,57,58,59].

Conclusion
We have introduced a fully open-source library for quantum error mitigation on near-term quantum computers. Our library can interface with multiple quantum programming libraries (in particular Cirq, Qiskit, pyQuil, and Braket) and arbitrary quantum processors (real or simulated) available to the user. In this paper, we presented experimental and numerical examples demonstrating how error mitigation can enhance the results of a noisy quantum computation. We then discussed the library in detail, focusing on the specific modules of Mitiq associated with different error mitigation techniques: zero-noise extrapolation, probabilistic error cancellation and Clifford data regression. After mentioning additional software information including support and contribution guidelines, we discussed how the error mitigation techniques in our library relate to other error mitigation techniques as well as quantum error correction, quantum optimal control, and the theory of open quantum systems.
In future work, we plan to incorporate additional error mitigation techniques into the library and to expand the set of benchmarks to better understand when quantum error mitigation is beneficial. Work can also be done to improve the existing modules, for example by implementing different noise-scaling methods, inference techniques, or new error cancellation protocols. One candidate noise-scaling method is pulse stretching which will be possible when pulse-level access to quantum hardware becomes available through more cloud services [60]. A high-level road map for future development which includes more information on these ideas as well as other ideas can be found on the Mitiq Wiki.

A Executor examples
For concreteness, in this appendix we include explicit examples of executor functions which were introduced in section 2.2. As mentioned, an executor always accepts a quantum program, sometimes accepts other arguments, and returns an expectation value as a float.

A.1 Executors based on real hardware
Our first executor is the one used in creating fig. 3(a). This executor runs a two-qubit circuit on an IBMQ quantum processor and returns the probability of the ground state.

Codeblock 30: Defining an executor to run on IBMQ and return the probability of the ground state for a two-qubit circuit. Line 2 requires a valid IBMQ account with saved credentials. We assume that the input circuit contains terminal measurements on both qubits.
We also include the same executor function as above, but this time running on Rigetti Aspen-8, as used in creating fig. 3(b). Note that this executor requires additional steps compared to the same executor in Qiskit, namely the declaration of classical memory and the addition of measurement operations, as Rigetti QCS handles classical memory differently from other platforms. Additionally, it is important to note the use of basic_compile from Mitiq, which preserves folded gates when mapping to the native gate set of Aspen-8.
In these examples, we see how the executor function abstracts away details about running on a back-end. This abstraction makes Mitiq compatible with multiple quantum processors using the same interface.

A.2 Executors based on a classical simulator
The executor function does not have to use a real quantum processor but instead can use a classical simulator. In this case, the executor is also responsible for adding noise to the circuit. The manner in which noise is added depends on the quantum programming library being used. We show below an example of an executor which adds depolarizing noise to a Cirq circuit and uses density matrix simulation. This executor takes as input an arbitrary observable defined by a cirq.PauliString and returns its expectation value by sampling. Other noise models can be easily substituted into this executor by changing the channel in Line 13 from cirq.depolarize to a different channel, e.g. cirq.amplitude_damp. Executors using classical simulators in other quantum programming frameworks (e.g., Qiskit or pyQuil) can be defined in an analogous way, although each framework handles noise differently.
Finally, we note that executor functions provided to execute_with_zne must have only a single argument: the quantum program. The examples above include additional arguments, and it is often convenient to write executors this way. To make an executor with multiple arguments a function of one argument, we can use functools.partial as shown below.
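A minimal sketch of this pattern is given here with a toy executor. The executor, its extra arguments `shots` and `noise_level`, and the list-of-labels circuit are hypothetical placeholders for a real backend call; only functools.partial is taken from the Python standard library.

```python
import functools


def executor(circuit, shots, noise_level):
    """Toy executor with extra arguments (stands in for a real backend call).

    The `shots` argument is accepted but unused here; a real executor would
    use it to set the number of measurement samples.
    """
    return 1.0 - noise_level * len(circuit)


# Fix the extra arguments to obtain a function of the circuit only.
single_argument_executor = functools.partial(executor, shots=1000, noise_level=0.01)

circuit = ["H", "CNOT", "H"]
print(single_argument_executor(circuit))
```

The resulting `single_argument_executor` takes only the quantum program as input, so it has the signature expected of an executor, while the values of the other arguments remain fixed at the point where the partial function is created.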