An Effective Framework for Full-Stack Benchmarking of Quantum Computers

This is a Perspective on "Application-Motivated, Holistic Benchmarking of a Full Quantum Computing Stack" by Daniel Mills, Seyon Sivarajah, Travis L. Scholten, and Ross Duncan, published in Quantum 5, 415 (2021).

By Michele Amoretti (Department of Engineering and Architecture - University of Parma, Italy and Quantum Information Science @ University of Parma, Italy).

Current noisy intermediate-scale quantum (NISQ) computers have not yet converged on a specific candidate device technology. Superducting transmon qubits [1,2] and trapped ions [3,4] are front-running technologies that have allowed experimental demonstration of quantum gates and algorithms of ever-increasing quality. Testing the physical properties of qubits uses different metrics depending on the considered technology. However, commercial-grade programmable systems [5,7,6] can be benchmarked in a way that is both universal across platforms and agnostic to the differences in the underlying hardware.

Benchmarking with quantum algorithms exercises the full hardware/software stack. A hardware-specific compiler breaks down algorithms into the target hardware’s native gate set, optimizing for qubit connectivity, gate times, and coherence [8] to enhance the system’s performance. After execution on the hardware, the measurements can be directly compared with the expected output to determine the accuracy of the device. This accuracy can then be compared with other devices that have compiled and run the same algorithm [9].

In a recent work, now published in Quantum, Mills et al. [10] presented a framework for application-motivated benchmarking of full quantum computing stacks. The benchmarks defined there have a circuit class, describing the type of circuit to be run on the system, and a figure of merit, quantifying how well the system did when running circuits from that class. The idea is that a circuit class should represent a particular application domain. The application-motivated circuit classes proposed by Mills et al. (denoted as deep, shallow and square) draw inspiration from quantum algorithmic primitives [11] and from the literature on near-term quantum computing applications (e.g., machine learning and chemistry). On the other hand, all the considered figures of merit have an important feature: they can be approximated with a polynomial number of samples in the size of the circuit.

Deep circuits are constructed from several layers of Pauli gadgets [12], which are quantum circuits that implement an operation corresponding to exponentiating a tensor product of Pauli matrices. For example, in quantum chemistry, deep circuits are used to build UCC trial states used in the Variational Quantum Eigensolver (VQE) [13].

Square circuits are random circuits built from two-qubit gates. They provide a benchmark at all layers of the quantum computing stack. Indeed, they have been suggested as a means to demonstrate quantum computational supremacy [14]. Square circuits avoid favoring any device in particular, because they allow two-qubit gates to act between any pair of qubits in the uncompiled circuit.

The class of shallow circuits is a subclass of Instantaneous Quantum Polytime (IQP) circuits [15]. IQP circuits consist of gates diagonal in the Pauli-Z basis, sandwiched between two layers of Hadamard gates acting on all qubits. Shallow circuits are characterized by limited connectivity between the qubits and by a depth that increases slowly with width. Thus, shallow circuits are useful for understanding the performance of a device being utilized for applications whose circuit depth grows less quickly than their qubit requirement.

Let us denote an $n$-qubit circuit as $C$, the ideal output distribution of $C$ as $p_C$ and the output distribution produced by the compiled implementation of $C$ as $D_C$. In the following, we summarize how the figures of merit considered by Mills et al. can be estimated from samples drawn from $D_C$.

  • Heavy Output Generation (HOG) [16] – An output $z \in \{0,1\}^n$ is heavy for a quantum circuit $C$, if $p_C(z)$ is greater than the median of the set $\{p_C(x) : x \in \{0,1\}^n\}$. The HOG probability of $D_C$, i.e., the probability that samples drawn from $D_C$ will be heavy outputs in $p_C$, is
    \text{HOG}(D_C,p_C) = \sum_{x \in \{0,1\}^n} D_C(x) \delta_C(x)
    where $\delta_c(x) = 1$ if $x$ is heavy for $C$, and $\delta_c(x) = 0$ otherwise. It is desired to have HOG$(D_C,p_C) > 1/2$, as it would help us distinguish between a good implementation of $C$ and an attempt to mimic it by generating random bitstrings. Of course this is true if $p_C$ is sufficiently far from uniform.
  • Cross-Entropy Difference [14] – The cross-entropy difference $CED$ tells us whether the distribution $D’$ is best predicted by $D$ or by the uniform distribution $U$. The cross-entropy between $D_C$ and $p_C$ can be estimated as
    \text{CE}(D_C,p_C) = \frac{1}{k} \sum_{1 = 1,..,k} \log \left( \frac{1}{p_C(x_i)} \right)
    where $x_1,..,x_k$ are samples drawn from $D_C$. This can be used to approximate the value of $CED(D_C,p_C)$, by means of a classical computer. The $CED$ gives a value between 0, corresponding to $D_C = U$, and 1, corresponding to $D_C = p_C$. Thus, it is desired to have $CED(D_C,p_C) = 1$.
  • $l_1$-norm distance – The $l_1$-norm distance between $D_C$ and $p_C$ is
    l_1(D_C,p_C) = \sum_{x \in \{0,1\}^n} |D_C(x) – p_C(x)|.
    It is desired to have $l_1(D_C,p_C) = 0$.

In their work, Mills et al. used the proposed circuits and figures of merit to i) study the compilation strategy and the device on which the compiled circuit is run (full stack benchmarking), ii) demonstrate that the same quantum computing stack performs differently when running different applications (application-motivated benchmarks), and iii) show how application-motivated benchmarks can be helpful in identifying whether new noise channels should be incorporated into a noise model.

The perspective taken and the framework introduced by Mills et al. are timely and provide a means to select the best quantum computing stack, for a particular task, and vice versa. Among the future directions suggested by the authors of the paper, incorporating the proposed benchmarks into a compilation strategy is undoubtedly very appealing, especially if the compiler is noise-adaptive.

► BibTeX data

► References

[1] C. Rigetti, J. M. Gambetta, S. Poletto, B. L. T. Plourde, J. M. Chow, A. D. Córcoles, J. A. Smolin, S. T. Merkel, J. R. Rozen, G. A. Keefe, M. B. Rothwell, M. B. Ketchen, and M. Steffen. Superconducting qubit in a waveguide cavity with a coherence time approaching 0.1 ms. Phys. Rev. B, 86:100506, Sep 2012. 10.1103/​PhysRevB.86.100506.

[2] P. Klimov, J. Kelly, J. Chen, M. Neeley, A. Megrant, B. Burkett, R. Barends, K. Arya, B. Chiaro, Y. Chen, A. Dunsworth, A. Fowler, B. Foxen, C. M. Gidney, M. Giustina, R. Graff, T. Huang, E. Jeffrey, E. Lucero, J. Mutus, O. Naaman, C. Neill, C. Quintana, P. Roushan, D. Sank, A. Vainsencher, J. Wenner, T. White, S. Boixo, R. Babbush, V. Smelyanskiy, H. Neven, and J. Martinis. Fluctuations of Energy-Relaxation Times in Superconducting Qubits. Phys. Rev. Lett., 121:090502, 2018. 10.1103/​PhysRevLett.121.090502.

[3] T. P. Harty, D. T. C. Allcock, C. J. Ballance, L. Guidoni, H. A. Janacek, N. M. Linke, D. N. Stacey, and D. M. Lucas. High-Fidelity Preparation, Gates, Memory, and Readout of a Trapped-Ion Quantum Bit. Phys. Rev. Lett., 113:220501, 2014. 10.1103/​PhysRevLett.113.220501.

[4] S. Debnath, N. M. Linke, C. Figgatt, K. A. Landsman, K. Wright, and C. Monroe. Demonstration of a small programmable quantum computer with atomic qubits. Nature, 536:63–66, 2016. 10.1038/​nature18648.

[5] IBM Quantum. https:/​/​​quantum-computing/​.

[6] IonQ. https:/​/​​.

[7] Rigetti Computing. https:/​/​​.

[8] N. M. Linke, D. Maslov, M. Roetteler, S. Debnath, C. Figgatt, K. A. Landsman, K. Wright, and C. Monroe. Experimental comparison of two quantum computing architectures. Proc. Natl. Acad. Sci. U.S.A., 114(13):3305–3310, 2017. 10.1073/​pnas.1618020114.

[9] P. Murali, N. M. Linke, M. Martonosi, A. J. Abhari, N. H. Nguyen, and C. H. Alderete. Full-stack, real-system quantum computer studies: Architectural comparisons and design insights. In Proceedings of the 46th International Symposium on Computer Architecture, ISCA '19, page 527–540, New York, NY, USA, 2019. Association for Computing Machinery. 10.1145/​3307650.3322273.

[10] D. Mills, S. Sivarajah, T. Scholten, and R. Duncan. Application-Motivated, Holistic Benchmarking of a Full Quantum Computing Stack. Quantum, 5:415, 2021. 10.22331/​q-2021-03-22-415.

[11] R. Blume-Kohout and T. Young. A volumetric framework for quantum computer benchmarks. Quantum, 4:362, 2019. 10.22331/​q-2020-11-15-362.

[12] A. Cowtan, S. Dilkes, R. Duncan, W. Simmons, and S. Sivarajah. Phase Gadget Synthesis for Shallow Circuits. In Quantum Physics and Logic 2019 (QPL), pages 213–218, 2020. 10.4204/​EPTCS.318.13.

[13] P. K. Barkoutsos, J. F. Gonthier, I. Sokolov, N. Moll, G. Salis, A. Fuhrer, M. Ganzhorn, D. J. Egger, M. Troyer, A. Mezzacapo, S. Filipp, and I. Tavernelli. Quantum algorithms for electronic structure calculations: Particle-hole hamiltonian and optimized wave-function expansions. Phys. Rev. A, 98:022322, 2018. 10.1103/​PhysRevA.98.022322.

[14] S. Boixo, S. Isakov, V. Smelyanskiy, R. Babbush, N. Ding, Z. Jiang, M. J. Bremner, J. Martinis, and H. Neven. Characterizing quantum supremacy in near-term devices. Nat. Phys., 14:595–600, 2018. 10.1038/​s41567-018-0124-x.

[15] D. Shepherd and M. J. Bremner. Temporally unstructured quantum computation. Proc. Math. Phys. Eng. Sci., 465(2105):1413–1439, 2009. 10.1098/​rspa.2008.0443.

[16] S. Aaronson and L. Chen. Complexity-theoretic foundations of quantum supremacy experiments. In 32nd Computational Complexity Conference, pages 1–67, 2017. 10.4230/​LIPIcs.CCC.2017.22.

Cited by

On Crossref's cited-by service no data on citing works was found (last attempt 2021-05-06 18:39:30). On SAO/NASA ADS no data on citing works was found (last attempt 2021-05-06 18:39:31).