Encoding-dependent generalization bounds for parametrized quantum circuits

Matthias C. Caro1,2, Elies Gil-Fuster3,4, Johannes Jakob Meyer3, Jens Eisert3,4,5, and Ryan Sweke3

1Department of Mathematics, Technical University of Munich, 85748 Garching, Germany
2Munich Center for Quantum Science and Technology (MCQST), 80799 Munich, Germany
3Dahlem Center for Complex Quantum Systems, Freie Universität Berlin, 14195 Berlin, Germany
4Fraunhofer Heinrich Hertz Institute, 10587 Berlin, Germany
5Helmholtz-Zentrum Berlin für Materialien und Energie, 14109 Berlin, Germany

Find this paper interesting or want to discuss? Scite or leave a comment on SciRate.

Abstract

A large body of recent work has begun to explore the potential of parametrized quantum circuits (PQCs) as machine learning models, within the framework of hybrid quantum-classical optimization. In particular, theoretical guarantees on the out-of-sample performance of such models, in terms of generalization bounds, have emerged. However, none of these generalization bounds depend explicitly on how the classical input data is encoded into the PQC. We derive generalization bounds for PQC-based models that depend explicitly on the strategy used for data-encoding. These imply bounds on the performance of trained PQC-based models on unseen data. Moreover, our results facilitate the selection of optimal data-encoding strategies via structural risk minimization, a mathematically rigorous framework for model selection. We obtain our generalization bounds by bounding the complexity of PQC-based models as measured by the Rademacher complexity and the metric entropy, two complexity measures from statistical learning theory. To achieve this, we rely on a representation of PQC-based models via trigonometric functions. Our generalization bounds emphasize the importance of well-considered data-encoding strategies for PQC-based models.

In a typical supervised machine learning problem, one is given a fixed-size dataset consisting of labelled objects. For example, the dataset might contain images labelled as dogs or cats. The goal is then to use the dataset to learn some classification algorithm, which when given appropriate new objects – ones not already present in the dataset but of the same type – can label them correctly. Ideally, once one has obtained a classification algorithm, one would like to be able to have some sort of guarantee on how well this classification algorithm will perform – i.e., on how accurate it will be on previously unseen objects. Such guarantees on the “out-of-sample" performance of classification algorithms – i.e., on their accuracy on previously unseen data – are known as generalization bounds. Over time though it has been realized that various notions of complexity are critical for determining how well a class of classification algorithms will generalize. Intuitively, the more complex a classification algorithm can be, the more likely it is to "overfit" the available data, and therefore generalize badly. Indeed, we now know that one can derive a variety of generalization bounds for classification algorithms, which allow us to place guarantees on the "out-of-sample" performance of classification algorithms, in terms of their performance on the available data, and the complexity of the class of algorithms that they belong to.

Given the emerging availability of "noisy intermediate scale quantum" devices, a variety of approaches have been proposed for using such devices in supervised learning. One particularly important way in which these approaches differ, is in the way the classical data is entered into the quantum classification algorithm. In light of this, and given the importance of generalization bounds, it is natural to ask whether one can determine how the appropriate notions of complexity for generalization depend on the different strategies for encoding classical data into the quantum classifier.

In this work, we answer this question by quantifying rigorously how various notions of complexity depend on the strategy for encoding data into a quantum classifier. These results immediately allow us to state generalization bounds for such quantum classification algorithms. With these, we can give concrete recommendations for choosing classical-to-quantum encoding strategies.

► BibTeX data

► References

[1] Vedran Dunjkoand Hans J Briegel ``Machine learning & artificial intelligence in the quantum domain: A review of recent progress'' Rep. Prog. Phys. 81, 074001 (2018).
https:/​/​doi.org/​10.1088/​1361-6633/​aab406

[2] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, ``Quantum machine learning'' Nature 549, 195–202 (2017).
https:/​/​doi.org/​10.1038/​nature23474

[3] S. Arunachalamand R. de Wolf ``Guest Column: A Survey of Quantum Learning Theory'' SIGACT News 48 (2017).
https:/​/​doi.org/​10.1145/​3106700.3106710

[4] N. Wiebe, A. Kapoor, and K. Svore, ``Quantum algorithms for nearest-neighbor methods for supervised and unsupervised learning'' Quant. Inf. Comp. 15, 0318 (2015).
https:/​/​doi.org/​10.5555/​2871393.2871400

[5] S. Lloyd, M. Mohseni, and P. Rebentrost, ``Quantum algorithms for supervised and unsupervised machine learning'' arXiv:1307.0411 (2013).

[6] R. Sweke, J.-P. Seifert, D. Hangleiter, and J. Eisert, ``On the quantum versus classical learnability of discrete distributions'' Quantum 5, 417 (2021).
https:/​/​doi.org/​10.22331/​q-2021-03-23-417

[7] Yunchao Liu, Srinivasan Arunachalam, and Kristan Temme, ``A rigorous and robust quantum speed-up in supervised machine learning'' Nature Physics 17, 1013–1017 (2021) Bandiera_abtest: a Cg_type: Nature Research Journals Number: 9 Primary_atype: Research Publisher: Nature Publishing Group Subject_term: Computational science;Information theory and computation;Quantum information Subject_term_id: computational-science;information-theory-and-computation;quantum-information.
https:/​/​doi.org/​10.1038/​s41567-021-01287-z
https:/​/​www.nature.com/​articles/​s41567-021-01287-z

[8] F. Arute ``Quantum supremacy using a programmable superconducting processor'' Nature 574, 505–510 (2019).
https:/​/​doi.org/​10.1038/​s41586-019-1666-5

[9] K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, S. Alperin-Lea, A. Anand, M. Degroote, H. Heimonen, J. S. Kottmann, T. Menke, W.-K. Mok, S. Sim, L.-C. Kwek, and A. Aspuru-Guzik, ``Noisy intermediate-scale quantum (NISQ) algorithms'' arXiv:2101.08448 (2021).

[10] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, ``The theory of variational hybrid quantum-classical algorithms'' New J. Phys. 18, 023023 (2016).
https:/​/​doi.org/​10.1088/​1367-2630/​18/​2/​023023

[11] M. Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C. Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R. McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, and Patrick J. Coles, ``Variational quantum algorithms'' Nature Reviews Physics 3, 625–644 (2021) Bandiera_abtest: a Cg_type: Nature Research Journals Number: 9 Primary_atype: Reviews Publisher: Nature Publishing Group Subject_term: Computer science;Quantum information;Quantum simulation Subject_term_id: computer-science;quantum-information;quantum-simulation.
https:/​/​doi.org/​10.1038/​s42254-021-00348-9
https:/​/​www.nature.com/​articles/​s42254-021-00348-9

[12] Marcello Benedetti, Erika Lloyd, Stefan Sack, and Mattia Fiorentini, ``Parameterized quantum circuits as machine learning models'' Quant. Sc. Tech. 4, 043001 (2019).
https:/​/​doi.org/​10.1088/​2058-9565/​ab4eb5

[13] S. Sim, P. D. Johnson, and A. Aspuru-Guzik, ``Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms'' Adv. Quant. Tech. 2, 1900070 (2019).
https:/​/​doi.org/​10.1002/​qute.201900070

[14] Thomas Hubregtsen, Josef Pichlmeier, Patrick Stecher, and Koen Bertels, ``Evaluation of parameterized quantum circuits: on the relation between classification accuracy, expressibility, and entangling capability'' Quant. Mach. Int. 3, 1–19 (2021).
https:/​/​doi.org/​10.1007/​s42484-021-00038-w

[15] F. J. Gil Vidaland D. O. Theis ``Input Redundancy for Parameterized Quantum Circuits'' Front. Phys. 8 (2020) Publisher: Frontier.
https:/​/​doi.org/​10.3389/​fphy.2020.00297

[16] M. Schuld ``Quantum machine learning models are kernel methods'' arXiv:2101.11020 (2021).

[17] A. Pérez-Salinas, A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre, ``Data re-uploading for a universal quantum classifier'' Quantum 4, 226 (2020).
https:/​/​doi.org/​10.22331/​q-2020-02-06-226

[18] C. Bishop ``Pattern recognition and machine learning'' Springer (2006).

[19] B. Schölkopfand A. J. Smola ``Learning with kernels: support vector machines, regularization, optimization, and beyond'' MIT Press (2002).

[20] M. Mohri, A. Rostamizadeh, and A. Talwalkar, ``Foundations of machine learning'' MIT Press (2018).

[21] Olivier Bousquetand André Elisseeff ``Stability and generalization'' J. Mach. Learn. Res. 2, 499–526 (2002).
https:/​/​doi.org/​10.1162/​153244302760200704

[22] Nick Littlestoneand Manfred Warmuth ``Relating data compression and learnability'' Technical report, University of California Santa Cruz (1986).

[23] David A McAllester ``Some pac-bayesian theorems'' Machine Learning 37, 355–363 (1999).
https:/​/​doi.org/​10.1023/​A:1007618624809

[24] M. C. Caroand I. Datta ``Pseudo-dimension of quantum circuits'' Quant. Mach. Int. 2, 172 (2020).
https:/​/​doi.org/​10.1007/​s42484-020-00027-5

[25] Amira Abbas, David Sutter, Christa Zoufal, Aurélien Lucchi, Alessio Figalli, and Stefan Woerner, ``The power of quantum neural networks'' Nature Computational Science 1, 403–409 (2021).
https:/​/​doi.org/​10.1038/​s43588-021-00084-1

[26] K. Bu, D. E. Koh, L. Li, Q. Luo, and Y. Zhang, ``On the statistical complexity of quantum circuits'' arXiv:2101.06154 (2021).

[27] K. Bu, D. E. Koh, L. Li, Q. Luo, and Y. Zhang, ``Effects of quantum resources on the statistical complexity of quantum circuits'' arXiv:2102.03282 (2021).

[28] K. Bu, D. E. Koh, L. L., Q. Luo, and Y. Zhang, ``Rademacher complexity of noisy quantum circuits'' arXiv:2103.03139 (2021).

[29] Y. Du, Z. Tu, X. Yuan, and D. Tao, ``An efficient measure for the expressivity of variational quantum algorithms'' arXiv:2104.09961 (2021).

[30] H.-Y. Huang, M. Broughton, M. Mohseni, R. Babbush, S. Boixo, H. Neven, and J. R. McClean, ``Power of data in quantum machine learning'' Nature Comm. 12, 1–9 (2021).
https:/​/​doi.org/​10.1038/​s41467-021-22539-9

[31] L. Banchi, J. Pereira, and S. Pirandola, ``Generalization in quantum machine learning: A quantum information perspective'' arXiv:2102.08991 (2021).

[32] C. Gyurik, D. van Vreumingen, and V. Dunjko, ``Structural risk minimization for quantum linear classifiers'' arXiv:2105.05566 (2021).

[33] M. Schuld, R. Sweke, and J. J. Meyer, ``Effect of data encoding on the expressive power of variational quantum-machine-learning models'' Phys. Rev. A 103, 032430 (2021).
https:/​/​doi.org/​10.1103/​PhysRevA.103.032430

[34] S. Shalev-Shwartzand S. Ben-David ``Understanding machine learning: From theory to algorithms'' Cambridge University Press (2014).
https:/​/​doi.org/​10.1017/​CBO9781107298019

[35] M. M Wolf ``Mathematical foundations of machine learning'' (2020).

[36] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals, ``Understanding deep learning requires rethinking generalization'' arXiv:1611.03530 (2016).

[37] Y. Jiang, B. Neyshabur, H. Mobahi, D. Krishnan, and S. Bengio, ``Fantastic generalization measures and where to find them'' arXiv:1912.02178 (2019).

[38] V. N. Vapnikand A. Ya. Chervonenkis ``On the uniform convergence of relative frequencies of events to their probabilities'' Th. Prob. App. 16, 264–280 (1971).
https:/​/​doi.org/​10.1137/​1116025

[39] P. L. Bartlettand S. Mendelson ``Rademacher and Gaussian complexities: Risk bounds and structural results'' J. Mach. Learn. Res. 3, 463–482 (2002).
https:/​/​doi.org/​10.5555/​944919.944944

[40] D. Pollard ``Convergence of stochastic processes'' Springer (1984).
https:/​/​doi.org/​10.1007/​978-1-4612-5254-2

[41] H. Abraham ``Qiskit: An Open-source Framework for Quantum Computing'' (2019) 10.5281/​zenodo.2562110.
https:/​/​doi.org/​10.5281/​zenodo.2562110

[42] Cirq Developers ``Cirq'' (2021) https:/​/​zenodo.org/​record/​4586899#.YJqDAi5fiUk.
https:/​/​doi.org/​10.5281/​zenodo.4586899

[43] V. Bergholm, J. Izaac, M. Schuld, C. Gogolin, M. S. Alam, S. Ahmed, J. M. Arrazola, C. Blank, A. Delgado, S. Jahangiri, K. McKiernan, J. J. Meyer, Z. Niu, A. Száva, and N. Killoran, ``PennyLane: Automatic differentiation of hybrid quantum-classical computations'' arXiv:1811.04968 (2020).

[44] Claudiu Marius Popescu ``Learning bounds for quantum circuits in the agnostic setting'' Quantum Information Processing 20, 1–24 (2021).
https:/​/​doi.org/​10.1007/​s11128-021-03225-7

[45] Chih-Chieh Chen, Masaya Watabe, Kodai Shiba, Masaru Sogabe, Katsuyoshi Sakamoto, and Tomah Sogabe, ``On the Expressibility and Overfitting of Quantum Circuit Learning'' ACM Transactions on Quantum Computing 2, 1–24 (2021).
https:/​/​doi.org/​10.1145/​3466797

[46] M. Ledouxand M. Talagrand ``Probability in Banach spaces: Isoperimetry and processes'' Springer-Verlag (1991).
https:/​/​doi.org/​10.1007/​978-3-642-20212-4

[47] R. Vershynin ``High-dimensional probability: An introduction with applications in data science'' Cambridge University Press (2018).
https:/​/​doi.org/​10.1017/​9781108231596

[48] P. Massart ``Some applications of concentration inequalities to statistics'' Annales de la Faculté des sciences de Toulouse : Mathématiques Ser. 6, 9, 245–303 (2000).
https:/​/​doi.org/​10.5802/​afst.961

[49] R. M. Dudley ``Uniform central limit theorems'' Cambridge University Press (1999).
https:/​/​doi.org/​10.1017/​CBO9780511665622

Cited by

[1] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Shan You, and Dacheng Tao, "Learnability of Quantum Neural Networks", PRX Quantum 2 4, 040337 (2021).

[2] Matthias C. Caro, Hsin-Yuan Huang, M. Cerezo, Kunal Sharma, Andrew Sornborger, Lukasz Cincio, and Patrick J. Coles, "Generalization in quantum machine learning from few training data", arXiv:2111.05292.

[3] Sofiene Jerbi, Lukas J. Fiderer, Hendrik Poulsen Nautrup, Jonas M. Kübler, Hans J. Briegel, and Vedran Dunjko, "Quantum machine learning beyond kernel methods", arXiv:2110.13162.

[4] Beng Yee Gan, Daniel Leykam, and Dimitris G. Angelakis, "Fock State-enhanced Expressivity of Quantum Machine Learning Models", arXiv:2107.05224.

[5] Yuxuan Du, Yang Qian, and Dacheng Tao, "Accelerating variational quantum algorithms with multiple quantum processors", arXiv:2106.12819.

The above citations are from SAO/NASA ADS (last updated successfully 2021-12-08 01:38:47). The list may be incomplete as not all publishers provide suitable and complete citation data.

On Crossref's cited-by service no data on citing works was found (last attempt 2021-12-08 01:38:45).