Generalization despite overfitting in quantum machine learning models

Evan Peters1,2,3 and Maria Schuld4

1Department of Physics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
2Institute for Quantum Computing, Waterloo, ON, N2L 3G1, Canada
3Perimeter Institute for Theoretical Physics, Waterloo, Ontario, N2L 2Y5, Canada
4Xanadu, Toronto, ON, M5G 2C8, Canada



The widespread success of deep neural networks has revealed a surprise in classical machine learning: very complex models often generalize well while simultaneously overfitting training data. This phenomenon of benign overfitting has been studied for a variety of classical models with the goal of better understanding the mechanisms behind deep learning. Characterizing the phenomenon in the context of quantum machine learning might similarly improve our understanding of the relationship between overfitting, overparameterization, and generalization. In this work, we provide a characterization of benign overfitting in quantum models. To do this, we derive the behavior of a classical interpolating Fourier features model for regression on noisy signals, and show how a class of quantum models exhibits analogous features, thereby linking the structure of quantum circuits (such as data-encoding and state preparation operations) to overparameterization and overfitting in quantum models. We explain these features intuitively in terms of the ability of the quantum model to interpolate noisy data with locally "spiky" behavior, and we provide a concrete demonstration of benign overfitting.
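To make the idea of an interpolating Fourier features model concrete, the following is a minimal classical sketch in Python with NumPy. It is not the construction used in the paper: the target signal, the noise level, the frequency cutoff `low_cutoff`, and the two-level feature weights (1.0 for low frequencies, 0.05 for high frequencies) are all illustrative assumptions. The sketch fits the minimum-norm interpolant of noisy samples using many more Fourier features than data points, then measures how far the fit is from the noiseless signal away from the training points.

```python
import numpy as np


def fourier_features(x, max_freq, weights):
    """Weighted real Fourier features [1, cos(kx), sin(kx)] for k = 1..max_freq.

    weights[k] scales the features of frequency k (an illustrative choice).
    """
    x = np.asarray(x, dtype=float)
    k = np.arange(1, max_freq + 1)
    const = np.full((x.size, 1), weights[0])
    cos_part = np.cos(np.outer(x, k)) * weights[1:]
    sin_part = np.sin(np.outer(x, k)) * weights[1:]
    return np.hstack([const, cos_part, sin_part])


def fit_min_norm(x_train, y_train, max_freq, weights):
    """Minimum-norm coefficients that interpolate the training data.

    For an underdetermined system, np.linalg.lstsq returns the
    least-squares solution with the smallest Euclidean norm.
    """
    phi = fourier_features(x_train, max_freq, weights)
    coef, *_ = np.linalg.lstsq(phi, y_train, rcond=None)
    return coef


def target(x):
    """Hypothetical low-frequency signal to be learned."""
    return np.cos(x) + 0.5 * np.sin(2 * x)


rng = np.random.default_rng(0)

n_train = 21        # number of noisy samples
max_freq = 200      # 2 * 200 + 1 = 401 features, far more than samples
low_cutoff = 2      # frequencies up to this value get a large weight
noise_std = 0.3     # standard deviation of the label noise

# Assumed weighting: a few strongly weighted low frequencies plus
# many weakly weighted high frequencies.
weights = np.where(np.arange(max_freq + 1) <= low_cutoff, 1.0, 0.05)

# Equally spaced inputs on [0, 2*pi) with noisy labels.
x_train = 2 * np.pi * np.arange(n_train) / n_train
y_train = target(x_train) + noise_std * rng.normal(size=n_train)

coef = fit_min_norm(x_train, y_train, max_freq, weights)

# The model passes through every noisy label (it overfits) ...
train_pred = fourier_features(x_train, max_freq, weights) @ coef
train_residual = np.max(np.abs(train_pred - y_train))

# ... while its error against the noiseless signal is measured on a dense grid.
x_test = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
test_pred = fourier_features(x_test, max_freq, weights) @ coef
test_mse = np.mean((test_pred - target(x_test)) ** 2)

print(f"max training residual: {train_residual:.2e}")
print(f"test MSE vs noiseless signal: {test_mse:.4f}")
print(f"noise variance: {noise_std ** 2:.4f}")
```

Comparing `test_mse` with `noise_std ** 2` gives a rough sense of whether the overfitting is benign for a given choice of weights. Changing `weights` (for example, weighting all frequencies equally) is one way to explore how the feature weighting affects the result.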
