Effect of barren plateaus on gradient-free optimization

Andrew Arrasmith1, M. Cerezo1,2, Piotr Czarnik1, Lukasz Cincio1, and Patrick J. Coles1

1Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA

Find this paper interesting or want to discuss? Scite or leave a comment on SciRate.

Abstract

Barren plateau landscapes correspond to gradients that vanish exponentially in the number of qubits. Such landscapes have been demonstrated for variational quantum algorithms and quantum neural networks with either deep circuits or global cost functions. For obvious reasons, it is expected that gradient-based optimizers will be significantly affected by barren plateaus. However, whether or not gradient-free optimizers are impacted is a topic of debate, with some arguing that gradient-free approaches are unaffected by barren plateaus. Here we show that, indeed, gradient-free optimizers do not solve the barren plateau problem. Our main result proves that cost function differences, which are the basis for making decisions in a gradient-free optimization, are exponentially suppressed in a barren plateau. Hence, without exponential precision, gradient-free optimizers will not make progress in the optimization. We numerically confirm this by training in a barren plateau with several gradient-free optimizers (Nelder-Mead, Powell, and COBYLA algorithms), and show that the numbers of shots required in the optimization grows exponentially with the number of qubits.

Quantum machine learning and variational quantum algorithms provide the potential for near-term, practical applications of quantum computers. However, these techniques require the use of a classical optimization loop that can, in some cases, prove prohibitive. In particular, some of these optimization landscapes present barren plateaus, where the gradient of the cost function is exponentially suppressed except within an exponentially small region. As a result, optimization on a barren plateau landscape requires computational resources that scale exponentially with the number of qubits used.

Perhaps because of the mention of gradients in the definition of a barren plateau, a number of researchers have postulated that a gradient free optimizer might be able to beat the resource scaling. This thought turns out to be incorrect. Here we provide a proof that no gradient free strategy (including models that build surrogate gradients) can escape this resource scaling in barren plateau landscapes.

► BibTeX data

► References

[1] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and P. J. Coles, Variational quantum algorithms, Nature Reviews Physics 1, 19 (2021a).
https:/​/​doi.org/​10.1038/​s42254-021-00348-9

[2] K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, S. Alperin-Lea, A. Anand, M. Degroote, H. Heimonen, J. S. Kottmann, T. Menke, W.-K. Mok, S. Sim, L.-C. Kwek, and A. Aspuru-Guzik, Noisy intermediate-scale quantum (nisq) algorithms, arXiv preprint arXiv:2101.08448 (2021).
arXiv:2101.08448

[3] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’brien, A variational eigenvalue solver on a photonic quantum processor, Nature communications 5, 1 (2014).
https:/​/​doi.org/​10.1038/​ncomms5213

[4] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, The theory of variational hybrid quantum-classical algorithms, New Journal of Physics 18, 023023 (2016).
https:/​/​doi.org/​10.1007/​978-94-015-8330-5_4

[5] E. Farhi, J. Goldstone, and S. Gutmann, A quantum approximate optimization algorithm, arXiv preprint arXiv:1411.4028 (2014).
arXiv:1411.4028

[6] J. Romero, J. P. Olson, and A. Aspuru-Guzik, Quantum autoencoders for efficient compression of quantum data, Quantum Science and Technology 2, 045001 (2017).
https:/​/​doi.org/​10.1088/​2058-9565/​aa8072

[7] S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sornborger, and P. J. Coles, Quantum-assisted quantum compiling, Quantum 3, 140 (2019).
https:/​/​doi.org/​10.22331/​q-2019-05-13-140

[8] R. LaRose, A. Tikku, É. O'Neel-Judy, L. Cincio, and P. J. Coles, Variational quantum state diagonalization, npj Quantum Information 5, 1 (2019).
https:/​/​doi.org/​10.1038/​s41534-019-0167-6

[9] A. Arrasmith, L. Cincio, A. T. Sornborger, W. H. Zurek, and P. J. Coles, Variational consistent histories as a hybrid algorithm for quantum foundations, Nature communications 10, 1 (2019).
https:/​/​doi.org/​10.1038/​s41467-019-11417-0

[10] M. Cerezo, A. Poremba, L. Cincio, and P. J. Coles, Variational quantum fidelity estimation, Quantum 4, 248 (2020a).
https:/​/​doi.org/​10.22331/​q-2020-03-26-248

[11] C. Cirstoiu, Z. Holmes, J. Iosue, L. Cincio, P. J. Coles, and A. Sornborger, Variational fast forwarding for quantum simulation beyond the coherence time, npj Quantum Information 6, 1 (2020).
https:/​/​doi.org/​10.1038/​s41534-020-00302-0

[12] K. Sharma, S. Khatri, M. Cerezo, and P. J. Coles, Noise resilience of variational quantum compiling, New Journal of Physics 22, 043006 (2020a).
https:/​/​doi.org/​10.1088/​1367-2630/​ab784c

[13] C. Bravo-Prieto, R. LaRose, M. Cerezo, Y. Subasi, L. Cincio, and P. Coles, Variational quantum linear solver, arXiv preprint arXiv:1909.05820 (2019).
arXiv:1909.05820

[14] M. Cerezo, K. Sharma, A. Arrasmith, and P. J. Coles, Variational quantum state eigensolver, arXiv preprint arXiv:2004.01372 (2020b).
arXiv:2004.01372

[15] M. Schuld, I. Sinayskiy, and F. Petruccione, The quest for a quantum neural network, Quantum Information Processing 13, 2567 (2014).
https:/​/​doi.org/​10.1007/​s11128-014-0809-8

[16] I. Cong, S. Choi, and M. D. Lukin, Quantum convolutional neural networks, Nature Physics 15, 1273 (2019).
https:/​/​doi.org/​10.1038/​s41567-019-0648-8

[17] K. Beer, D. Bondarenko, T. Farrelly, T. J. Osborne, R. Salzmann, D. Scheiermann, and R. Wolf, Training deep quantum neural networks, Nature Communications 11, 808 (2020).
https:/​/​doi.org/​10.1038/​s41467-020-14454-2

[18] G. Verdon, J. Pye, and M. Broughton, A universal training algorithm for quantum deep learning, arXiv preprint arXiv:1806.09729 (2018).
arXiv:1806.09729

[19] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren plateaus in quantum neural network training landscapes, Nature communications 9, 1 (2018).
https:/​/​doi.org/​10.1038/​s41467-018-07090-4

[20] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost function dependent barren plateaus in shallow parametrized quantum circuits, Nature Communications 12, 1791 (2021b).
https:/​/​doi.org/​10.1038/​s41467-021-21728-w

[21] K. Sharma, M. Cerezo, L. Cincio, and P. J. Coles, Trainability of dissipative perceptron-based quantum neural networks, arXiv preprint arXiv:2005.12458 (2020b).
arXiv:2005.12458

[22] S. Wang, E. Fontana, M. Cerezo, K. Sharma, A. Sone, L. Cincio, and P. J. Coles, Noise-induced barren plateaus in variational quantum algorithms, arXiv preprint arXiv:2007.14384 (2020).
arXiv:2007.14384

[23] M. Cerezo and P. J. Coles, Higher order derivatives of quantum neural networks with barren plateaus, Quantum Science and Technology 6, 035006 (2021).
https:/​/​doi.org/​10.1088/​2058-9565/​abf51a

[24] Z. Holmes, A. Arrasmith, B. Yan, P. J. Coles, A. Albrecht, and A. T. Sornborger, Barren plateaus preclude learning scramblers, Physical Review Letters 126, 190501 (2021a).
https:/​/​doi.org/​10.1103/​PhysRevLett.126.190501

[25] A. Pesah, M. Cerezo, S. Wang, T. Volkoff, A. T. Sornborger, and P. J. Coles, Absence of barren plateaus in quantum convolutional neural networks, arXiv preprint arXiv:2011.02966 (2020).
arXiv:2011.02966

[26] K. Zhang, M.-H. Hsieh, L. Liu, and D. Tao, Toward trainability of quantum neural networks, arXiv preprint arXiv:2011.06258 (2020).
arXiv:2011.06258

[27] A. Abbas, D. Sutter, C. Zoufal, A. Lucchi, A. Figalli, and S. Woerner, The power of quantum neural networks, Nature Computational Science 1, 403 (2021).
https:/​/​doi.org/​10.1038/​s43588-021-00084-1

[28] C. O. Marrero, M. Kieferová, and N. Wiebe, Entanglement induced barren plateaus, arXiv preprint arXiv:2010.15968 (2020).
arXiv:2010.15968

[29] T. L. Patti, K. Najafi, X. Gao, and S. F. Yelin, Entanglement devised barren plateau mitigation, Physical Review Research 3, 033090 (2021).
https:/​/​doi.org/​10.1103/​PhysRevResearch.3.033090

[30] A. Uvarov and J. D. Biamonte, On barren plateaus and cost function locality in variational quantum algorithms, Journal of Physics A: Mathematical and Theoretical 54, 245301 (2021).
https:/​/​doi.org/​10.1088/​1751-8121/​abfac7

[31] Z. Holmes, K. Sharma, M. Cerezo, and P. J. Coles, Connecting ansatz expressibility to gradient magnitudes and barren plateaus, arXiv preprint arXiv:2101.02138 (2021b).
arXiv:2101.02138

[32] Y. Du, M.-H. Hsieh, T. Liu, S. You, and D. Tao, On the learnability of quantum neural networks, arXiv preprint arXiv:2007.12369 (2020).
arXiv:2007.12369

[33] A. Arrasmith, Z. Holmes, M. Cerezo, and P. J. Coles, Equivalence of quantum barren plateaus to cost concentration and narrow gorges, arXiv preprint arXiv:2104.05868 (2021).
arXiv:2104.05868

[34] G. Verdon, M. Broughton, J. R. McClean, K. J. Sung, R. Babbush, Z. Jiang, H. Neven, and M. Mohseni, Learning to learn with quantum neural networks via classical neural networks, arXiv preprint arXiv:1907.05415 (2019).
arXiv:1907.05415

[35] T. Volkoff and P. J. Coles, Large gradients via correlation in random parameterized quantum circuits, Quantum Science and Technology 6, 025008 (2021).
https:/​/​doi.org/​10.1088/​2058-9565/​abd89

[36] A. Skolik, J. R. McClean, M. Mohseni, P. van der Smagt, and M. Leib, Layerwise learning for quantum neural networks, Quantum Machine Intelligence 3, 1 (2021).
https:/​/​doi.org/​10.1007/​s42484-020-00036-4

[37] E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, An initialization strategy for addressing barren plateaus in parametrized quantum circuits, Quantum 3, 214 (2019).
https:/​/​doi.org/​10.22331/​q-2019-12-09-214

[38] E. Campos, A. Nasrallah, and J. Biamonte, Abrupt transitions in variational quantum circuit training, Physical Review A 103, 032607 (2021).
https:/​/​doi.org/​10.1103/​PhysRevA.103.032607

[39] E. Farhi and H. Neven, Classification with quantum neural networks on near term processors, arXiv preprint arXiv:1802.06002 (2018).
arXiv:1802.06002

[40] N. Killoran, T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, and S. Lloyd, Continuous-variable quantum neural networks, Physical Review Research 1, 033063 (2019).
https:/​/​doi.org/​10.1103/​PhysRevResearch.1.033063

[41] J. M. Kübler, A. Arrasmith, L. Cincio, and P. J. Coles, An adaptive optimizer for measurement-frugal variational algorithms, Quantum 4, 263 (2020).
https:/​/​doi.org/​10.22331/​q-2020-05-11-263

[42] R. Sweke, F. Wilde, J. J. Meyer, M. Schuld, P. K. Fährmann, B. Meynard-Piganeau, and J. Eisert, Stochastic gradient descent for hybrid quantum-classical optimization, Quantum 4, 314 (2020).
https:/​/​doi.org/​10.22331/​q-2020-08-31-314

[43] A. Arrasmith, L. Cincio, R. D. Somma, and P. J. Coles, Operator sampling for shot-frugal optimization in variational algorithms, arXiv preprint arXiv:2004.06252 (2020).
arXiv:2004.06252

[44] J. Stokes, J. Izaac, N. Killoran, and G. Carleo, Quantum natural gradient, Quantum 4, 269 (2020).
https:/​/​doi.org/​10.22331/​q-2020-05-25-269

[45] J. A. Nelder and R. Mead, A simplex method for function minimization, The computer journal 7, 308 (1965).
https:/​/​doi.org/​10.1093/​comjnl/​7.4.308

[46] R. R. Barton and J. S. Ivey Jr, Modifications of the nelder-mead simplex method for stochastic simulation response optimization, 1991 Winter Simulation Conference Proceedings (1991), 10.1109/​WSC.1991.185709.
https:/​/​doi.org/​10.1109/​WSC.1991.185709

[47] X. Huang, Robust simplex algorithm for online optimization, Physical Review Accelerators and Beams 21, 104601 (2018).
https:/​/​doi.org/​10.1103/​PhysRevAccelBeams.21.104601

[48] M. J. Powell, An efficient method for finding the minimum of a function of several variables without calculating derivatives, The computer journal 7, 155 (1964).
https:/​/​doi.org/​10.1093/​comjnl/​7.2.155

[49] R. P. Brent, Algorithms for minimization without derivatives (Courier Corporation, 2013).

[50] M. J. Powell, in Advances in optimization and numerical analysis (Springer, 1994) pp. 51–67.
https:/​/​doi.org/​10.1007/​978-94-015-8330-5_4

[51] A. W. Harrow and R. A. Low, Random quantum circuits are approximate 2-designs, Communications in Mathematical Physics 291, 257 (2009).
https:/​/​doi.org/​10.1007/​s00220-009-0873-6

[52] F. G. Brandao, A. W. Harrow, and M. Horodecki, Local random quantum circuits are approximate polynomial-designs, Communications in Mathematical Physics 346, 397 (2016).
https:/​/​doi.org/​10.1007/​s00220-016-2706-8

[53] A. Harrow and S. Mehraban, Approximate unitary $ t $-designs by short random quantum circuits using nearest-neighbor and long-range gates, arXiv preprint arXiv:1809.06957 (2018).
arXiv:1809.06957

[54] J. Močkus, in Optimization techniques IFIP technical conference (Springer, 1975) pp. 400–404.
https:/​/​doi.org/​10.1007/​978-3-662-38527-2_55

[55] A. Anand, M. Degroote, and A. Aspuru-Guzik, Natural evolutionary strategies for variational quantum computation, Machine Learning: Science and Technology (2021), 10.1088/​2632-2153/​abf3ac.
https:/​/​doi.org/​10.1088/​2632-2153/​abf3ac

Cited by

[1] M. Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C. Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R. McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, and Patrick J. Coles, "Variational Quantum Algorithms", arXiv:2012.09265.

[2] Kishor Bharti, Alba Cervera-Lierta, Thi Ha Kyaw, Tobias Haug, Sumner Alperin-Lea, Abhinav Anand, Matthias Degroote, Hermanni Heimonen, Jakob S. Kottmann, Tim Menke, Wai-Keong Mok, Sukin Sim, Leong-Chuan Kwek, and Alán Aspuru-Guzik, "Noisy intermediate-scale quantum (NISQ) algorithms", arXiv:2101.08448.

[3] M. Cerezo, Akira Sone, Tyler Volkoff, Lukasz Cincio, and Patrick J. Coles, "Cost function dependent barren plateaus in shallow parametrized quantum circuits", Nature Communications 12, 1791 (2021).

[4] Samson Wang, Enrico Fontana, M. Cerezo, Kunal Sharma, Akira Sone, Lukasz Cincio, and Patrick J. Coles, "Noise-Induced Barren Plateaus in Variational Quantum Algorithms", arXiv:2007.14384.

[5] Michael Broughton, Guillaume Verdon, Trevor McCourt, Antonio J. Martinez, Jae Hyeon Yoo, Sergei V. Isakov, Philip Massey, Ramin Halavati, Murphy Yuezhen Niu, Alexander Zlokapa, Evan Peters, Owen Lockwood, Andrea Skolik, Sofiene Jerbi, Vedran Dunjko, Martin Leib, Michael Streif, David Von Dollen, Hongxiang Chen, Shuxiang Cao, Roeland Wiersema, Hsin-Yuan Huang, Jarrod R. McClean, Ryan Babbush, Sergio Boixo, Dave Bacon, Alan K. Ho, Hartmut Neven, and Masoud Mohseni, "TensorFlow Quantum: A Software Framework for Quantum Machine Learning", arXiv:2003.02989.

[6] Zoë Holmes, Kunal Sharma, M. Cerezo, and Patrick J. Coles, "Connecting ansatz expressibility to gradient magnitudes and barren plateaus", arXiv:2101.02138.

[7] M. Cerezo and Patrick J. Coles, "Higher order derivatives of quantum neural networks with barren plateaus", Quantum Science and Technology 6 3, 035006 (2021).

[8] Joe Gibbs, Kaitlin Gili, Zoë Holmes, Benjamin Commeau, Andrew Arrasmith, Lukasz Cincio, Patrick J. Coles, and Andrew Sornborger, "Long-time simulations with high fidelity on quantum hardware", arXiv:2102.04313.

[9] Taylor L. Patti, Khadijeh Najafi, Xun Gao, and Susanne F. Yelin, "Entanglement devised barren plateau mitigation", Physical Review Research 3 3, 033090 (2021).

[10] Zoë Holmes, Andrew Arrasmith, Bin Yan, Patrick J. Coles, Andreas Albrecht, and Andrew T. Sornborger, "Barren Plateaus Preclude Learning Scramblers", Physical Review Letters 126 19, 190501 (2021).

[11] Yong-Xin Yao, Niladri Gomes, Feng Zhang, Cai-Zhuang Wang, Kai-Ming Ho, Thomas Iadecola, and Peter P. Orth, "Adaptive Variational Quantum Dynamics Simulations", PRX Quantum 2 3, 030307 (2021).

[12] M. Bilkis, M. Cerezo, Guillaume Verdon, Patrick J. Coles, and Lukasz Cincio, "A semi-agnostic ansatz with variable structure for quantum machine learning", arXiv:2103.06712.

[13] Tobias Haug and Kishor Bharti, "Generalized Quantum Assisted Simulator", arXiv:2011.14737.

[14] Ranyiliu Chen, Zhixin Song, Xuanqiang Zhao, and Xin Wang, "Variational Quantum Algorithms for Trace Distance and Fidelity Estimation", arXiv:2012.05768.

[15] Andrew Arrasmith, Zoë Holmes, M. Cerezo, and Patrick J. Coles, "Equivalence of quantum barren plateaus to cost concentration and narrow gorges", arXiv:2104.05868.

[16] Martin Larocca, Piotr Czarnik, Kunal Sharma, Gopikrishnan Muraleedharan, Patrick J. Coles, and M. Cerezo, "Diagnosing barren plateaus with tools from quantum optimal control", arXiv:2105.14377.

[17] Abhinav Anand, Matthias Degroote, and Alán Aspuru-Guzik, "Natural Evolutionary Strategies for Variational Quantum Computation", arXiv:2012.00101.

[18] Nikolay V. Tkachenko, James Sud, Yu Zhang, Sergei Tretiak, Petr M. Anisimov, Andrew T. Arrasmith, Patrick J. Coles, Lukasz Cincio, and Pavel A. Dub, "Correlation-Informed Permutation of Qubits for Reducing Ansatz Depth in the Variational Quantum Eigensolver", PRX Quantum 2 2, 020337 (2021).

[19] Dmitry A. Fedorov, Bo Peng, Niranjan Govind, and Yuri Alexeev, "VQE Method: A Short Survey and Recent Developments", arXiv:2103.08505.

[20] Samson Wang, Piotr Czarnik, Andrew Arrasmith, M. Cerezo, Lukasz Cincio, and Patrick J. Coles, "Can Error Mitigation Improve Trainability of Noisy Variational Quantum Algorithms?", arXiv:2109.01051.

[21] Yidong Liao, Min-Hsiu Hsieh, and Chris Ferrie, "Quantum Optimization for Training Quantum Neural Networks", arXiv:2103.17047.

[22] Michael R. Geller, Zoë Holmes, Patrick J. Coles, and Andrew Sornborger, "Experimental quantum learning of a spectral decomposition", Physical Review Research 3 3, 033200 (2021).

[23] Martin Larocca, Nathan Ju, Diego García-Martín, Patrick J. Coles, and M. Cerezo, "Theory of overparametrization in quantum neural networks", arXiv:2109.11676.

[24] Andi Gu, Angus Lowe, Pavel A. Dub, Patrick J. Coles, and Andrew Arrasmith, "Adaptive shot allocation for fast convergence in variational quantum algorithms", arXiv:2108.10434.

[25] Cristian L. Cortes and Stephen K. Gray, "Quantum Krylov subspace algorithms for ground and excited state energy estimation", arXiv:2109.06868.

[26] Louis Schatzki, Andrew Arrasmith, Patrick J. Coles, and M. Cerezo, "Entangled Datasets for Quantum Machine Learning", arXiv:2109.03400.

[27] Nahum Sá, Ivan S. Oliveira, and Itzhak Roditi, "Towards solving the BCS Hamiltonian gap in Near-Term Quantum Computers", arXiv:2105.14936.

[28] Hiroshi C. Watanabe, Rudy Raymond, Yu-ya Ohnishi, Eriko Kaminishi, and Michihiko Sugawara, "Optimizing Parameterized Quantum Circuits with Free-Axis Selection", arXiv:2104.14875.

[29] Zidu Liu, Li-Wei Yu, L. -M. Duan, and Dong-Ling Deng, "The Presence and Absence of Barren Plateaus in Tensor-network Based Machine Learning", arXiv:2108.08312.

[30] Beng Yee Gan, Daniel Leykam, and Dimitris G. Angelakis, "Fock State-enhanced Expressivity of Quantum Machine Learning Models", arXiv:2107.05224.

[31] Tyler Volkoff, Zoë Holmes, and Andrew Sornborger, "Universal compiling and (No-)Free-Lunch theorems for continuous variable quantum learning", arXiv:2105.01049.

[32] Brian Coyle, Mina Doosti, Elham Kashefi, and Niraj Kumar, "Variational Quantum Cloning: Improving Practicality for Quantum Cryptanalysis", arXiv:2012.11424.

[33] Bingzhi Zhang and Quntao Zhuang, "Fast suppression of classification error in variational quantum circuits", arXiv:2107.08026.

[34] Chiara Leadbeater, Louis Sharrock, Brian Coyle, and Marcello Benedetti, "F-Divergences and Cost Function Locality in Generative Modelling with Quantum Circuits", Entropy 23 10, 1281 (2021).

[35] Owen Lockwood, "Optimizing Quantum Variational Circuits with Deep Reinforcement Learning", arXiv:2109.03188.

[36] Raoul Heese, Patricia Bickert, and Astrid Elisa Niederle, "Representation of binary classification trees with binary features by quantum circuits", arXiv:2108.13207.

[37] Maria Kieferova, Ortiz Marrero Carlos, and Nathan Wiebe, "Quantum Generative Training Using Rényi Divergences", arXiv:2106.09567.

The above citations are from SAO/NASA ADS (last updated successfully 2021-10-24 18:17:34). The list may be incomplete as not all publishers provide suitable and complete citation data.

On Crossref's cited-by service no data on citing works was found (last attempt 2021-10-24 18:17:32).