On Quantum Speedups for Nonconvex Optimization via Quantum Tunneling Walks

Yizhou Liu1, Weijie J. Su2, and Tongyang Li3,4

1Department of Engineering Mechanics, Tsinghua University, 100084 Beijing, China
2Department of Statistics and Data Science, University of Pennsylvania
3Center on Frontiers of Computing Studies, Peking University, 100871 Beijing, China
4School of Computer Science, Peking University, 100871 Beijing, China

Find this paper interesting or want to discuss? Scite or leave a comment on SciRate.

Abstract

Classical algorithms are often not effective for solving nonconvex optimization problems where local minima are separated by high barriers. In this paper, we explore possible quantum speedups for nonconvex optimization by leveraging the $global$ effect of quantum tunneling. Specifically, we introduce a quantum algorithm termed the quantum tunneling walk (QTW) and apply it to nonconvex problems where local minima are approximately global minima. We show that QTW achieves quantum speedup over classical stochastic gradient descents (SGD) when the barriers between different local minima are high but thin and the minima are flat. Based on this observation, we construct a specific double-well landscape, where classical algorithms cannot efficiently hit one target well knowing the other well but QTW can when given proper initial states near the known well. Finally, we corroborate our findings with numerical experiments.

Classical algorithms are often not effective for solving nonconvex optimization problems where local minima are separated by high barriers. In this paper, we explore possible quantum speedups for nonconvex optimization by leveraging the global effect of quantum tunneling. Specifically, we introduce a quantum algorithm termed the quantum tunneling walk (QTW) and apply it to nonconvex problems where local minima are approximately global minima. We show that QTW achieves quantum speedup over classical stochastic gradient descents (SGD) when the barriers between different local minima are high but thin and the minima are flat. Based on this observation, we construct a specific double-well landscape, where classical algorithms cannot efficiently hit one target well knowing the other well but QTW can when given proper initial states near the known well. Finally, we corroborate our findings with numerical experiments.

► BibTeX data

► References

[1] Zeyuan Allen-Zhu and Yuanzhi Li. Neon2: Finding local minima via first-order oracles. In Advances in Neural Information Processing Systems, pages 3716–3726, 2018. URL http:/​/​papers.neurips.cc/​paper/​7629-neon2-finding-local-minima-via-first-order-oracles.pdf. arXiv:1711.06673.
arXiv:1711.06673
http:/​/​papers.neurips.cc/​paper/​7629-neon2-finding-local-minima-via-first-order-oracles.pdf

[2] Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham M Kakade, and Matus Telgarsky. Tensor decompositions for learning latent variable models. Journal of machine learning research, 15: 2773–2832, 2014. URL https:/​/​jmlr.org/​papers/​volume15/​anandkumar14b/​. arXiv:1210.7559v4.
arXiv:1210.7559v4
https:/​/​jmlr.org/​papers/​volume15/​anandkumar14b/​

[3] Ben Andrews and Julie Clutterbuck. Proof of the fundamental gap conjecture. Journal of the American Mathematical Society, 24 (3): 899–916, 2011. ISSN 08940347, 10886834. URL http:/​/​www.jstor.org/​stable/​23072145. arXiv:1006.1686.
arXiv:1006.1686
http:/​/​www.jstor.org/​stable/​23072145

[4] Joran van Apeldoorn and András Gilyén. Improvements in quantum SDP-solving with applications. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, volume 132 of Leibniz International Proceedings in Informatics (LIPIcs), pages 99:1–99:15. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019. 10.4230/​LIPIcs.ICALP.2019.99. arXiv:1804.05058.
https:/​/​doi.org/​10.4230/​LIPIcs.ICALP.2019.99
arXiv:1804.05058

[5] Joran van Apeldoorn, András Gilyén, Sander Gribling, and Ronald de Wolf. Quantum SDP-solvers: Better upper and lower bounds. In Proceedings of the 58th Annual Symposium on Foundations of Computer Science. IEEE, 2017. 10.1109/​FOCS.2017.44. arXiv:1705.01843.
https:/​/​doi.org/​10.1109/​FOCS.2017.44
arXiv:1705.01843

[6] Joran van Apeldoorn, András Gilyén, Sander Gribling, and Ronald de Wolf. Convex optimization using quantum oracles. Quantum, 4: 220, 2020. 10.22331/​q-2020-01-13-220. arXiv:1809.00643.
https:/​/​doi.org/​10.22331/​q-2020-01-13-220
arXiv:1809.00643

[7] Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, Joseph C. Bardin, Rami Barends, Sergio Boixo, Michael Broughton, Bob B. Buckley, David A. Buell, Brian Burkett, Nicholas Bushnell, Yu Chen, Zijun Chen, Benjamin Chiaro, Roberto Collins, William Courtney, Sean Demura, Andrew Dunsworth, Edward Farhi, Austin Fowler, Brooks Foxen, Craig Gidney, Marissa Giustina, Rob Graff, Steve Habegger, Matthew P. Harrigan, Alan Ho, Sabrina Hong, Trent Huang, William J. Huggins, Lev Ioffe, Sergei V. Isakov, Evan Jeffrey, Zhang Jiang, Cody Jones, Dvir Kafri, Kostyantyn Kechedzhi, Julian Kelly, Seon Kim, Paul V. Klimov, Alexander Korotkov, Fedor Kostritsa, David Landhuis, Pavel Laptev, Mike Lindmark, Erik Lucero, Orion Martin, John M. Martinis, Jarrod R. McClean, Matt McEwen, Anthony Megrant, Xiao Mi, Masoud Mohseni, Wojciech Mruczkiewicz, Josh Mutus, Ofer Naaman, Matthew Neeley, Charles Neill, Hartmut Neven, Murphy Yuezhen Niu, Thomas E. O’Brien, Eric Ostby, Andre Petukhov, Harald Putterman, Chris Quintana, Pedram Roushan, Nicholas C. Rubin, Daniel Sank, Kevin J. Satzinger, Vadim Smelyanskiy, Doug Strain, Kevin J. Sung, Marco Szalay, Tyler Y. Takeshita, Amit Vainsencher, Theodore White, Nathan Wiebe, Z. Jamie Yao, Ping Yeh, and Adam Zalcman. Hartree-Fock on a superconducting qubit quantum computer. Science, 369 (6507): 1084–1089, 2020. 10.1126/​science.abb9811. URL https:/​/​science.sciencemag.org/​content/​369/​6507/​1084.abstract. arXiv:2004.04174.
https:/​/​doi.org/​10.1126/​science.abb9811
arXiv:2004.04174
https:/​/​science.sciencemag.org/​content/​369/​6507/​1084.abstract

[8] Yosi Atia and Shantanav Chakraborty. Improved upper bounds for the hitting times of quantum walks. Physical Review A, 104: 032215, Sep 2021. ISSN 2469-9934. 10.1103/​physreva.104.032215. URL http:/​/​dx.doi.org/​10.1103/​PhysRevA.104.032215. arXiv:2005.04062v5.
https:/​/​doi.org/​10.1103/​physreva.104.032215
arXiv:2005.04062v5

[9] Carlo Baldassi and Riccardo Zecchina. Efficiency of quantum vs. classical annealing in nonconvex learning problems. Proceedings of the National Academy of Sciences, 115 (7): 1457–1462, Jan 2018. ISSN 1091-6490. 10.1073/​pnas.1711456115. URL http:/​/​dx.doi.org/​10.1073/​pnas.1711456115. arXiv:1706.08470.
https:/​/​doi.org/​10.1073/​pnas.1711456115
arXiv:1706.08470

[10] Charles H. Bennett, Ethan Bernstein, Gilles Brassard, and Umesh Vazirani. Strengths and weaknesses of quantum computing. SIAM Journal on Computing, 26 (5): 1510–1523, 1997. 10.1137/​S0097539796300933. URL https:/​/​doi.org/​10.1137/​S0097539796300933. arXiv:quant-ph/​9701001.
https:/​/​doi.org/​10.1137/​S0097539796300933
arXiv:quant-ph/9701001

[11] Michael Betancourt, Michael I. Jordan, and Ashia C Wilson. On symplectic optimization, 2018. arXiv:1802.03653.
arXiv:1802.03653

[12] Sergio Boixo and Rolando D. Somma. Necessary condition for the quantum adiabatic approximation. Physical Review A, 81 (3): 032308, 2010. 10.1103/​PhysRevA.81.032308. URL https:/​/​journals.aps.org/​pra/​abstract/​10.1103/​PhysRevA.81.032308. arXiv:0911.1362.
https:/​/​doi.org/​10.1103/​PhysRevA.81.032308
arXiv:0911.1362

[13] Fernando G.S.L. Brandão and Krysta Svore. Quantum speed-ups for semidefinite programming. In Proceedings of the 58th Annual Symposium on Foundations of Computer Science, pages 415–426, 2017. 10.1109/​FOCS.2017.45. arXiv:1609.05537.
https:/​/​doi.org/​10.1109/​FOCS.2017.45
arXiv:1609.05537

[14] Fernando G.S.L. Brandão, Amir Kalev, Tongyang Li, Cedric Yen-Yu Lin, Krysta M. Svore, and Xiaodi Wu. Quantum SDP solvers: Large speed-ups, optimality, and applications to quantum learning. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, volume 132 of Leibniz International Proceedings in Informatics (LIPIcs), pages 27:1–27:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019. 10.4230/​LIPIcs.ICALP.2019.27. arXiv:1710.02581.
https:/​/​doi.org/​10.4230/​LIPIcs.ICALP.2019.27
arXiv:1710.02581

[15] Shouvanik Chakrabarti, Andrew M. Childs, Tongyang Li, and Xiaodi Wu. Quantum algorithms and lower bounds for convex optimization. Quantum, 4: 221, 2020. 10.22331/​q-2020-01-13-221. arXiv:1809.01731.
https:/​/​doi.org/​10.22331/​q-2020-01-13-221
arXiv:1809.01731

[16] Shantanav Chakraborty, Kyle Luh, and Jérémie Roland. How fast do quantum walks mix? Physical Review Letters, 124: 050501, Feb 2020. 10.1103/​PhysRevLett.124.050501. URL https:/​/​link.aps.org/​doi/​10.1103/​PhysRevLett.124.050501. arXiv:2001.06305v1.
https:/​/​doi.org/​10.1103/​PhysRevLett.124.050501
arXiv:2001.06305v1

[17] Pratik Chaudhari and Stefano Soatto. Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks. In 2018 Information Theory and Applications Workshop (ITA), pages 1–10, 2018. 10.1109/​ITA.2018.8503224. arXiv:1710.11029v2.
https:/​/​doi.org/​10.1109/​ITA.2018.8503224
arXiv:1710.11029v2

[18] Andrew M. Childs, Richard Cleve, Enrico Deotto, Edward Farhi, Sam Gutmann, and Daniel A. Spielman. Exponential algorithmic speedup by a quantum walk. In Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing, STOC '03, page 59–68, New York, NY, USA, 2003. Association for Computing Machinery. ISBN 1581136749. 10.1145/​780542.780552. URL https:/​/​doi.org/​10.1145/​780542.780552. arXiv:quant-ph/​0209131v2.
https:/​/​doi.org/​10.1145/​780542.780552
arXiv:quant-ph/0209131v2

[19] Andrew M. Childs, Jin-Peng Liu, and Aaron Ostrander. High-precision quantum algorithms for partial differential equations. Quantum, 5: 574, Nov 2021. ISSN 2521-327X. 10.22331/​q-2021-11-10-574. URL http:/​/​dx.doi.org/​10.22331/​q-2021-11-10-574. arXiv:2002.07868.
https:/​/​doi.org/​10.22331/​q-2021-11-10-574
arXiv:2002.07868

[20] Pierre Comon, Xavier Luciani, and André L. F. De Almeida. Tensor decompositions, alternating least squares and other tales. Journal of Chemometrics, 23: 393–405, August 2009. 10.1002/​cem.1236. URL https:/​/​hal.archives-ouvertes.fr/​hal-00410057.
https:/​/​doi.org/​10.1002/​cem.1236
https:/​/​hal.archives-ouvertes.fr/​hal-00410057

[21] Pedro C. S. Costa, Stephen Jordan, and Aaron Ostrander. Quantum algorithm for simulating the wave equation. Physical Review A, 99: 012323, Jan 2019. 10.1103/​PhysRevA.99.012323. URL https:/​/​link.aps.org/​doi/​10.1103/​PhysRevA.99.012323. arXiv:1711.05394.
https:/​/​doi.org/​10.1103/​PhysRevA.99.012323
arXiv:1711.05394

[22] Christopher Criscitiello and Nicolas Boumal. Negative curvature obstructs acceleration for geodesically convex optimization, even with exact first-order oracles, 2021. arXiv:2111.13263.
arXiv:2111.13263

[23] Elizabeth Crosson and Aram W. Harrow. Simulated quantum annealing can be exponentially faster than classical simulated annealing. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 714–723. IEEE, Oct 2016. 10.1109/​focs.2016.81. URL http:/​/​dx.doi.org/​10.1109/​FOCS.2016.81. arXiv:1601.03030.
https:/​/​doi.org/​10.1109/​focs.2016.81
arXiv:1601.03030

[24] Mouez Dimassi and Johannes Sjöstrand. Spectral Asymptotics in the Semi-Classical Limit. London Mathematical Society Lecture Note Series. Cambridge University Press, 1999. 10.1017/​CBO9780511662195.
https:/​/​doi.org/​10.1017/​CBO9780511662195

[25] Felix Draxler, Kambis Veschgini, Manfred Salmhofer, and Fred Hamprecht. Essentially no barriers in neural network energy landscape. In International Conference on Machine Learning, pages 1309–1318. PMLR, 2018. URL http:/​/​proceedings.mlr.press/​v80/​draxler18a.html. arXiv:1803.00885.
arXiv:1803.00885
http:/​/​proceedings.mlr.press/​v80/​draxler18a.html

[26] Runyao Duan. Quantum adiabatic theorem revisited, 2020. arXiv:2003.03063v1.
arXiv:2003.03063v1

[27] John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12 (61): 2121–2159, 2011. URL https:/​/​www.jmlr.org/​papers/​volume12/​duchi11a/​duchi11a.pdf.
https:/​/​www.jmlr.org/​papers/​volume12/​duchi11a/​duchi11a.pdf

[28] Sepehr Ebadi, Tout T. Wang, Harry Levine, Alexander Keesling, Giulia Semeghini, Ahmed Omran, Dolev Bluvstein, Rhine Samajdar, Hannes Pichler, Wen Wei Ho, Soonwon Choi, Subir Sachdev, Markus Greiner, Vladan Vuletić, and Mikhail D. Lukin. Quantum phases of matter on a 256-atom programmable quantum simulator. Nature, 595 (7866): 227–232, 2021. 10.1038/​s41586-021-03582-4. URL https:/​/​www.nature.com/​articles/​s41586-021-03582-4.
https:/​/​doi.org/​10.1038/​s41586-021-03582-4
https:/​/​www.nature.com/​articles/​s41586-021-03582-4

[29] Cong Fang, Chris Junchi Li, Zhouchen Lin, and Tong Zhang. Spider: Near-optimal non-convex optimization via stochastic path-integrated differential estimator. In Advances in Neural Information Processing Systems, pages 689–699, 2018. URL https:/​/​dl.acm.org/​doi/​abs/​10.5555/​3326943.3327007. arXiv:1807.01695.
arXiv:1807.01695
https:/​/​dl.acm.org/​doi/​abs/​10.5555/​3326943.3327007

[30] Cong Fang, Zhouchen Lin, and Tong Zhang. Sharp analysis for nonconvex SGD escaping from saddle points. In Conference on Learning Theory, pages 1192–1234, 2019. URL http:/​/​proceedings.mlr.press/​v99/​fang19a.html. arXiv:1902.00247.
arXiv:1902.00247
http:/​/​proceedings.mlr.press/​v99/​fang19a.html

[31] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Joshua Lapan, Andrew Lundgren, and Daniel Preda. A quantum adiabatic evolution algorithm applied to random instances of an NP-complete problem. Science, 292 (5516): 472–475, Apr 2001. ISSN 1095-9203. 10.1126/​science.1057726. URL http:/​/​dx.doi.org/​10.1126/​science.1057726. arXiv:quant-ph/​0104129.
https:/​/​doi.org/​10.1126/​science.1057726
arXiv:quant-ph/0104129

[32] A.B. Finnila, M.A. Gomez, C. Sebenik, C. Stenson, and J.D. Doll. Quantum annealing: A new method for minimizing multidimensional functions. Chemical Physics Letters, 219 (5-6): 343–348, Mar 1994. ISSN 0009-2614. 10.1016/​0009-2614(94)00117-0. URL http:/​/​dx.doi.org/​10.1016/​0009-2614(94)00117-0. arXiv:chem-ph/​9404003.
https:/​/​doi.org/​10.1016/​0009-2614(94)00117-0
arXiv:chem-ph/9404003

[33] Mauger François. Symplectic leap frog scheme, 2020. URL https:/​/​www.mathworks.com/​matlabcentral/​fileexchange/​38652-symplectic-leap-frog-scheme. https: /​/​www.mathworks.com /​matlabcentral /​fileexchange /​38652-symplectic-leap-frog-scheme.
https:/​/​www.mathworks.com/​matlabcentral/​fileexchange/​38652-symplectic-leap-frog-scheme

[34] Alan Frieze, Mark Jerrum, and Ravi Kannan. Learning linear transformations. In Proceedings of 37th Conference on Foundations of Computer Science, pages 359–368, 1996. 10.1109/​SFCS.1996.548495.
https:/​/​doi.org/​10.1109/​SFCS.1996.548495

[35] Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry Vetrov, and Andrew Gordon Wilson. Loss surfaces, mode connectivity, and fast ensembling of DNNs. In Advances in Neural Information Processing Systems, pages 8803–8812, 2018. URL https:/​/​dl.acm.org/​doi/​abs/​10.5555/​3327546.3327556. arXiv:1802.10026.
arXiv:1802.10026
https:/​/​dl.acm.org/​doi/​abs/​10.5555/​3327546.3327556

[36] Rong Ge and Tengyu Ma. On the optimization landscape of tensor decompositions. Mathematical Programming, pages 1–47, 2020. ISSN 1436-4646. 10.1007/​s10107-020-01579-x. URL https:/​/​doi.org/​10.1007/​s10107-020-01579-x. arXiv:1706.05598v1.
https:/​/​doi.org/​10.1007/​s10107-020-01579-x
arXiv:1706.05598v1

[37] Rong Ge, Furong Huang, Chi Jin, and Yang Yuan. Escaping from saddle points – online stochastic gradient for tensor decomposition. In Proceedings of the 28th Conference on Learning Theory, volume 40 of Proceedings of Machine Learning Research, pages 797–842, 2015. URL http:/​/​proceedings.mlr.press/​v40/​Ge15. arXiv:1503.02101.
arXiv:1503.02101
http:/​/​proceedings.mlr.press/​v40/​Ge15

[38] Rong Ge, Jason D. Lee, and Tengyu Ma. Matrix completion has no spurious local minimum. In Advances in Neural Information Processing Systems, pages 2981–2989, 2016. URL https:/​/​dl.acm.org/​doi/​abs/​10.5555/​3157382.3157431. arXiv:1605.07272.
arXiv:1605.07272
https:/​/​dl.acm.org/​doi/​abs/​10.5555/​3157382.3157431

[39] Ming Gong, Shiyu Wang, Chen Zha, Ming-Cheng Chen, He-Liang Huang, Yulin Wu, Qingling Zhu, Youwei Zhao, Shaowei Li, Shaojun Guo, Haoran Qian, Yangsen Ye, Fusheng Chen, Chong Ying, Jiale Yu, Daojin Fan, Dachao Wu, Hong Su, Hui Deng, Hao Rong, Kaili Zhang, Sirui Cao, Jin Lin, Yu Xu, Lihua Sun, Cheng Guo, Na Li, Futian Liang, V. M. Bastidas, Kae Nemoto, W. J. Munro, Yong-Heng Huo, Chao-Yang Lu, Cheng-Zhi Peng, Xiaobo Zhu, and Jian-Wei Pan. Quantum walks on a programmable two-dimensional 62-qubit superconducting processor. Science, 372 (6545): 948–952, 2021. 10.1126/​science.abg7812. URL https:/​/​science.sciencemag.org/​content/​372/​6545/​948.abstract. arXiv:2102.02573.
https:/​/​doi.org/​10.1126/​science.abg7812
arXiv:2102.02573
https:/​/​science.sciencemag.org/​content/​372/​6545/​948.abstract

[40] Stephen K. Gray and David E. Manolopoulos. Symplectic integrators tailored to the time‐dependent schrödinger equation. The Journal of Chemical Physics, 104 (18): 7099–7112, 1996. 10.1063/​1.471428. URL https:/​/​doi.org/​10.1063/​1.471428.
https:/​/​doi.org/​10.1063/​1.471428

[41] Bernard Helffer. Semi-Classical Analysis for the Schrödinger Operator and Applications. Lecture Notes in Mathematics. Springer, 1988. 10.1007/​BFb0078115.
https:/​/​doi.org/​10.1007/​BFb0078115

[42] Bernard Helffer and Johannes Sjöstrand. Multiple wells in the semi-classical limit I. Communications in Partial Differential Equations, 9 (4): 337–408, 1984. 10.1080/​03605308408820335.
https:/​/​doi.org/​10.1080/​03605308408820335

[43] Bernard Helffer and Johannes Sjöstrand. Multiple wells in the semi-classical limit III - interaction through non-resonant wells. Mathematische Nachrichten, 124 (1): 263–313, 1985. https:/​/​doi.org/​10.1002/​mana.19851240117. URL https:/​/​onlinelibrary.wiley.com/​doi/​abs/​10.1002/​mana.19851240117.
https:/​/​doi.org/​10.1002/​mana.19851240117

[44] Sepp Hochreiter. The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 6 (02): 107–116, 1998. 10.1142/​S0218488598000094. URL https:/​/​dl.acm.org/​doi/​abs/​10.1142/​S0218488598000094.
https:/​/​doi.org/​10.1142/​S0218488598000094

[45] Aapo Hyvarinen. Fast ICA for noisy data using gaussian moments. In 1999 IEEE International Symposium on Circuits and Systems (ISCAS), volume 5, pages 57–61, 1999. 10.1109/​ISCAS.1999.777510.
https:/​/​doi.org/​10.1109/​ISCAS.1999.777510

[46] Frédéric Hérau, Michael Hitrik, and Johannes Sjöstrand. Tunnel effect and symmetries for kramers–fokker–planck type operators. Journal of the Institute of Mathematics of Jussieu, 10 (3): 567–634, 2011. 10.1017/​S1474748011000028.
https:/​/​doi.org/​10.1017/​S1474748011000028

[47] Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, and Michael I. Jordan. How to escape saddle points efficiently. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pages 1724–1732, 2017. URL http:/​/​proceedings.mlr.press/​v70/​jin17a. arXiv:1703.00887.
arXiv:1703.00887
http:/​/​proceedings.mlr.press/​v70/​jin17a

[48] Chi Jin, Lydia T. Liu, Rong Ge, and Michael I. Jordan. On the local minima of the empirical risk. In Advances in Neural Information Processing Systems, volume 31, page 4901–4910. Curran Associates, Inc., 2018. URL https:/​/​proceedings.neurips.cc/​paper/​2018/​file/​da4902cb0bc38210839714ebdcf0efc3-Paper.pdf. arXiv:1803.09357.
arXiv:1803.09357
https:/​/​proceedings.neurips.cc/​paper/​2018/​file/​da4902cb0bc38210839714ebdcf0efc3-Paper.pdf

[49] Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M Kakade, and Michael I. Jordan. On nonconvex optimization for machine learning: Gradients, stochasticity, and saddle points. Journal of the ACM (JACM), 68 (2): 1–29, 2021. 10.1145/​3418526. URL https:/​/​dl.acm.org/​doi/​abs/​10.1145/​3418526. arXiv:1902.04811.
https:/​/​doi.org/​10.1145/​3418526
arXiv:1902.04811

[50] Michael I. Jordan. Dynamical, symplectic and stochastic perspectives on gradient-based optimization. In Proceedings of the International Congress of Mathematicians: Rio de Janeiro 2018, pages 523–549. World Scientific, 2018. URL https:/​/​doi.org/​10.1142/​9789813272880_0022.
https:/​/​doi.org/​10.1142/​9789813272880_0022

[51] Kenji Kawaguchi, Jiaoyang Huang, and Leslie Pack Kaelbling. Every local minimum value is the global minimum value of induced model in nonconvex machine learning. Neural Computation, 31 (12): 2293–2323, 12 2019. ISSN 0899-7667. 10.1162/​neco_a_01234. URL https:/​/​doi.org/​10.1162/​neco_a_01234. arXiv:1904.03673v3.
https:/​/​doi.org/​10.1162/​neco_a_01234
arXiv:1904.03673v3

[52] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In 3rd International Conference for Learning Representations, 2015. URL https:/​/​openreview.net/​forum?id=8gmWwjFyLj. arXiv:1412.6980.
arXiv:1412.6980
https:/​/​openreview.net/​forum?id=8gmWwjFyLj

[53] Alexei Kitaev and William A. Webb. Wavefunction preparation and resampling using a quantum computer, 2008. arXiv:0801.0342.
arXiv:0801.0342

[54] Bobby Kleinberg, Yuanzhi Li, and Yang Yuan. An alternative view: When does SGD escape local minima? In International Conference on Machine Learning, pages 2698–2707. PMLR, 2018. URL http:/​/​proceedings.mlr.press/​v80/​kleinberg18a.html. arXiv:1802.06175.
arXiv:1802.06175
http:/​/​proceedings.mlr.press/​v80/​kleinberg18a.html

[55] Guy Kornowski and Ohad Shamir. Oracle complexity in nonsmooth nonconvex optimization. In Advances in Neural Information Processing Systems, 2021. URL https:/​/​openreview.net/​forum?id=aMZJBOiOOPg. arXiv:2104.06763v2.
arXiv:2104.06763v2
https:/​/​openreview.net/​forum?id=aMZJBOiOOPg

[56] Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Rong Ge, and Sanjeev Arora. Explaining landscape connectivity of low-cost solutions for multilayer nets. Advances in Neural Information Processing Systems, 32: 14601–14610, 2019. URL http:/​/​papers.nips.cc/​paper/​9602-explaining-landscape-connectivity-of-low-cost-solutions-for-multilayer-nets. arXiv:1906.06247.
arXiv:1906.06247
http:/​/​papers.nips.cc/​paper/​9602-explaining-landscape-connectivity-of-low-cost-solutions-for-multilayer-nets

[57] Harold J. Kushner and G. George Yin. Stochastic Approximation and Recursive Algorithms and Applications, volume 35. Springer Science & Business Media, 2003. 10.1007/​978-1-4471-4285-0_3.
https:/​/​doi.org/​10.1007/​978-1-4471-4285-0_3

[58] Keren Li, Shijie Wei, Pan Gao, Feihao Zhang, Zengrong Zhou, Tao Xin, Xiaoting Wang, Patrick Rebentrost, and Guilu Long. Optimizing a polynomial function on a quantum processor. npj Quantum Information, 7 (1): 1–7, 2021a. 10.1038/​s41534-020-00351-5. arXiv:1804.05231.
https:/​/​doi.org/​10.1038/​s41534-020-00351-5
arXiv:1804.05231

[59] Zhiyuan Li, Sadhika Malladi, and Sanjeev Arora. On the validity of modeling SGD with stochastic differential equations (SDEs). In Advances in Neural Information Processing Systems, 2021b. URL https:/​/​openreview.net/​forum?id=goEdyJ_nVQI. arXiv:2102.12470.
arXiv:2102.12470
https:/​/​openreview.net/​forum?id=goEdyJ_nVQI

[60] Guang Hao Low and Nathan Wiebe. Hamiltonian simulation in the interaction picture, 2019. URL https:/​/​arxiv.org/​abs/​1805.00675v2. arXiv:1805.00675v2.
arXiv:1805.00675v2

[61] Cong Ma, Kaizheng Wang, Yuejie Chi, and Yuxin Chen. Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval and matrix completion. In International Conference on Machine Learning, pages 3345–3354. PMLR, 2018. URL http:/​/​proceedings.mlr.press/​v80/​ma18c.html. arXiv:1711.10467.
arXiv:1711.10467
http:/​/​proceedings.mlr.press/​v80/​ma18c.html

[62] Tengyu Ma. Why Do Local Methods Solve Nonconvex Problems?, page 465–485. Cambridge University Press, 2021. 10.1017/​9781108637435.027. arXiv:2103.13462.
https:/​/​doi.org/​10.1017/​9781108637435.027
arXiv:2103.13462

[63] Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, and Michael I. Jordan. Sampling can be faster than optimization. Proceedings of the National Academy of Sciences, 116 (42): 20881–20885, 2019. URL https:/​/​www.pnas.org/​content/​116/​42/​20881.short. arXiv:.
https:/​/​doi.org/​10.1073/​pnas.1820003116
https:/​/​www.pnas.org/​content/​116/​42/​20881.short

[64] Peter A. Markowich and Cédric Villani. On the trend to equilibrium for the Fokker-Planck equation: An interplay between physics and functional analysis. In Physics and Functional Analysis, Matematica Contemporanea (SBM) 19. Citeseer, 1999. URL http:/​/​citeseerx.ist.psu.edu/​viewdoc/​summary?doi=10.1.1.35.2278.
http:/​/​citeseerx.ist.psu.edu/​viewdoc/​summary?doi=10.1.1.35.2278

[65] Laurent Michel. About small eigenvalues of the Witten Laplacian. Pure and Applied Analysis, 1 (2): 149 – 206, 2019. 10.2140/​paa.2019.1.149. URL https:/​/​doi.org/​10.2140/​paa.2019.1.149. arXiv:1702.01837.
https:/​/​doi.org/​10.2140/​paa.2019.1.149
arXiv:1702.01837

[66] Siddharth Muthukrishnan, Tameem Albash, and Daniel A. Lidar. Tunneling and speedup in quantum optimization for permutation-symmetric problems. Physical Review X, 6: 031010, Jul 2016. ISSN 2160-3308. 10.1103/​physrevx.6.031010. URL http:/​/​dx.doi.org/​10.1103/​PhysRevX.6.031010. arXiv:1511.03910.
https:/​/​doi.org/​10.1103/​physrevx.6.031010
arXiv:1511.03910

[67] Quynh Nguyen. On connected sublevel sets in deep learning. In International Conference on Machine Learning, pages 4790–4799. PMLR, 2019. URL http:/​/​proceedings.mlr.press/​v97/​nguyen19a.html. arXiv:1901.07417.
arXiv:1901.07417
http:/​/​proceedings.mlr.press/​v97/​nguyen19a.html

[68] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, 2010. 10.1017/​CBO9780511976667.
https:/​/​doi.org/​10.1017/​CBO9780511976667

[69] Grigorios A. Pavliotis. Stochastic processes and applications: diffusion processes, the Fokker-Planck and Langevin equations, volume 60. Springer, 2014. 10.1007/​978-1-4939-1323-7.
https:/​/​doi.org/​10.1007/​978-1-4939-1323-7

[70] Qing Qu, Yuexiang Zhai, Xiao Li, Yuqian Zhang, and Zhihui Zhu. Analysis of the optimization landscapes for overcomplete representation learning, 2019. arXiv:1912.02427.
arXiv:1912.02427

[71] Gianluca Rastelli. Semiclassical formula for quantum tunneling in asymmetric double-well potentials. Physical Review A, 86: 012106, Jul 2012. 10.1103/​PhysRevA.86.012106. URL https:/​/​link.aps.org/​doi/​10.1103/​PhysRevA.86.012106. arXiv:1205.0366.
https:/​/​doi.org/​10.1103/​PhysRevA.86.012106
arXiv:1205.0366

[72] Arthur G. Rattew, Yue Sun, Pierre Minssen, and Marco Pistoia. The efficient preparation of normal distributions in quantum registers. Quantum, 5: 609, 2021. 10.22331/​q-2021-12-23-609. URL https:/​/​quantum-journal.org/​papers/​q-2021-12-23-609/​. arXiv:2009.06601.
https:/​/​doi.org/​10.22331/​q-2021-12-23-609
arXiv:2009.06601
https:/​/​quantum-journal.org/​papers/​q-2021-12-23-609/​

[73] Patrick Rebentrost, Maria Schuld, Leonard Wossnig, Francesco Petruccione, and Seth Lloyd. Quantum gradient descent and Newton's method for constrained polynomial optimization. New Journal of Physics, 21 (7): 073023, 2019. 10.1088/​1367-2630/​ab2a9e. arXiv:1612.01789.
https:/​/​doi.org/​10.1088/​1367-2630/​ab2a9e
arXiv:1612.01789

[74] Burak Şahinoğlu and Rolando D. Somma. Hamiltonian simulation in the low-energy subspace. npj Quantum Information, 7 (1): 1–5, 2021. 10.1038/​s41534-021-00451-w. URL https:/​/​www.nature.com/​articles/​s41534-021-00451-w. arXiv:2006.02660.
https:/​/​doi.org/​10.1038/​s41534-021-00451-w
arXiv:2006.02660
https:/​/​www.nature.com/​articles/​s41534-021-00451-w

[75] J. M. Schmidt, A. N. Cleland, and John Clarke. Resonant tunneling in small current-biased Josephson junctions. Physical Review B, 43: 229–238, Jan 1991. 10.1103/​PhysRevB.43.229. URL https:/​/​link.aps.org/​doi/​10.1103/​PhysRevB.43.229.
https:/​/​doi.org/​10.1103/​PhysRevB.43.229

[76] Alexander Shevchenko and Marco Mondelli. Landscape connectivity and dropout stability of SGD solutions for over-parameterized neural networks. In International Conference on Machine Learning, pages 8773–8784. PMLR, 2020. URL http:/​/​proceedings.mlr.press/​v119/​shevchenko20a.html. arXiv:1912.10095.
arXiv:1912.10095
http:/​/​proceedings.mlr.press/​v119/​shevchenko20a.html

[77] Bin Shi, Weijie J. Su, and Michael I. Jordan. On learning rates and Schrödinger operators, 2020. arXiv:2004.06977.
arXiv:2004.06977

[78] Bin Shi, Simon S. Du, Michael I. Jordan, and Weijie J. Su. Understanding the acceleration phenomenon via high-resolution differential equations. Mathematical Programming, pages 1–70, 2021. 10.1007/​s10107-021-01681-8. URL https:/​/​doi.org/​10.1007/​s10107-021-01681-8. arXiv:1810.08907.
https:/​/​doi.org/​10.1007/​s10107-021-01681-8
arXiv:1810.08907

[79] Weijie Su, Stephen Boyd, and Emmanuel J. Candes. A differential equation for modeling Nesterov's accelerated gradient method: Theory and insights. The Journal of Machine Learning Research, 17 (1): 5312–5354, 2016. 10.5555/​2946645.3053435. URL https:/​/​dl.acm.org/​doi/​abs/​10.5555/​2946645.3053435. arXiv:1503.01243.
https:/​/​doi.org/​10.5555/​2946645.3053435
arXiv:1503.01243

[80] Ruoyu Sun. Optimization for deep learning: theory and algorithms, 2019. arXiv:1912.08957.
arXiv:1912.08957

[81] Kunal Talwar. Computational separations between sampling and optimization. Advances in Neural Information Processing Systems, 32: 15023–15033, 2019. URL http:/​/​papers.nips.cc/​paper/​9639-computational-separations-between-sampling-and-optimization. arXiv:1911.02074.
arXiv:1911.02074
http:/​/​papers.nips.cc/​paper/​9639-computational-separations-between-sampling-and-optimization

[82] Hao Tang, Xiao-Feng Lin, Zhen Feng, Jing-Yuan Chen, Jun Gao, Ke Sun, Chao-Yue Wang, Peng-Cheng Lai, Xiao-Yun Xu, Yao Wang, Lu-Feng Qiao, Ai-Lin Yang, and Xian-Min Jin. Experimental two-dimensional quantum walk on a photonic chip. Science advances, 4 (5): eaat3174, 2018. 10.1126/​sciadv.aat3174. URL https:/​/​www.science.org/​doi/​10.1126/​sciadv.aat3174. arXiv:1704.08242.
https:/​/​doi.org/​10.1126/​sciadv.aat3174
arXiv:1704.08242

[83] Cédric Villani. Hypocoercivity, volume 202 of Memoirs of the American Mathematical Society. American Mathematical Society, 2009. 10.1090/​S0065-9266-09-00567-5. arXiv:math/​0609050.
https:/​/​doi.org/​10.1090/​S0065-9266-09-00567-5
arXiv:math/0609050

[84] Andre Wibisono, Ashia C. Wilson, and Michael I. Jordan. A variational perspective on accelerated methods in optimization. Proceedings of the National Academy of Sciences, 113 (47): E7351–E7358, 2016. 10.1073/​pnas.1614734113. URL https:/​/​doi.org/​10.1073/​pnas.1614734113. arXiv:1603.04245.
https:/​/​doi.org/​10.1073/​pnas.1614734113
arXiv:1603.04245

[85] Chenyi Zhang and Tongyang Li. Escape saddle points by a simple gradient-descent based algorithm. In Advances in Neural Information Processing Systems, volume 34, 2021. URL https:/​/​openreview.net/​forum?id=lEf52hTHq0Q. arXiv:2111.14069.
arXiv:2111.14069
https:/​/​openreview.net/​forum?id=lEf52hTHq0Q

[86] Chenyi Zhang, Jiaqi Leng, and Tongyang Li. Quantum algorithms for escaping from saddle points. Quantum, 5: 529, 2021a. 10.22331/​q-2021-08-20-529. arXiv:2007.10253.
https:/​/​doi.org/​10.22331/​q-2021-08-20-529
arXiv:2007.10253

[87] Kaining Zhang, Min-Hsiu Hsieh, Liu Liu, and Dacheng Tao. Quantum algorithm for finding the negative curvature direction in non-convex optimization, 2019. arXiv:1909.07622.
arXiv:1909.07622

[88] Yuqian Zhang, Qing Qu, and John Wright. From symmetry to geometry: Tractable nonconvex problems, 2021b. arXiv:2007.06753.
arXiv:2007.06753

Cited by

[1] Ioannis Liliopoulos, Georgios D. Varsamis, and Ioannis G. Karafyllidis, "Discrete-time quantum walk-based optimization algorithm", Quantum Information Processing 23 1, 23 (2024).

[2] Dylan Herman, Cody Googin, Xiaoyuan Liu, Yue Sun, Alexey Galda, Ilya Safro, Marco Pistoia, and Yuri Alexeev, "Quantum computing for finance", Nature Reviews Physics 5 8, 450 (2023).

[3] Chenyi Zhang and Tongyang Li, "Quantum Lower Bounds for Finding Stationary Points of Nonconvex Functions", arXiv:2212.03906, (2022).

[4] Weiyuan Gong, Chenyi Zhang, and Tongyang Li, "Robustness of Quantum Algorithms for Nonconvex Optimization", arXiv:2212.02548, (2022).

[5] Aaron Sidford and Chenyi Zhang, "Quantum speedups for stochastic optimization", arXiv:2308.01582, (2023).

The above citations are from Crossref's cited-by service (last updated successfully 2024-02-26 11:03:49) and SAO/NASA ADS (last updated successfully 2024-02-26 11:03:50). The list may be incomplete as not all publishers provide suitable and complete citation data.