Quantum agents in the Gym: a variational quantum algorithm for deep Q-learning

Andrea Skolik1,2, Sofiene Jerbi3, and Vedran Dunjko1

1Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
2Volkswagen Data:Lab, Ungererstraße 69, 80805 Munich, Germany
3Institute for Theoretical Physics, University of Innsbruck, Technikerstr. 21a, A-6020 Innsbruck, Austria


Abstract

Quantum machine learning (QML) has been identified as one of the key fields that could reap advantages from near-term quantum devices, next to optimization and quantum chemistry. Research in this area has focused primarily on variational quantum algorithms (VQAs), and several proposals to enhance supervised, unsupervised and reinforcement learning (RL) algorithms with VQAs have been put forward. Out of the three, RL is the least studied, and it is still an open question whether VQAs can be competitive with state-of-the-art classical algorithms based on neural networks (NNs) even on simple benchmark tasks. In this work, we introduce a training method for parametrized quantum circuits (PQCs) that can be used to solve RL tasks for discrete and continuous state spaces based on the deep Q-learning algorithm. We investigate which architectural choices for quantum Q-learning agents are most important for successfully solving certain types of environments by performing ablation studies for a number of different data encoding and readout strategies. We provide insight into why the performance of a VQA-based Q-learning algorithm crucially depends on the observables of the quantum model and show how to choose suitable observables based on the learning task at hand. To compare our model against the classical DQN algorithm, we perform an extensive hyperparameter search of PQCs and NNs with varying numbers of parameters. We confirm that, similar to results in the classical literature, the architectural choices and hyperparameters contribute more to the agents' success in an RL setting than the number of parameters used in the model. Finally, we show when recent separation results between classical and quantum agents for policy gradient RL can be extended to inferring optimal Q-values in restricted families of environments.
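The deep Q-learning algorithm the abstract refers to trains a function approximator (here a PQC, classically an NN) to match Bellman targets computed from a target network. As a minimal sketch of that target computation (hypothetical helper name, independent of the paper's actual implementation):

```python
import numpy as np

def q_learning_targets(rewards, next_q_values, dones, gamma=0.99):
    """Bellman targets for a batch of transitions:
    r + gamma * max_a' Q_target(s', a'), with the bootstrap term
    zeroed out on terminal transitions (dones == 1)."""
    return rewards + gamma * (1.0 - dones) * next_q_values.max(axis=1)

# Tiny worked example: two transitions, two actions each.
rewards = np.array([1.0, 1.0])
next_q = np.array([[0.5, 2.0],   # max over actions = 2.0
                   [1.5, 0.0]])  # terminal transition, bootstrap dropped
dones = np.array([0.0, 1.0])

targets = q_learning_targets(rewards, next_q, dones, gamma=0.9)
# targets = [1 + 0.9 * 2.0, 1 + 0] = [2.8, 1.0]
```

The model's parameters (circuit angles and observable weights for the PQC) are then updated by gradient descent on the squared error between its predicted Q-values and these targets.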

Deep reinforcement learning has yielded remarkable successes over the past decade, achieving superhuman levels in many seminal "AI" benchmarks such as Go, chess, and poker. In this work we explore how techniques from deep reinforcement learning can be transferred to the realm of variational quantum algorithms for a special type of reinforcement learning algorithm called Q-learning. In essence, we propose a quantum variant of deep reinforcement learning which substitutes the neural network with a quantum analogue, a parametrized quantum circuit. We show that with careful design choices such an architecture performs well on two classic benchmark tasks from the OpenAI Gym, compare our model against a neural-network-driven system on the same learning tasks, and analyse the theoretical prospects and limitations of the model. We find that the quantum learner is competitive with its classical counterpart, and prove that in some task environments one can achieve a provable exponential separation between classical and quantum Q-learners.
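To make "substituting the neural network with a parametrized quantum circuit" concrete, the following is a deliberately minimal single-qubit sketch (plain NumPy statevector simulation, not the paper's multi-qubit architecture or code): environment states are encoded by rotation angles, trainable rotations interleave with re-uploaded encodings, and Q-values are read out as scaled expectation values of an observable.

```python
import numpy as np

def ry(theta):
    """Single-qubit Y-rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def pqc_q_values(x, thetas, input_scaling, output_weights):
    """Toy single-qubit data re-uploading circuit: in each layer, encode
    the (scaled) state variable x, then apply a trainable rotation.
    Q-values are the Pauli-Z expectation scaled by one trainable
    output weight per action."""
    state = np.array([1.0, 0.0])               # start in |0>
    for theta in thetas:
        state = ry(input_scaling * x) @ state  # encoding rotation
        state = ry(theta) @ state              # variational rotation
    z_expectation = abs(state[0])**2 - abs(state[1])**2  # <Z>
    return output_weights * z_expectation      # one Q-value per action

# Example: a 2-layer circuit producing Q-values for two actions.
q = pqc_q_values(x=0.3, thetas=[0.1, -0.4], input_scaling=1.0,
                 output_weights=np.array([1.0, -1.0]))
```

In training, the angles `thetas`, the `input_scaling`, and the `output_weights` all play the role of the neural network's weights and are optimized by gradient descent on the Q-learning loss; the choice of observable and its trainable scaling is exactly the readout design the paper identifies as crucial.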

