Quantum agents in the Gym: a variational quantum algorithm for deep Q-learning

Andrea Skolik1,2, Sofiene Jerbi3, and Vedran Dunjko1

1Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
2Volkswagen Data:Lab, Ungererstraße 69, 80805 Munich, Germany
3Institute for Theoretical Physics, University of Innsbruck, Technikerstr. 21a, A-6020 Innsbruck, Austria

Quantum machine learning (QML) has been identified as one of the key fields that could reap advantages from near-term quantum devices, next to optimization and quantum chemistry. Research in this area has focused primarily on variational quantum algorithms (VQAs), and several proposals to enhance supervised, unsupervised and reinforcement learning (RL) algorithms with VQAs have been put forward. Out of the three, RL is the least studied, and it is still an open question whether VQAs can be competitive with state-of-the-art classical algorithms based on neural networks (NNs), even on simple benchmark tasks. In this work, we introduce a training method for parametrized quantum circuits (PQCs) that can be used to solve RL tasks for discrete and continuous state spaces based on the deep Q-learning algorithm. We investigate which architectural choices for quantum Q-learning agents are most important for successfully solving certain types of environments by performing ablation studies for a number of different data encoding and readout strategies. We provide insight into why the performance of a VQA-based Q-learning algorithm crucially depends on the observables of the quantum model and show how to choose suitable observables based on the learning task at hand. To compare our model against the classical DQN algorithm, we perform an extensive hyperparameter search of PQCs and NNs with varying numbers of parameters. We confirm that, similar to results in the classical literature, the architectural choices and hyperparameters contribute more to the agents' success in an RL setting than the number of parameters used in the model. Finally, we show when recent separation results between classical and quantum agents for policy gradient RL can be extended to inferring optimal Q-values in restricted families of environments.

Deep reinforcement learning has yielded remarkable successes over the past decade, achieving superhuman performance on many seminal AI benchmarks such as Go, chess, and poker. In this work, we explore how techniques from deep reinforcement learning can be transferred to the realm of variational quantum algorithms for a special type of reinforcement learning algorithm called Q-learning. In essence, we propose a quantum variant of deep reinforcement learning which substitutes the neural network with a quantum analogue: a parametrized quantum circuit. We show that, with careful design choices, such an architecture performs well on two classical benchmark tasks from the OpenAI Gym, compare our model with a neural-network-driven system on the same learning tasks, and analyse the theoretical perspectives and limitations of the model. We find that the quantum learner is competitive with its classical counterpart, and prove that in some task environments one can achieve an exponential separation between classical and quantum Q-learners.
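
To make the idea of replacing the neural network with a parametrized quantum circuit more concrete, the following is a minimal sketch rather than the architecture studied in the paper. It simulates a toy single-qubit circuit in NumPy, reads out one Q-value per action as a weighted expectation value of an observable, and computes the standard Q-learning target from a second ("target") parameter set. The arctan data encoding, the choice of Z and X as readout observables, the trainable output weights, and all function names are assumptions made for illustration. Since Pauli expectation values lie in [-1, 1], the weights here simply show one way the readout could be rescaled to a task's range of returns.

```python
import numpy as np

# Pauli observables used for readout (illustrative choice, one per action).
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])


def ry(theta):
    """Single-qubit rotation about the Y axis (real-valued matrix)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def q_values(state, params, weights):
    """Toy PQC Q-function for a scalar state and two actions.

    state   : scalar observation (hypothetical 1-D environment)
    params  : array with one trainable rotation angle
    weights : per-action scaling of the expectation values (assumption)
    """
    psi = np.array([1.0, 0.0])            # start in |0>
    psi = ry(np.arctan(state)) @ psi      # data encoding (assumed form)
    psi = ry(params[0]) @ psi             # trainable layer
    # Expectation values <psi|O|psi>; psi is real, so no conjugation needed.
    expvals = np.array([psi @ Z @ psi, psi @ X @ psi])
    return np.asarray(weights) * expvals


def td_target(reward, next_state, done, target_params, target_weights, gamma=0.99):
    """Q-learning target r + gamma * max_a Q_target(s', a), or r at episode end."""
    if done:
        return reward
    next_q = q_values(next_state, target_params, target_weights)
    return reward + gamma * np.max(next_q)


if __name__ == "__main__":
    params = np.array([0.3])
    weights = np.array([10.0, 10.0])
    print("Q(s=0.5, .) =", q_values(0.5, params, weights))
    print("target      =", td_target(1.0, 0.2, False, params, weights))
```

In a full training loop, the squared difference between `td_target(...)` and the Q-value of the action taken would be minimized with respect to `params` and `weights`, as in classical deep Q-learning.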
