Deep Reinforcement Learning for Quantum State Preparation with Weak Nonlinear Measurements

Riccardo Porotti1,2, Antoine Essig3, Benjamin Huard3, and Florian Marquardt1,2

1Max Planck Institute for the Science of Light, Erlangen, Germany
2Department of Physics, Friedrich-Alexander Universität Erlangen-Nürnberg, Germany
3Univ Lyon, ENS de Lyon, CNRS, Laboratoire de Physique, F-69342 Lyon, France



Quantum control has been of increasing interest in recent years, e.g. for tasks like state initialization and stabilization. Feedback-based strategies are particularly powerful, but also hard to find, due to the exponentially increased search space. Deep reinforcement learning holds great promise in this regard. It may provide new answers to difficult questions, such as whether nonlinear measurements can compensate for linear, constrained control. Here we show that reinforcement learning can successfully discover such feedback strategies, without prior knowledge. We illustrate this for state preparation in a cavity subject to quantum-non-demolition detection of photon number, with a simple linear drive as control. Fock states can be produced and stabilized at very high fidelity. It is even possible to reach superposition states, provided the measurement rates for different Fock states can be controlled as well.
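
As a worked illustration of why linear control alone is constrained, consider a standard textbook estimate (it is not taken from the paper itself). Assume the cavity starts in the vacuum and ignore decay and measurement. A linear drive can then only produce a coherent state |α⟩, whose overlap with a Fock state |n⟩ is Poissonian:

```latex
F_n(\alpha) = |\langle n | \alpha \rangle|^2
            = e^{-|\alpha|^2} \frac{|\alpha|^{2n}}{n!},
\qquad
\max_{\alpha} F_n = e^{-n} \frac{n^n}{n!}
\quad \text{at } |\alpha|^2 = n \quad (n \ge 1).
```

For n = 1 this gives a fidelity of at most 1/e ≈ 0.37, so under these assumptions any higher fidelity has to come from combining the drive with measurement back-action and feedback.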

Quantum control has become increasingly relevant in recent years, especially with the spread of quantum computers. Feedback in quantum control (i.e. using measurement outcomes to steer the dynamics) is especially difficult to design, since the number of possible control strategies grows exponentially. The system studied here can be modelled as a cavity that can be weakly measured to obtain partial information about the occupation of each energy level. To prepare and stabilize quantum states in such a cavity, we use reinforcement learning (RL), a branch of machine learning that deals with control problems. In an RL framework, the algorithm tries to maximize an objective function (in this case the fidelity) by interacting with the system through trial and error. In this work, RL manages to prepare complex superpositions of Fock states in the cavity with only very limited linear control. The RL agent also learns to stabilize quantum states against different forms of decay.
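
The interaction loop described above can be sketched in a few lines of Python. This is a minimal toy model under stated assumptions, not the authors' code: the Fock-space truncation `N`, the Gaussian weak-measurement model with width `sigma`, the function names, and the random placeholder policy are all hypothetical choices made for illustration. A real RL agent would replace the random choice of drive amplitude with a learned policy that uses the measurement record.

```python
import numpy as np

N = 30  # Fock-space truncation (assumption for this sketch)

def annihilation(dim):
    """Truncated annihilation operator: a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def displacement(alpha, dim=N):
    """Linear drive as a displacement D = exp(alpha a^dag - conj(alpha) a)."""
    a = annihilation(dim)
    # The generator G is anti-Hermitian, so H = 1j * G is Hermitian
    # and D = exp(G) = exp(-1j * H) can be built from an eigendecomposition.
    H = 1j * (alpha * a.conj().T - np.conj(alpha) * a)
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * evals)) @ evecs.conj().T

def weak_number_measurement(psi, sigma, rng):
    """Toy Gaussian weak measurement of photon number (assumed model).

    Large sigma means little information per measurement. The outcome r
    is drawn from a mixture of Gaussians centred on each photon number,
    and the state is updated with the matching diagonal Kraus operator.
    """
    n = np.arange(psi.size)
    probs = np.abs(psi) ** 2
    probs = probs / probs.sum()
    r = rng.choice(n, p=probs) + sigma * rng.normal()
    post = psi * np.exp(-((r - n) ** 2) / (4 * sigma ** 2))
    return post / np.linalg.norm(post), r

def fidelity(psi, target):
    """Overlap |<target|psi>|^2 between two pure states."""
    return float(np.abs(np.vdot(target, psi)) ** 2)

def run_episode(target_n=1, steps=10, sigma=2.0, max_drive=0.3, seed=0):
    """One trial: measure weakly, apply a drive, repeat; return the fidelity."""
    rng = np.random.default_rng(seed)
    target = np.zeros(N, dtype=complex)
    target[target_n] = 1.0
    psi = np.zeros(N, dtype=complex)
    psi[0] = 1.0  # start in the vacuum (assumption)
    for _ in range(steps):
        psi, r = weak_number_measurement(psi, sigma, rng)
        # Placeholder policy: a random real drive amplitude that ignores r.
        alpha = max_drive * rng.uniform(-1.0, 1.0)
        psi = displacement(alpha) @ psi
    return fidelity(psi, target)

if __name__ == "__main__":
    print("final fidelity with |1>:", run_episode())
```

In an RL setting the returned fidelity would serve as the reward, and the agent would be trained over many such episodes to choose `alpha` based on the outcomes `r` it has seen so far.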

