Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing

Sarah Brandsen; Kevin D. Stubbs; Henry D. Pfister

doi:10.22331/q-2022-01-26-633

Sarah Brandsen¹, Kevin D. Stubbs², and Henry D. Pfister²,³

¹Department of Physics, Duke University, Durham, North Carolina 27708, USA
²Department of Mathematics, Duke University, Durham, North Carolina 27708, USA
³Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708, USA

Published:	2022-01-26, volume 6, page 633
Eprint:	arXiv:2010.08588v3
Doi:	https://doi.org/10.22331/q-2022-01-26-633
Citation:	Quantum 6, 633 (2022).

Abstract

Reinforcement learning with neural networks (RLNN) has recently demonstrated great promise for many problems, including some problems in quantum information theory. In this work, we apply RLNN to quantum hypothesis testing and determine the optimal measurement strategy for distinguishing between multiple quantum states $\{ \rho_{j} \}$ while minimizing the error probability. In the case where the candidate states correspond to a quantum system with many qubit subsystems, implementing the optimal measurement on the entire system is experimentally infeasible.

We use RLNN to find locally-adaptive measurement strategies that are experimentally feasible, where only one quantum subsystem is measured in each round. We provide numerical results which demonstrate that RLNN successfully finds the optimal local approach, even for candidate states with up to 20 subsystems. We additionally demonstrate that the RLNN strategy meets or exceeds the success probability of a modified locally greedy approach in each random trial.

While the use of RLNN is highly successful for designing adaptive local measurement strategies, in general a significant gap can exist between the success probability of the optimal locally-adaptive measurement strategy and the optimal collective measurement. We build on previous work to provide a set of necessary and sufficient conditions for collective protocols to strictly outperform locally adaptive protocols. We also provide a new example which, to our knowledge, is the simplest known state set exhibiting a significant gap between local and collective protocols. This result raises interesting new questions about the gap between theoretically optimal measurement strategies and practically implementable measurement strategies.

Popular summary

Reinforcement learning with neural networks (RLNN) has recently demonstrated great promise for many problems, including some problems in quantum information theory. In this work, we apply RLNN to quantum hypothesis testing, where one is given a set of multiple quantum states $\{\rho_{j} \}$ and needs to maximize the probability of guessing the correct state by finding the optimal quantum measurement.
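As a rough illustration of the objective described above, the following sketch evaluates the success probability $\sum_j q_j \mathrm{Tr}(\Pi_j \rho_j)$ of a given measurement and compares it with the Helstrom bound, which gives the optimal value in the two-state case. The example states, the priors $q_j$, and all function names are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def success_probability(priors, states, povm):
    """P_succ = sum_j q_j Tr(Pi_j rho_j), where outcome j means 'guess state j'."""
    return float(sum(q * np.trace(E @ rho).real
                     for q, rho, E in zip(priors, states, povm)))

def helstrom_bound(q0, rho0, q1, rho1):
    """Optimal two-state success probability: (1 + ||q0 rho0 - q1 rho1||_1) / 2."""
    eigs = np.linalg.eigvalsh(q0 * rho0 - q1 * rho1)
    return 0.5 * (1.0 + np.abs(eigs).sum())

def pure_qubit(theta):
    """Density matrix of cos(theta)|0> + sin(theta)|1> (real amplitudes only)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

# Hypothetical example: two equally likely pure qubit states.
rho0, rho1 = pure_qubit(0.0), pure_qubit(np.pi / 4)

# A naive measurement in the computational basis ...
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
p_naive = success_probability([0.5, 0.5], [rho0, rho1], povm)

# ... versus the best achievable success probability for these two states.
p_opt = helstrom_bound(0.5, rho0, 0.5, rho1)

print(f"naive: {p_naive:.4f}, optimal: {p_opt:.4f}")
```

For more than two states there is no closed-form bound of this kind in general, which is part of what makes the multiple-hypothesis setting harder.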

In general, the quantum states may correspond to a large quantum system composed of multiple smaller subsystems, and the optimal measurement may require simultaneously measuring all of the quantum subsystems. However, simultaneous measurements on a large number of quantum systems are typically not experimentally feasible to implement. The main result of this work is the use of RLNN to develop experimentally practical, locally-adaptive methods for quantum hypothesis testing where only one quantum subsystem is measured in each round. We provide numerical results which demonstrate that RLNN successfully finds the optimal local approach, even for candidate states with up to 20 subsystems. Furthermore, we demonstrate that these optimal locally adaptive strategies are robust under noise.
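To make the idea of a locally adaptive protocol concrete, here is a minimal sketch in which one qubit subsystem is measured per round and the outcome updates the belief over the candidate states by Bayes' rule. The measurement in each round is chosen by a simple greedy rule (maximize the expected largest posterior after that round). This rule is a stand-in assumed for illustration: it is neither the RLNN policy nor the modified locally greedy approach from the paper. The sketch is also restricted to tensor products of real-amplitude qubit states and projective measurements, and all names are hypothetical.

```python
import numpy as np

def pure_qubit(theta):
    """Density matrix of cos(theta)|0> + sin(theta)|1> (real amplitudes only)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

def outcome_likelihoods(phi, local_states):
    """lik[o, j] = Tr(P_o rho_j) for a projective measurement rotated by phi."""
    projectors = [pure_qubit(phi), pure_qubit(phi + np.pi / 2)]
    return np.array([[np.trace(P @ rho).real for rho in local_states]
                     for P in projectors])

def greedy_angle(prior, local_states, grid):
    """Angle on the grid maximizing the expected largest posterior after one round."""
    def score(phi):
        joint = outcome_likelihoods(phi, local_states) * prior  # shape (2, m)
        return joint.max(axis=1).sum()
    return max(grid, key=score)

def greedy_success(prior, subsystems, grid):
    """Exact success probability of the greedy protocol, recursing over outcomes."""
    if not subsystems:
        return float(prior.max())  # final guess: most likely hypothesis
    local_states, rest = subsystems[0], subsystems[1:]
    phi = greedy_angle(prior, local_states, grid)
    total = 0.0
    for lik in outcome_likelihoods(phi, local_states):
        p_outcome = float(lik @ prior)
        if p_outcome > 1e-12:
            posterior = lik * prior / p_outcome  # Bayes update of the belief
            total += p_outcome * greedy_success(posterior, rest, grid)
    return total

# Hypothetical example: two hypotheses, each a product of identical qubit states.
prior = np.array([0.5, 0.5])
local = [pure_qubit(0.0), pure_qubit(np.pi / 4)]
grid = np.linspace(0.0, np.pi, 360, endpoint=False)

p1 = greedy_success(prior, [local], grid)      # one subsystem
p3 = greedy_success(prior, [local] * 3, grid)  # three subsystems, measured in turn
print(f"1 subsystem: {p1:.4f}, 3 subsystems: {p3:.4f}")
```

A learned policy would replace `greedy_angle` with a function of the current belief that accounts for the remaining rounds rather than only the next one.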

BibTeX data

@article{Brandsen2022reinforcement,
  doi = {10.22331/q-2022-01-26-633},
  url = {https://doi.org/10.22331/q-2022-01-26-633},
  title = {Reinforcement {L}earning with {N}eural {N}etworks for {Q}uantum {M}ultiple {H}ypothesis {T}esting},
  author = {Brandsen, Sarah and Stubbs, Kevin D. and Pfister, Henry D.},
  journal = {{Quantum}},
  issn = {2521-327X},
  publisher = {{Verein zur F{\"{o}}rderung des Open Access Publizierens in den Quantenwissenschaften}},
  volume = {6},
  pages = {633},
  month = jan,
  year = {2022}
}

This paper is published in Quantum under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Copyright remains with the original copyright holders such as the authors or their institutions.