Encoding strongly-correlated many-boson wavefunctions on a photonic quantum computer: application to the attractive Bose-Hubbard model

Variational quantum algorithms (VQA) are considered as some of the most promising methods to determine the properties of complex strongly correlated quantum many-body systems, especially from the perspective of devices available in the near term. In this context, the development of efficient quantum circuit ansatze to encode a many-body wavefunction is one of the keys for the success of a VQA. Great efforts have been invested to study the potential of current quantum devices to encode the eigenstates of fermionic systems, but little is known about the encoding of bosonic systems. In this work, we investigate the encoding of the ground state of the (simple but rich) attractive Bose-Hubbard model using a Continuous-Variable (CV) photonic-based quantum circuit. We introduce two different ansatz architectures and demonstrate that the proposed continuous variable quantum circuits can efficiently encode (with a fidelity higher than 99%) the strongly correlated many-boson wavefunction with just a few layers, in all many-body regimes and for different number of bosons and initial states. Beyond the study of the suitability of the ansatz to approximate the ground states of many-boson systems, we also perform initial evaluations of the use of the ansatz in a variational quantum eigensolver algorithm to find it through energy minimization. To this end we also introduce a scheme to measure the Hamiltonian energy in an experimental system, and study the effect of sampling noise.

Variational quantum algorithms (VQA) are considered as some of the most promising methods to determine the properties of complex strongly correlated quantum many-body systems, especially from the perspective of devices available in the near term. In this context, the development of efficient quantum circuit ansatze to encode a many-body wavefunction is one of the keys for the success of a VQA. Great efforts have been invested to study the potential of current quantum devices to encode the eigenstates of fermionic systems, but little is known about the encoding of bosonic systems. In this work, we investigate the encoding of the ground state of the (simple but rich) attractive Bose-Hubbard model using a Continuous-Variable (CV) photonic-based quantum circuit. We introduce two different ansatz architectures and demonstrate that the proposed continuous variable quantum circuits can accurately encode (with a fidelity higher than 99%) the strongly correlated many-boson wavefunction with just a few layers, in all many-body regimes and for different number of bosons and initial states. Beyond the study of the suitability of the ansatz to approximate Saad Yalouz: yalouzsaad@gmail.com Bruno Senjean: bruno.senjean@umontpellier.fr Filippo Miatto: filippo@xanadu.ai Vedran Dunjko: v.dunjko@liacs.leidenuniv.nl the ground states of many-boson systems, we also perform initial evaluations of the use of the ansatz in a variational quantum eigensolver algorithm to find it through energy minimization. To this end we also introduce a scheme to measure the Hamiltonian energy in an experimental system, and study the effect of sampling noise.

Introduction
In recent years, great efforts have been deployed to study the potential of noisy intermediate-scale quantum (NISQ) computers to tackle tasks that are hard to treat for classical computers. One of the most emblematic research lines in the field is the development of near-term algorithms to determine the properties of complex quantum many-body systems. Variational quantum algorithms (VQA) have shown to be very efficient and promising for encoding complex wavefunctions of various kinds of systems. Applications of these algorithms range from molecular systems in quantum chemistry [1][2][3][4][5][6][7][8][9][10][11], interacting-spins and electrons models from condensed matter [12][13][14] or vibrational Hamiltonians [15], to cite but a few. In practice, the success of a VQA in encoding a many-body wavefunction depends on the properties of the associated quantum circuit, or "ansatz", that is used. Ideally, to satisfy the needs and limitations of the NISQ era, an ansatz should be expressive (i.e. for example, having the ability to generate accurate ground-state wavefunc-tions or energies), resource efficient (i.e. involving as few gates/layers as possible) and trainable (i.e. inducing optimization landscapes amenable to classical optimization algorithms). In the case of qubit-based quantum circuits, the above properties of ansatze for VQAs have been studied extensively: examples are the unitary coupledcluster ansatz [16] introduced for electronic structure problems, adaptive methods designed to produce expressive ansatze at lower complexity cost [17,18], and different strategies to solve condensed matter models with strong quantum correlation [12,19]. The great majority of these works has been essentially devoted to the case of many-fermion systems.
Motivated by the development of VQAs, we investigate in this work the use of digital photonicbased continuous variable (CV) quantum computers [73][74][75][76] to encode strongly correlated many-boson wavefunctions. In practice, CV devices are digital quantum computers that can manipulate information by entangling and measuring quantum states of light using various types of quantum gates. The bosonic nature of such photonic devices leads to a one-to-one mapping between the photonic modes of a CV computer and the bosonic modes of other many-boson sys-tems. Such a mapping was first employed in Ref. [48] to generate molecular vibronic spectra, where the N photons in M optical modes can simulate N molecular vibrational quanta in M vibrational modes. Bosonic modes of CV devices were also used to efficiently encode the field degrees of freedom in a wavelet basis in quantum field theories [77,78]. For variational algorithms, the CV setting is then particularly wellsuited to explore new types of quantum circuit ansatz to encode strongly correlated many-boson wavefunctions. The rich complexity of such type of wavefunctions can be qualitatively reproduced by studying the attractive Bose-Hubbard (BH) model. The ground state of this model has attracted a lot of attention recently due to its rich transitional behavior that evolves between the so-called Schrödinger cat-like states to superfluid regimes [79][80][81][82][83][84][85]. Furthermore, the BH model has been successfully employed to describe the physics of ultracold atoms and molecules in optical lattices [86][87][88][89][90][91], quantum cooling protocols [44], the evolution of vibrational bosons in α-helices proteins [92,93] and the exotic phases of helium-4 [94,95].
In this paper, we study the attractive BH model as an interesting system and focus on two aspects of the aforementioned critical properties of ansatze for VQA applications: the expressibility and the resource efficiency to encode the ground-state wavefunction of the system on CV devices. Note that, in analogy with the qubit cases, the implementation costs of a photonic circuit depend on numerous factors. Obviously, they depend on depth and total gate number, but not all gates are equally costly (e.g. linear optical elements are much cheaper than gates involving nonlinear terms), nor they introduce an equal amount of noise and losses. We are thus facing a multiobjective optimization problem as, for instance, using cheaper gates may come at the expense of greater depth to achieve the same theoretical capacity to represent ground states accurately. Further, in the light of near-term implementations, it makes additional sense to study ansatz tailored to the actual experiments performed in the labs or to those available in the near future. To elucidate the issues of trade-offs between various resource requirements and expressibility, we propose and analyze two CV quantum circuit architectures tailored to near-future experimental set-ups that can make use of single-mode non-linear phase shifts such as the Kerr gate: one designed to minimize the total gate count and the other designed to minimize the total circuit depth. By performing VQAs with the state-infidelity as a cost function, we show that the many-boson ground state of the BH model can be successfully encoded on both CV quantum circuits with a high fidelity and a relatively small number of gates and parameters. In practice, the state-infidelity cannot be computed for large systems (as it requires the knowledge of the exact wavefunction). A realistic experiment on a true quantum device would require the ability to measure on a CV circuit the ground-state energy instead. Thus, we propose a photon-counting protocol to measure the expectation value of the BH Hamiltonian in order to simulate a variational quantum eigensolver (VQE) algorithm, as it would be done in a true experiment, with and without sampling noise.
The paper is organized as follows. After a brief introduction to the attractive BH Hamiltonian and the exact diagonalization method in sections 2.1 and 2.2, we illustrate the exotic structural transition of the BH ground state on three small-sized networks in section 2.3. After introducing the context of VQAs in section 3.1, we present the architectures of our proposed ansatze in section 3.2. Numerical investigations of their encoding ability are performed in section 4. We extend our study to ideal and realistic simulations of a VQE algorithm in section 5. Finally, conclusions and perspectives are provided in section 6.

Bose-Hubbard model
In this first section, we present a general introduction of the attractive BH model. Our objective is to give an overview of the properties of the model and to illustrate the rich behaviors that arise from the increase of the many-body interaction. We will focus on three small-sized BH models, namely the BH dimer and the three-and four-site periodic chains. These systems will be used later in this paper as test-bed for the quantum ansatze designed herein to encode many-boson wavefunctions on photonic devices.

Hamiltonian and regimes of interaction
The attractive BH model describes the physics of N B bosons hopping between N S sites of a network with local attractive many-body interactions. The associated Hamiltonian operator readŝ where J > 0 is the hopping parameter describing the delocalization of the bosons throughout the connected nodes of the BH network, and U > 0 is the many-body interaction term corresponding to the local boson-boson interaction that occurs when at least two particles occupy the same site.
In this work, J is the energy unit (i.e. we set J ≡ 1 in every simulation).
To characterize the competition between the many-body interactions and the hopping terms in the system, a dimensionless parameter is commonly introduced. Typically, Λ 1 represents a regime of weak interaction (the so-called "Super-fluid regime") in which the physics of the system is governed by the hopping terms making the bosons able to efficiently delocalize over the network. The case Λ ∼ 1 corresponds to an intermediate quantum regime where hoppings and local many-particle interactions compete together and where phase transitions usually occur. The last case, Λ 1, corresponds to the regime where the many-body interaction dominates (the so-called "Fock regime"). In the case of attractive interactions, the resulting state tends to localize the bosons on local sites of the system (giving rise to macroscopic-cat states as illustrated later in this work).

Ground state reference: exact diagonalization
To investigate the exotic phases encoded in the attractive BH model, we compute the ground state of the HamiltonianĤ exactly by solving the time-independent Schrödinger equation where |Ψ 0 is the ground state of the system with the associated energy E 0 . We solve Eq. (3) by 2  3  4  5  8  16   2  3  4  5  6  9  17  3  6  10  15  21  45  153  4  10  20  35  56  165  969  8  36  120 330 792 6435 245157 which represents the number of ways of distributing N B bosons on N S sites. See Tab. 1, for an overview of the scaling of the DĤ as a function of N S and N B .
While exact diagonalization provides access to the exact solution of the problem, the exponential scaling of the Fock space dimension with respect to the number of sites and bosons limits the use of ED to very small systems in practice.

Illustration of the regimes of interaction: exact diagonalization on small-sized systems
As the many-body interaction increases, the ground state of the attractive BH model exhibits rich transitional behaviors. Such transitions have already been extensively discussed in several recent works (see for example [82,83,85]). Here, we briefly illustrate some of these exotic phases in three systems (that will be treated later using quantum ansatze developed on a CV device), namely the BH dimer, the three-site and the foursite periodic chain. To do so, we employ the Inverse Participation Ratio (IPR) [98][99][100][101] and the von Neumann entropy [102][103][104] as two indicators to follow the expansion of |Ψ 0 in the configuration basis when Λ increases. Applied to the Fock state basis, the IPR estimates the number of different bosonic configurations forming |Ψ 0 by linear combination, and reads where |n 1 , . . . , n N S is a Fock state with n q bosons on site q (and q n q = N B ). Within this definition, a state |Ψ 0 localized on a single state of the Fock-state basis has an IPR of 1. By contrast, the IPR of a state uniformly delocalized over all the Fock states is equal to DĤ. The von Neumann entropy is a measure used to quantify the entanglement (non-local quantum correlation) existing between different sub-parts of a quantum system. In many-body systems, it quantifies the degree of entanglement and correlation arising between the subsystem p and all the other modes (also commonly called "bi-partite entanglement"). It reads where λ k are the eigenvalues of the reduced density matrix of mode p, defined by taking the partial trace of the fullsystem density operatorρ over the Fock-state basis of all the other modes in the systems,σ p = Tr q =p [ρ] (note that in our case, the full density operator isρ = |Ψ 0 Ψ 0 | as one focuses on the ground state). For the sake of simplicity, we will only focus on a single-site von-Neumann entropy as defined in Eq. (6).
In Figure 1, we show the evolution of the ground state properties of the BH dimer as a function of the interaction parameter Λ (for different fixed number of bosons N B ). When Λ ∼ 0, the system is in a superfluid phase: every bosons propagate independently on the dimer and do no interact together. The independent-boson picture leads to a product-state solution for the ground state as |Ψ 0 = , 0 (with |0, 0 being the vacuum state of the dimer). Then, increasing Λ leads to a progressive restructuring of the ground state as readily seen by the increasing IPR. This shows a spreading of the state over the whole Fock basis that is also accompanied by an increasing single-site von-Neumann entropy. Both quantities then reach a maximum value for Λ ≈ 3. The large variations of these two measures are not without significance and reveal the appearance of quantum correlations and entanglement in the system. Indeed, in the region 2 ≤ Λ ≤ 4 the shape of the ground state gets more complex as it involves a larger number of bosonic configurations (actually almost all the Fock states are involved), and locally the bosonic modes get strongly entangled. Finally, as Λ increases the structure of the ground state loses in complexity and progressively tends to the well-known "cat-like" form [82,83] representing a superposition of two macroscopically occupied states (i.e. |Ψ 0 = (|N B , 0 + |0, N B )/ √ 2). The IPR reaches a value of 2, thus indicating the im- portance of these two fully localized bosonic configurations. The entanglement entropy S of the modes decreases as less bosonic configurations get correlated. To better picture the full restructuring process occurring during this exotic quantum transition, we represent the decomposition of the BH dimer's ground state in the Fock basis as a function of Λ in the lower panel in Fig. 1.
Turning to larger systems, similar exotic quantum transitions can be observed. In figure 2 we see that the IPR and the single-site von-Neumann entropy also reach their maximal value around Λ ≈ 5 for the three-and the four-site periodic chains. In this regime, the complexity of the ground state significantly grows with the total number of sites N S but also with the total number of bosons N B . As an illustration, there are around IPR max ≈ 42 bosonic configurations involved in the structure of the ground state at the maximum value of the IPR (Λ ≈ 4.8) for the three-site BH model with N B = 8. This number gets even higher for the four-site model with N B = 5 and Λ ≈ 4 with a total number of IPR max ≈ 55 bosonic configurations. When Λ goes beyond this region of high complexity, we retrieve the so-called cat-states as indicated by the number of macroscopically-occupied states (IPR = 3 and 4 for the three and foursite systems, respectively, e.g. for the three-site model we have

Photonic-gate-based ansatze
In the following, we introduce the different theoretical ingredients to encode the BH ground state on a photonic-gate-based device.

Variational Quantum Algorithms
For the study of many-body systems in the NISQ era, VQAs represent state-of-the-art methods that have been proficiently used to determine ground (and excited) states' properties of various systems (see for example Refs. [12,[16][17][18][19] and the recent review in Ref. [105]). The protocol of a VQA usually follows the steps: i) map the problem onto the quantum device, ii) use a quantum circuit as an "ansatz" to generate a trial state that encodes the wavefunction of the many-body system, iii) evaluate a given cost function by repeated measurements, iv) optimize the quantum gates parameters of the ansatz circuit to minimize the cost function governed by a variational principle. In practice, one applies a unitary transfor-mationÛ ( θ) with tunable gate parameters (formally represented by the vector θ) to transform an initial state |Ψ ini into a trial state Often |Ψ ini is chosen based on some physical motivations which depends on the targeted problem (e.g. such as the Hartree-Fock state in electronic structure problems). If one seeks the ground state of a system, the cost function is usually the energy of the trial state defined as where E 0 is the exact ground-state energy which represents a natural lower bound for the energy of the trial state |Ψ( θ) . In a real VQA experiment, the expectation value E( θ) is estimated by statistical sampling. The energy estimator will have an uncertainty based on the number of state preparations and measurements used. This energy is then passed to an external classical optimization routine which provides a new set of parameters θ used to update the quantum circuit. The whole optimization process is repeated until global convergence is reached (i.e. until E( θ) is minimized).
In the NISQ era, the overall performance of a VQA in encoding a many-body wavefunction is highly dependent on the properties of the ansatz that is considered (as mentioned in the introduction: expressibility, resource efficiency and trainability). In our study of CV ansatz, our attention turns towards the expressibility and resource efficiency, while the investigation of the trainability is left for future work. However, these properties are not independent, and the final performance should include the specification of the optimization as it may heavily influence the expressibility and resource efficiency of the VQA to achieve a given precision of the targeted state or energy. For this reason, studying all three properties is very difficult.
In the following, we introduce two different ansatze architectures and we apply them to the three BH systems introduced before. The different architectures address (to some extent) the question of expressibility and resource efficiency. For this, we characterize the capability of the associated circuit unitariesÛ ( θ) (see Eq. (8)) to accurately map different initial states |Ψ ini to the exact ground state |Ψ 0 of the three systems. Note that only a few works have tried to investigate the expressibility of ansatze for the case of many-body systems (always with a special emphasis for qubit devices and many-electron systems, e.g. the study of the unitary coupledcluster-ansatz in [106]). As a results, a lot of attention has been directed to many-fermion systems (see for example [12,[17][18][19]72,107,108]), but little work has been done on quantum ansatze to encode many-boson wavefunctions (for a notable exception, see the implementation of the unitary vibrational-coupled-cluster ansatz introduced in Ref. [15]).

Photonic quantum circuits as ansatze for many-boson wavefunction
In contrast to most of the studies focusing on qubit devices, we tackle the problem of manybosons systems using CV circuits. The choice of CV quantum computers is rather intuitive as there exists a straightforward one-to-one correspondence between the modes of a bosonic system and the photonic modes of such devices. In the case of the attractive BH system, each photonic mode p of the quantum circuit can be used to encode the bosonic population of the associated site p, as illustrated in Fig. 3. Starting from this advantageous property, we will introduce two quantum ansatze based on CV photonic gates to encode many-boson wavefunctions for VQAs.

Utilized CV quantum gates
Concerning the fundamental properties of both ansatze, we know that the BH Hamiltonian in Eq. (1) is boson-number-preserving by construction (i.e. N B is constant). Thus, in order to preserve this property, we choose for both ansatze to consider only three types of photon-numberpreserving gates, namely: beam-splitters, rotation gates and Kerr gates. First, the beamsplitter gate is a two-mode gate that affects two inputs modes simultaneously (labeled by p and q) with an unitary defined as where θ and φ are respectively the transmittivity and phase angle of the gate. The action of the beam-splitter's unitary on the associated ladder Beamsplitter transmittivity (rad.) operators (a p /a † p and a q /a † q ) is given by To have a better illustration of the effects of the beam-splitter gate on a many-photon wavefunction, figure 4 shows the evolution of the density of probability of a two-mode state B(θ, 0) |Ψ ini (with |Ψ ini = |8, 0 or |4, 4 ) in the Fock basis as a function of the transmittivity parameter θ. As shown in this figure, the main effect of the beamsplitter is to delocalize the photons from a mode to another one creating various patterns in the Fock basis. This strongly depends on the initial state considered. The phase rotation gate acts on a Fock state as R p (θ) |n p = e iθnp |n p with the unitary operator Finally, the Kerr gate is a non-Gaussian singlemode gate whose unitary reads For a given modes p, this leads to K p (θ) |n p = e iθn 2 p |n p , thus generating a non-linear phase term that depends on the squared number of bosons n 2 p = (a † p a p ) 2 present in the mode p. Although the implementation of Kerr gates as an optical component is a true challenge on current devices, the incessant development of new techniques and non-linear materials tends to suggest that reliable ways of implementing these gates will be available in the near future [109][110][111]. For the purpose of our theoretical work, we assumed the possibility to implement these gates without any practical constraints. This assumption was motivated by our analysis which revealed the key role played by the Kerr interaction in the performance of our ansatze (as it allows us to go beyond Gaussian transformations).
Note that the choice of the three types of gates presented here is not only motivated by their capacity to preserve a same number of bosons N B , but also by their physical action on a manyboson wavefunctions. Indeed, the role of a BS gate is essentially to explore the space of manyboson configurations for two connected modes. On another note, the addition of Kerr (or rotation) gates in a circuit after a BS gate makes it possible to non-linearly affect the phase of each many-bosons configurations of the wavefunction (see Appendix B for an illustration of this with the minimal ansatz). These behaviors were the major motivations for the design of the ansatze introduced in the next section, that can therefore be regarded as physically-motivated as well as hardware-efficient.

Minimal beam-splitter-Kerr ansatz
Let us now introduce the first ansatz developed to encode the ground state of the attractive BH model: the so-called minimal beam-splitter-Kerr (BS-Kerr) ansatz. This circuit architecture was designed to produce a high fidelity encoding with as few quantum gates and parameters to optimize as possible, based on numerical investigations. Considering these constraints, the most "efficient" architecture we found (in terms of the total number of gates required to encode the ground state with sufficient accuracy) resulted in layers of beam-splitter and Kerr gates as shown in Fig. 5 for two consecutive layers. Note that the phase angle parameters of the beam-splitters are always set to φ = 0 in this ansatz. Indeed, we did not see any improvement when considering the latter in our numerical simulations (not shown). Discarding these angles allows to reduce the complexity of the classical optimization with no impact on the final accuracy.
As readily seen in Fig. 5, each layer is defined as a series of N S −1 beam-splitter gates in stairs (up or down) followed by a series of N S Kerr gates applied on each mode, thus leading to a gate count of N L (2N S − 1) (and the same number of parameters to optimize) where N L is the number of layers. From a layer to another one, we alternate the staircase structure to ensure that each mode can have the opportunity to exchange photons (via the beam-splitters) with all the other modes in the circuit.
Note that removing the Kerr gates from the ansatz leads to very poor results (not shown), showing that a passive Gaussian transformation is insufficient to encode the ground state of the attractive BH model despite we considered non-Gaussian inputs.

Interferometer-Kerr ansatz
The BS-Kerr ansatz was designed to encode the ground state of the attractive BH model with a minimal number of gates and parameters, which is advantageous for numerical simulations. However, in experimental realizations one may want to minimize the number of Kerr layers. Therefore, we considered an alternative ansatz com- At first sight, this ansatz appears less efficient than the BS-Kerr ansatz in terms of number of gates and parameters per layer. However, we can expect the additional expressibility brought about by the complete interferometers to lead to a reduction in the number of layers required to encode the ground state. In other words, the interferometer-Kerr ansatz could lead to a reduction in the total number of Kerr gates at the expense of additional beam-splitter and rotation gates, which are however much simpler and cheaper. This hypothesis is numerically investigated in Sec. 4.2.
The complexity of the ansatz can be reduced by getting rid of the rotation gates (physically irrelevant for most applications [112]) and/or setting the phase angles of all beam-splitters to φ = 0 (and keeping only the transmittivity angles) as already done for the BS-Kerr ansatz.

Numerical investigation of the ansatze capabilities
In this section, we present a numerical investigation of the expressibility (and resource efficiency) for the two ansatze introduced previously applied to the BH dimer, the three-site and the four-site periodic chains. The simulations of the photonic circuits were realized using the Python package Strawberryfields [113]. For the optimization of the quantum circuits parameters, the L-BFGS-B method was used as encoded in the optimParallel package [114]. The circuit's parameters were always initialized with a small random amplitude drawn in [−0.05, +0.05], and the maximum number of iterations was set to 20000. An ED code has been implemented in Python to compute the exact ground state by exact diagonalization. Further we also provide a sample code with the implementations of all the components of our method in [115] for interested readers.

Capacity of the BS-Kerr ansatz
To demonstrate and characterize the capacity of the different ansatze to encode the ground state of BH networks, we estimated the fidelity of the trial state with respect to the exact ground state of the system |Ψ 0 , Assuming that the classical optimization process does not end up in local minima, an ansatz with a strong encoding capability will return a fidelity F close to unity. We numerically check this property by optimizing the quantum circuit's parameters to minimize a complementary measure that is the infidelity which is naturally lower-bounded by zero. Note that no mention has been made so far about the nature of the initial state |Ψ ini , for which we considered two choices: a single-mode and a twomode states, detailed later in section 4.1.2.

Illustration of the infidelity optimisation
As an introduction of our study, in this first subsection we illustrate the evolution of the variational minimization of the infidelity by the mean of representative single runs of optimization. We gather in Fig. 7 several examples of optimization runs showing the minimization of the measure I as a function of the number of function evaluations called in the L-BFGS-B algorithm (as provided by the optimParallel package [114]), for dif- ferent values of Λ to span the different correlation regimes. Each row of Fig. 7 (from top to bottom) shows results for the dimer, the three and four-site BH model respectively, for N B = 2 (left panels) and N B = 4 (right panels) bosons. N L = 5 layers of the BS-Kerr ansatze were considered for all optimization runs, whatever the size of the system. As a remark, note that the optimization runs presented here are essentially illustrative. The shape of the optimization curves may actually differ from a run to another as small initial random values of circuit parameters are considered in practice. We checked the effects of these fluctuations by realizing multiple optimizations with random initial parameters and we noticed negligible changes in the final results obtained, hence the tendencies shown in Fig. 7 can be regarded as sufficiently general.
As readily seen in Fig. 7, every optimization curve crosses the limit I = 1% (defined as the marker of good expressibility for the ansatze) for all values of Λ but Λ = 10 on the last panel. The left panels (N B = 2) manifest very low converged infidelities, and thus a very accurate encoding of the ground state of the system. The number of function evaluations required to fulfill the condition I = 1% (and to reach final convergence) ranges from around 10 to 100 in these cases. Conversely, increasing the number of bosons in the systems leads to an increase of the number of function evaluations, as illustrated on the right panels (N B = 4). In this case, the number of evaluations required to satisfy the condition I = 1% ranges from around 20 to 300. To reach final convergence (i.e. when the selected optimizer cannot minimize the cost function anymore) the number of evaluations also increases, ranging from 50 to 2000 approximately (all systems considered). The number of function evaluations increases with the number of bosons N B and sites N S , i.e. with the size of the Fock space DĤ (as shown in table 1) that is related to the complexity of the encoding problem. As a remark, note that in practice the final number of function evaluations N f in f eval differs from the final number of iterations N f in iter realized by the L-BFGS-B algorithm, as one iteration of the algorithm requires multiple function evaluations. We choose an upper limit of 20000 iterations to ensure convergence of the algorithm whatever the case considered. And as illustrated in Fig. 7, this limit is never reached as N f in iter ≤ N f in f eval 20000 (a feature that we also checked in many other test simulations). In Fig. 7, we see that a circuit with a depth of N L = 5 layers of the BS-Kerr ansatz is sufficient to ensure an accurate encoding of all ex-amples considered, except for the strongly interacting case of the largest system N S = 4, N B = 4 and Λ = 10. In this case, the converged infidelity is I ≈ 7% (i.e. F ≈ 93%). A much better fidelity can be reached by adding more layers to the quantum circuit, thus enhancing the expressibility of the ansatz. Typically, changing N L from 5 to 8 (not shown here), makes all the examples below the limit I = 1%.
In spite of these promising results, one can still suspect the presence of barren plateaus that appear for large systems in VQAs when random initial parameters are used. For qubit-based devices, barren plateaus denote the exponentially vanishing cost function gradients with the number of qubits [116][117][118][119][120][121]. Note that the exact conditions for the appearance of barren plateaus on qubit-devices remains unclear, as several recent works have shown that they can also arise when physically-inspired [122,123] and beyond 2-design ansatze [124] are considered. It is an open question of how and when barren plateaus may appear in CV ansatze, although recent studies indicate that CV systems may exhibit barren plateaus where the cost gradients vanish exponentially with the number of system modes (instead of the number of qubits) [125,126]. In this work, we did not see any optimization problem that could indicate the presence of such phenomenon, but it remains unclear for larger systems. Fortunately, due to the analogy between the barren plateaus phenomenon on discrete and continuous variable systems [125], one can expect that strategies used to bypass the barren plateaus on qubit-based devices [122,127,128] are also transferable to CV systems. Such an involved study is out of the scope of the manuscript and is left for future work.
In the next sections, we will present the results of our numerical study of the expressibility (and the resource efficiency) of the ansatze as a function of the number of bosons N B and sites N S . Our approach was inspired by Ref. [12]. Namely, we realize a series of simulations where we progressively increase the number of layers N L in the quantum circuits and perform each time one optimization run of infidelity. The choice of single runs was motivated by many numerical simulations which shown that very similar results are always obtained in terms of encoding capacity and resource efficiency. In practice, the more the number of layers N L increases, the more the expressibility of the ansatz improves. We continue to increase the number of layers N L until the infidelity of the ground state fulfill the condition I ≤ 1% (i.e. F ≥ 99%).

BH model with N S = 2 sites
Focusing first on the BS-Kerr ansatz for the BH dimer, the minimum number of layers detected to reach F ≥ 99% for N B = 1 to N B = 8 is shown in Fig. 8 for Λ values corresponding to different regimes of correlation (as marked by the vertical red-dashed lines in Fig. 1). Two different initial states were considered: a single-mode one given by |Ψ ini = |N B , 0 and a two-mode one given by . As a first general remark, Fig. 8 shows that, with only a few layers of circuit, a fidelity higher than 99% is always reached whatever the case considered (i.e. the correlation strength Λ, the number of bosons N B and the choice of initial state). These numerical results give a first demonstration that short depth CV circuits can accurately encode many-boson wavefunctions even in the strongly correlated regime, at least in the case of system sizes we have considered. Note that to ensure an accurate encoding of the wavefunction, the larger N B is, the larger the number of layers N L needs to be. This is expected as the complexity of the ground state grows with the size of the Fock space DĤ, which is linear in N B for the BH dimer (as shown in Tab. 1). Interestingly, the trend of the curves ensuring an efficient encoding is globally sub-linear in N B whatever the initial states considered. Roughly, we see here that choosing a number of layers such as N L ≈ 0.7 × N B seems to guarantee a very high fidelity in all cases considered herein. The maximum number of Kerr gates involved here reaches a maximum of N Kerr = 10 (as this number scales like N Kerr = N S N L ).
On another note, an interesting fact arises in the weakly correlated case Λ = 0.01 when the single-mode initial state is used (dark blue curve in the upper panel). In this case, a single layer of the BS-Kerr ansatz is sufficient to reach a fidelity higher than 99%, even when the number of bosons increases. This property is true only for the single-mode initial state. We rational- ize this result in Appendix A, where we show that, starting from such a state, an exact encoding (F = 100%) of the BH ground state in the non-interacting (superfluid) regime (Λ → 0) can be realized using only beam-splitters (this demonstration is realized for the three networks considered in our study). Note that even if no Kerr gates are required in this particular limit, an exact classical simulation of Fock states propagating in a passive interferometer has exponential complexity, as shown by seminal papers on the computational complexity of linear optical quantum computing [129,130]. Indeed, even if the interferometer itself can be fully described by a 2N -dimensional orthogonal matrix, (which is tractable), the action of an interferometer on an arbitrary input state requires a Fock space description of the gate (as opposed to a Gaussian input state, not considered here).
As an addition, in Fig. 9 we illustrate the evolution of the structural properties of a trial state (built with the BS-Kerr ansatz) as a function of Λ. In this case, we focus on a system with N B = 8 bosons and we use a quantum circuit of N L = 6 layers with |Ψ ini = |4, 4 as an initial state. The results obtained are compared to the exact ones generated via ED. On the upper panel of Fig. 9, we show the evolution of the IPR and the entanglement entropy S of the trial state (red dots) compared to exact results (black curve) as a function of Λ. Note that each of the trial states obtained in the simulations have a fidelity F ≥ 99%. The lower panel shows the evolution of the probability of occupation in the Fock basis for the trial state (green bars) compared to the exact ground state (black bars) for four representative values of interaction parameter Λ = 0.01, 3, 5 and 10. The results presented here highlight the structural transformation of the dimer ground state as Λ increases, starting from the superfluid regime (Λ = 0.01) and going through the maximum spreading structure (Λ = 3 for a maximal IPR value) to finally reach the famous "cat-like" superposition (with Λ = 5 to 10 and the IPR tending to 2). For all values of Λ considered, the IPR, the entanglement entropy S and the probability of occupation of the trial states are in very good agreement with the exact results, thus showing that the expressibility of our ansatz is strong enough to reproduce the exotic features of the exact ground state in different correlation regimes. Similar illustrations of the BH ground state restructuring will be given in the following of the paper for the three and four-site BH system. (approximately 6 times larger than the number of configurations on the dimer for the same number of bosons). In spite of this, we observed that the BS-Kerr ansatz performs very well as shown in Fig. 10. In this case, we retrieve some differences between the results obtained with the two initial states. The single-mode initial state seems to perform better in term of resource efficiency as the latter requires globally less layers to encode the ground state with a fidelity higher than 99%. Typically, we see in Fig. 10 that the number of layers detected increases up to N L = 8 for a single-mode initial state against a maximum of N L = 10 layers for the two-mode one. The associated number of Kerr gate being 24 and 30 respectively. Still, the results obtained here reveal the good expressibility of the minimal ansatz which is able to accurately transform both different initial states into very good approximations of the exact ground state. As readily seen in Fig. 10, choosing a circuit depth with N L ∼ 1.25 × N B guaranties an efficient encoding of the ground state in all cases considered here. As a last remark for the three-site system, note again that only one layer is required for Λ = 0.01 when this initial state is used (dark blue curve). We refer the interested reader to Appendix. A where we give a mathematical demonstration of this feature.
The complexity grows even stronger for the four-site BH network. We follow the exact same strategy than for the previous models by considering |Ψ ini = |N B , 0, 0, 0 as the single-mode initial state and |Ψ ini = | N B 2 , 0, N B 2 , 0 as the twomode one (n.b. for odd total number of bosons we took | N B +1 2 , 0, N B +1 2 − 1, 0 instead), as well as Λ values corresponding to different regimes of correlation marked by the vertical red-dashed lines in the right panel Fig. 2. For this system, the dimension of the Fock space explodes very quickly and fixing N B = 5 already leads to a number of bosonic configurations of DĤ = 56 that is larger than the size of the Fock space of the three-site model with N B = 8. In our simulations, we chose to limit ourselves to a maximal number of bosons of N B = 5 due to the numerical complexity of the classical simulation of our VQA. The performance of the BS-Kerr ansatz for N B = 1 to 5 is shown in Fig. 11. In the full range of correlation regimes, the photonic circuit is again able to encode the ground state with high fidelity for a circuit depth of N L ≈ 2N B . Here again we see that the single-mode initial state seems to perform better compared to the two-mode case. The associated maximum number of layers being N L = 8 for the latter (with 32 Kerr gates) compared to N L = 10 for a two-mode initial state (with 40 Kerr gates). We also retrieve on Fig. 11 the case of an exact encoding with one layer for the initial single-mode state for weakly correlated regime Λ = 0.01.
As an addition, Figs. 12 and 13 illustrate the structural transition of the BH ground state for the three and four-site BH networks, respectively. Similarly to Fig. 9, the upper part of the plots show the evolution of the IPR and the entanglement entropy of trial states (red dots) compared to exact ground state (black curves) as a function of Λ. Note that, all the trial states generated here present a fidelity of F ≥ 99%. The lower panels show the density of probability of the trial states (green bar) compared to the exact ground state (black bars). Just as in the BH dimer, the IPR, the entanglement entropy S and the density of probability of the resulting trial states are in very good agreement with the exact ground state.
As a summary of the results presented so far, a remarkable property is that very high fidelity encoding can be performed with the minimal ansatz for the tree system, whatever the regimes of many-body interactions. Globally, our results suggest that, for a fixed system size, the number of layers required to ensure an encoding of F ≥ 99% scales linearly with the number of bosons. Furthermore, our simulations also demonstrate that high fidelity may be obtained even if we consider different initial states |Ψ ini that are simple Fock states (and not complex quantum superpositions of many Fock states). Hence, the minimal ansatz is expressive enough to accurately transform such single Fock states into complex linear combinations of DĤ configurations encoding strong quantum correlations.

Effects of the system size N S
Let us now investigate how the increase in the system size can affect the expressibility and the resource efficiency of the minimal BS-Kerr ansatz.
In analogy to the previous study, we evaluate the number of layers N L required to satisfy F = 99%, but as a function of the total number of sites N S rather than the number of bosons N B . The results are presented in Fig. 14 for three values of interaction parameter (Λ = 0.01, 5 and 10) and a number of bosons N B = 2, 3 and 4. As readily seen in Fig. 14, the number of layers N L required to ensure a high fidelity scales linearly with the number of sites of the BH system (whatever the number of bosons N B is). Focusing on the strong many-body interaction regime (i.e. Λ = 10), a scaling of N L ∼ 0.5N S is obtained for N B = 2 bosons, N L ∼ 1.5N S for N B = 3 bosons, and N L ∼ 2N S for N B = 4 bosons. The prefactor in the linear scaling seems to slightly increase with the total number of bosons N B (a trend already observed Figs. 9, 12 and 13). Note that such linear behaviors strongly contrast with the non-linear increase of the number of configurations (i.e. the size DĤ of the Fock space) necessary to represent the exact ground-state of the BH model. Indeed, as an illustration, note that in the case of N B = 4 bosons the number of configurations involved starts from DĤ = 5 for the dimer (N S = 2) to reach DĤ = 126 for the BH chain with N S = 6 sites (which represents the maximum number of Fock states generated in our study). On another note about the weakly correlated regime (dark blue curves in Fig. 14), we see here again that a single layer is sufficient to ensure a high fidelity encoding of the BH ground state, for all the number of sites N S and number of bosons N B considered (see Appendix A for a mathematical proof). Finally, to summarize our observations realized so far, our results from the previous sections combined with the ones presented here reveal that the number of layer N L required to accurately encode the ground state of the BH model presents a linear trend with respect to both N B and N S . This being verified at least for the small to medium-sized BH systems considered herein.

Capacity of the interferometer-Kerr ansatz
Let us now turn to the interferometer-Kerr ansatz described in Sec. 3.2.3. By using an interferometer composed of more beam-splitter and rotation gates, the interferometer-Kerr ansatz is expected to perform as efficiently as the BS-Kerr ansatz but with a reduced number of layers, hence a reduced number of Kerr gates. As readily seen in the top panel of Fig. 15, only N L = 6 layers are necessary to encode more than 99% of the ground state of the three-site model for N B = 8, compared to the 10 layers required by the BS-Kerr ansatz (see Fig. 10). Note that reducing the number of layers in the interferometer-Kerr ansatz does not necessarily reduce the total number of gates compared to the BS-Kerr ansatz. Indeed, N L = 6 layers in the interferometer-Kerr ansatz is equivalent to 54 gates, while N L = 10 for the BS-Kerr ansatz gives 50 gates. However, the number of Kerr gates is always directly proportional to the number of layers and the num- ber of modes as N Kerr = N L × N S such that the number of Kerr gates is almost reduced by half when using the interferometer-Kerr ansatz (designed for this exact purpose). Note that the phase angles of the beam-splitter gates were not set to 0 compared to the BS-Kerr ansatz. For a more fair comparison, we set them to 0 in the second panel of Fig. 15. One can notice a slight increase in the number of layers, especially when the correlation strength Λ increases, but only up to N L = 7. Hence, for the three-site model, the interferometer-Kerr ansatz guaranties an efficient encoding of the ground state with N L ∼ 0.8N B compared to N L ∼ 1.25N B for the BS-Kerr ansatz, for all values of Λ.
We just saw that the total number of gates of the interferometer-Kerr ansatz can still be higher than of the BS-Kerr ansatz, despite a significant reduction in the number of Kerr gates. As mentioned in Ref. [112], the rotation gates of the interferometer are physically irrelevant for most applications. These gates are removed from the ansatz in the third and fourth panels of Fig. 15, where the phase angles of the beam-splitter are either free or set to 0, respectively. Looking at the fourth panel, it is clear that simultaneously removing the phase angles of the beam-splitter gates and the rotation gates deteriorates the performance of the ansatz, and does not lead to any improvement over the BS-Kerr ansatz. Even for Λ = 0.01, this is the only case where more than N L = 3 layers are required to accurately encode the ground state. However, removing the rotation gates while keeping the phase angle of the beamsplitter gates (third panel) gives similar results than doing the exact opposite, i.e. keeping the rotation gates and removing the phase angle of the beam-splitter gates (second panel). By doing so, the number of parameters remains the same, but the total number of gates can be reduced, for instance from 63 to 42 for N B = 8. However, for larger systems, there are more additional phase angles than rotation gates as the interferometer is composed of N S (N S − 1)/2 beam-splitters but only N S rotation gates, and getting rid of one or the other might lead to different results. As there are more beam-splitters than rotation gates, we expect that getting rid of the rotation gates is safe, especially as they are supposed to be physically irrelevant for most applications [112].
Turning to the four-site model in Fig. 16, we reach the exact same general conclusions as for the three-site model. It seems that the circuit depth required to encode the ground state of this model tends to N L ∼ 1.2N B , which again improves over N L ∼ 2N B for the BS-Kerr ansatz. Without the rotations and the phase angles (fourth panel), the number of layers tends to increase in all regimes of correlation, while keeping the phase angles and removing the rotation gates seems to provide the best trade-off between the total number of gates, the number of Kerr gates (proportional to the number of layers), and the number of parameters.

VQE simulations
In the previous section, we gave numerical evidence on the capacity of our proposed ansatze to encode the ground state of the attractive BH model. In this section, we switch to more practical calculations by simulating VQE protocols where the cost function to minimize is the energy of the system, For the sake of conciseness, we focus on the BS-Kerr ansatz only. First, we briefly demonstrate the accuracy of the results obtained on the three and four-site models. Second, we use the BH dimer as a test-bed to tackle the practical simulation of a VQE experiment on a CV device. This second part will be the occasion to compare the results obtained with an ideal realization of a VQE to a more realistic one, where the estimation of the energy E( θ) is based on a photon-counting protocol.

Ideal VQE simulations for the three-and four-site BH model
In this section, ideal simulations of the VQE algorithm are realized (i.e. infinite sampling, no photon loss, no error in the circuit. . . ). This means that we assume that we can extract the trial state |Ψ( θ =Û ( θ) |Ψ ini at the end of the circuit, which makes it possible to exactly estimate the energy E( θ) during the optimization process. at the end of the VQE algorithm for different Λ values (consistent with the simulations realized in the previous section). Note that N L = 6 layers and initial two-mode states (|Ψ ini = |2, 0, 2 and |Ψ ini = |2, 0, 1, 0 ) are considered. The depth of the circuits imposes to optimize 30 (respectively 42) parameters for the three-site (four-site) network.
As shown in Fig. 17, the probability of occupation follows the same trend as the previous ones in Figs. 12 and 13, where the infidelity I was minimized instead of the ground-state energy. In all cases of Fig. 17, the resulting states have fidelity F ≥ 99% and an energy error ∆E = E( θ * )−E 0 ≤ 10 −5 J (where θ * denotes the parameters that minimize the cost function), thus showing an excellent convergence of the VQE algorithm. Note that similar ideal VQE simulations have been also performed using single-mode initial states (i.e. |Ψ ini = |4, 0, 0 and |Ψ ini = |3, 0, 0, 0 , not shown), and led to similar accuracy. These results give a good demonstration of the ability of the BS-Kerr ansatz to map different initial states to the exact ground state of the attractive BH model (in all correlation regimes considered herein) in a more practical VQE context.

5.2
Ideal VQE v.s. realistic VQE with sampling noise: application to the BH dimer.
Let us now turn to the implementation of a more realistic execution of the VQE algorithm, as it could be done in a practical experiment.

Measure of the expectation value of the Hamiltonian by a photon counting approach
In practice, a realistic simulation of a VQE algorithm requires to measure the energy of a given trial state at the end of the quantum circuit. For this purpose, we propose a measurement protocol based on photon counting method which can be realized with photon number resolving (PNR) detectors [131,132]. Note that the development of PNR (and other similar techniques) is critical for the implementation of scalable quantum information processing (see the review [133] and references therein about photon detection methods for photonic quantum computing). These methods are still actively investigated to push the limit of maximum number of detectable photons, with a current maximum of around several tenths of photons that are resolved simultaneously.
In the following, the notation Ô Ψ ≡ Ψ|Ô |Ψ refers to the quantum average of an operatorÔ realized at the end of the circuit when the photons are in the final state |Ψ . Starting from the BH HamiltonianĤ defined in Eq. (1), measuring the energy of the final photonic state |Ψ of the circuit requires measuring two different contributions, and The one-body (hopping) term in Eq. (17) can be measured by introducing additional 50/50 beam-splitter gates (i.e. with φ = 0 and θ = π/4) between each pair of modes before measuring, thus leading to where |Ψ = B p,q (π/4, 0) |Ψ . Thus, evaluating the hopping term between two modes with respect to the original trial state |Ψ is equivalent to measuring the difference of average photon numbers in the two same modes after applying 50/50 beam-splitter.
Turning to the many-body interaction term in Eq. (18), note that it is directly related to the photon statistics of each individual mode at the end of the circuit: (21) where Var(n p ) = n 2 p Ψ − n p 2 Ψ is the variance of the photon number in mode p at the end of the circuit. In other words, to estimate the contribution of the many-body interaction term to the total energy, we only need to evaluate (over many samples) the first two moments of the number operator for each mode p.
Although not studied in this work, an extended version of the BH model -originally designed to study the supersolid phase of helium [94,95] -is sometimes considered (see for example Refs. [91,[134][135][136]) and readŝ where µ p is usually referred to as the "chemical potential", while the last term encodes the dipoledipole interaction of two packets of bosons localized on two different sites p and q. The expectation value of these two additional terms can also be estimated via photon counting. First, the expectation value of the chemical potential is relatively simple to estimate as it only requires to do local photon-counting on each mode. The energy contribution of the dipole-dipole term can be found by evaluating the covariance matrix of the photon number distribution in all the modes: n p n q = Cov(n p , n q ) + n p n q . (25) Note that setting p = q in Eq. (25) gives back Eq. (21).
Note that the extended Bose-Hubbard model has been shown to capture essential properties of many systems [137], including ultracold atoms and molecules in optical lattices [87,88,90,91,138,139], Josephson junction arrays [140,141], and narrow-band superconductors [142]. Testing our proposed ansatze on this model is a natural follow-up of this study and is left for future work.

VQE with and without sampling noise
Focusing on the BH dimer, we now investigate how the sampling of the energy E( θ) can affect the quality of the VQE results. Compared to the ideal VQE simulations realized above, considering sampling noise leads to fluctuations in the energy E( θ) that makes the optimization harder in practice. Fig. 18 shows the ground-state fidelity F obtained from VQE simulations with (right panel in blue) and without (left panel in green) sampling noise. In the latter case (with sampling noise), the energy is calculated using our photoncounting protocol described in Sec. 5.2.1. To mitigate the impact of noise in the classical optimization, we employed a "noise-proof" optimizer, namely CMA-ES [143]. The number of samples is fixed to 10 8 and photon-counting is only realized on the first mode of the circuit, as the conservation of the total number of bosons imposes N B = n 1 + n 2 (i.e. measuring the number of photons n 1 in the first mode gives directly the number of photons n 2 in the second mode, assuming no photon loss).
Let us first focus on the ideal VQE simulation (left panel of Fig. 18). In analogy with the previous study in Sec. 4, the ground state is encoded with high fidelity whatever the initial state and correlation regime considered. In all of these cases, a sub-linear limit marks the separation between the regions of high and low fidelityencoding (in average N L ≈ 0.6N B ). The slope of this limit actually grows with the strength of Λ, thus indicating that "cat-like" states that arise for strong many-body interactions can be the more complex case to encode with a VQE using the BS-Kerr ansatz.
Turning to the noisy VQE simulation (right panel of Fig. 18) based on our photon-counting protocol, one can see that the resulting groundstate fidelity strongly depends on the nature of the initial state. Indeed, for Λ ≤ 5, the results obtained with sampling noise are globally comparable to the ones from an ideal VQE. Interestingly, when the single-mode initial state is used, an even better fidelity is reached in this regime of weak correlation. However, for Λ = 5, the resulting fidelities deteriorate for both initial states, especially when the number of bosons increases. The deterioration is even more important when considering the strong interaction regime Λ = 10. In this case, a vertical limit separates the high-and low-fidelity regions at N B = 4 when the single-mode initial state is used. Over this limit, the fidelity of the VQE trial state gets stuck at F ∼ 50%. This occurs when the manybody interaction strongly intensifies, which leads to the ground state and first excited state to be quasi degenerate. This phenomenon arises as the many-body attraction essentially favors configurations that gather as many bosons as possible on a same site (as illustrated in Fig. 9, the most favorable configurations are |N B , 0 and |0, N B , followed by |N B − 1, 1 and |1, N B − 1 , and then |N B − 2, 2 and |2, N B − 2 , etc.). If we simplify by considering the most favorable configurations |N B , 0 and |0, N B , the resulting quasidegenerate ground and first excited state can roughly be seen as a "plus version" of cat-state |Ψ 0 ∝ |N B , 0 + |0, N B and a "minus version" |Ψ 1 ∝ |N B , 0 − |0, N B , respectively. In this case, running a noisy VQE simulation with an initial single-mode state |Ψ ini = |N B , 0 produces a final energy that stands in between both quasi degenerate states' energies (not shown here). As the sampling noise typically equates the amplitude of the spectral gap between both states, the minimization gets stuck here. The final fidelity of the trial states is then of 50% with both quasi degenerate states, indicating then that |Ψ(θ) turned into a linear combination of both of them. Note that a solution to converge to the exact groundstate energy is to increase the number of measures to reduce the fluctuations due to the sampling noise and resolve the small spectral gap. Therefore, starting from an initial single-mode state, cat-like states seem difficult to realize by the BS-Kerr ansatz when sampling noise is present. However, much better accuracy is reached when the two-mode initial state is used instead, thus suggesting that starting from a configuration with photons spread in between the modes can improve the results of the VQE in the strongly correlated regime (Λ ≥ 5 here). In summary, it seems from Fig. 18 that -within the BS-Kerr ansatz -a weakly-correlated state is more easily encoded when starting from a singlemode state rather than a two-mode one, and vice versa for a strongly-correlated state. Note that better results could possibly be reached by using other noise-adapted optimizers rather than CMA-ES used in this work, see Refs. [144,145] and references therein. If the optimization becomes a problem when tackling larger systems, one can decompose it into smaller pieces that are iteratively solved by small sized circuits, as proposed by Fujii et al. with the so-called deep VQE based on the divide and conquer approach [13].

Conclusions
In this work, we tackled the problem of encoding strongly correlated many-boson ground-state wavefunctions on a photonic quantum device. A particular attention was paid to the expressibil-ity and the resource efficiency of the circuits for different regimes of many-body interaction. For this, we designed two different quantum photonicbased ansatze and simulated a variational quantum algorithm (VQA) using the ground-state infidelity as a cost function. Both ansatze, the beamsplitter-Kerr (BS-Kerr) and the interferometer-Kerr, are based on layers of beam-splitter and Kerr gates, with additional rotation gates for the latter. We showed their ability to encode the ground state of the attractive BH model in all correlation regimes, for different number of bosons and initial states. While the BS-Kerr ansatz has shown to be the most efficient in terms of total number of gates and parameters, it might not be optimal for conducting a true experiment on a real photonic quantum device, as it contains many Kerr gates that are difficult to implement in practice. In contrast, the interferometer-Kerr ansatz was designed to reduce the number of Kerr gates (or the number of layers required to achieve a given ground-state fidelity) at the expense of more rotation and beam-splitter gates and more parameters per layer, and is therefore more appropriate for realistic applications. Although taking the fidelity as a criteria is theoretically sound to test the encoding ability of our ansatze, in practice the cost function is the ground-state energy measured within the variational quantum eigensolver (VQE) algorithm. As a last investigation, we performed realistic VQE simulations using the BS-Kerr ansatz, with and without sam-pling noise, and proposed a photon-counting protocol to measure the ground-state energy of the BH Hamiltonian as it would be done in a real experiment.
As further suggestions for future work, one could include noise and losses in realistic simulations of actual photonic circuitry, as well as time-frequency effects which affect non-linear optical elements such as squeezers and, presumably, Kerr media. Moreover, although the ansatze proposed in our work were designed to solve the BH model, they could in principle be used for a larger range of applications, such as the absorption of Helium-4 on graphite [146,147].
A Exact encoding of the BH ground state using only beam-splitters in the non-interacting limit We demonstrate here that a single layer of the BS-Kerr ansatz, without the Kerr gates (so only beam-splitters) can exactly encode the ground state of the BH model for the three newtorks considered in our study (i.e. with N S = 2, 3 and 4 sites) in the non-interacting limit Λ → 0. Note that this demonstration works only for initial single-mode states such as |N B , 0 , |N B , 0, 0 and |N B , 0, 0, 0 for the dimer, three-and foursite BH model, respectively.
Let us start with the dimer, for which the exact ground state reads The action of a single layer of beam-splitter gates on an initial single-mode state results in which provides an exact encoding of the ground state [Eq. (26)] if we fix the beam-splitter transmittivity parameter to θ = π/4. Turning to the three-site model, its exact ground state is given by applying the stair structure of the beam-splitters transform the initial single-mode state as follows, thus realizing an exact encoding of the ground state [Eq. (28)] when we fix the beam-splitter's parameters such that θ = arccos(1/ √ 3) and θ = π/4. Finally, the exact ground state of the foursite model reads Similarly as for the BH dimer and three-site models, we apply a single layer of the BS-Kerr ansatz (without the Kerr) on the single-mode initial state, thus leading to Fixing the beam-splitter's parameters to θ = π/3, θ = arccos(1/ √ 3) and θ = π/4 generates an exact representation of the exact ground state in Eq. (30).

B Action of the BS-Kerr ansatz on the BH dimer with 2 bosons
In this section, we illustrate the functioning of the BS-Kerr ansatz to solve the ground state of the Bose-Hubbard model. We focus on the simplest case possible: a dimer containing two bosons (i.e. N S = N B = 2). As a reference, we consider the exact form of the ground state for an intermediate interaction regime Λ = 6, which is given by For the illustration, let us consider an initial homogeneous distribution of the bosons in the two modes, i.e. an initial state |11 . Starting from this, we will demonstrate that a single layer of the BS-Kerr ansatz is sufficient to exactly encode the ground state of the system. The associated state generated by the circuit is here Focusing first on the BS-gate's action, we can show (using Eq. 11) that B(θ) |11 = cos(2θ) |11 + sin(2θ) √ 2 |02 − |20 .
This result clearly illustrates the role of the BS gate: it spreads the bosonic wavefunction over the three different accessible Fock states of the system. Applying Kerr gates on this quantum superposition of Fock state introduces non-linear phases on each independent Fock state such as = e i(θ 1 +θ 2 ) cos(2θ) |11 In this case, encoding the exact system's ground state with the three gates involved means solving the set of coupled non-linear equations to match the best the coefficients in 35 and 32. An exact solution of the problem is given by θ 1 = 3π/8, θ 2 = π/8 and θ = arccos(1/ √ 5)/2. Note that replacing Kerr-gates with simple rotation-gates makes the exact encoding of the ground state impossible. I: Computation of molecular spectra on a quantum processor with an error-resilient algorithm. In: Phys. Rev. X 8 (2018), Nr. 1, 011021. https://doi.org/10.1103/ PhysRevX.8.011021