A hybrid quantum algorithm to detect conical intersections

Conical intersections are topologically protected crossings between the potential energy surfaces of a molecular Hamiltonian, known to play an important role in chemical processes such as photoisomerization and non-radiative relaxation. They are characterized by a non-zero Berry phase, which is a topological invariant defined on a closed path in atomic co-ordinate space, taking the value π when the path encircles the intersection manifold. In this work, we show that for real molecular Hamiltonians, the Berry phase can be obtained by tracing a local optimum of a variational ansatz along the chosen path and estimating the overlap between the initial and final state with a control-free Hadamard test. Moreover, by discretizing the path into N points, we can use N single Newton-Raphson steps to update our state non-variationally. Finally, since the Berry phase can only take two discrete values (0 or π ), our procedure succeeds even for a cumulative error bounded by a constant; this allows us to bound the total sampling cost and to readily verify the success of the procedure. We demonstrate numerically the application of our algorithm on small toy models of the formaldimine molecule (H 2 C––NH).


Introduction
Conical intersections (CI) are degeneracy points in the Born-Oppenheimer molecular structure Hamiltonians, where two potential energy surfaces cross.Similar to Dirac cones in graphene [1], these intersections are protected by symmetries of the Hamiltonian which guarantee that any loop in parameter space around a conical intersection has a quantized Berry phase [2].
CIs play an important role in photochemistry [3,4], as they mediate reactions such as photoisomerization and non-radiative relaxation, which are key steps in processes such as vision [5] and photosynthesis [6].Therefore, detecting the presence and resolving the properties of CIs is important for computing reaction and branching rates in photochemical reactions [7,8].
Nevertheless, the study of such processes requires electronic structure methods capable of accurately modelling both the shape and the relative energies of the two intersecting potential energy surfaces, a requirement that poses challenges for the current available methods [9].Given the need to develop novel methods for identifying and characterizing CIs, quantum computers present themselves as a highly promising option for this task.
Quantum computing has long been driven by the desire to simulate interacting physical systems, such as molecules, as a novel means of investigating their properties [10,11].This is typically achieved by preparing eigenstates of molecular Hamiltonians in quantum devices which can natively store and process quantum states.This task would otherwise require an exponentially-scaling classical memory.Recently, with the first noisy and intermediate-scale quantum devices (NISQ) [12] being built, it became increasingly important to research tailored and robust algorithms that minimize the quantum device requirements [12].Variational quantum algorithms (VQA), such as the variational quantum eigensolver [13,14] (VQE) and its variations, caught the spotlight in this context, as they allow to prepare and measure quantum states with circuits of relatively low depth.The key feature of VQAs is the repeated execution of short parameterized quantum circuits on the quantum device, from which measurement results are sampled.These results are used to estimate a cost function, which is then minimized by varying the parameters defining the gates of the quantum circuit.Due to the noise introduced by sampling, a relatively large number of circuit runs and measurements are typically needed to estimate the cost function accurately.In chemistry, where VQEs are often proposed as a method to resolve ground state energies to high accuracy, the number of required samples to achieve such accuracy can become prohibitively large [15].Furthermore, the convergence of the cost function to an optimum is typically only suggested heuristically, and it is proven to be problematic in some cases that lack such heuristic structure [16].Therefore, it is compelling to suggest VQAs that can access quantities that are less reliant on the precision of both the optimization process and the measurement procedure.
A promising target for VQAs is the computation of the Berry phase Π C , which can be used to resolve the existence of CIs.More specifically, Π C is defined as the geometric phase acquired by an eigenstate of a parameterized Hamiltonian over a closed adiabatic path C in parameter space [2].Most importantly, it is known that in the presence of certain symmetries, the Berry phase will be quantized to values 0 or π.This quantization is exactly what makes the Berry phase an attractive target for a VQAs, as it implies the final result of the computation need only be accurate to error < π 2 .Quantum algorithms to compute Berry phases have been already proposed, both variational [17,18] and Hamiltonian-evolution based [19].Moreover, the long-known effects of Berry phase on nuclear dynamics around a conical intersection [20][21][22] have been explored recently in analog quantum simulation experiments [23][24][25].Nevertheless, previous proposals did not attempt to detect CIs in realistic quantum chemistry problems with an efficient algorithm that can be run on NISQ devices.
In this paper, we propose a hybrid quantum algorithm to compute the quantized Berry phase for ground states of a family of parameterized real Hamiltonians.We focus on the specific application to molecular Hamiltonians, where we can identify a conical intersection by measuring the Berry phase along a loop in atomic coordinates space.We first review the definition of CIs and Berry phases in Sec. 2.Then, we present all the ingredients of the proposed algorithm in Sec. 3, which is similar to a VQA in spirit, but it does not require full optimization; rather, the variational parameters are updated by a single Newton-Raphson step for each molecular geometry along a discretization of the loop.In Sec. 4, we prove a convergence guarantee for the algorithm under certain assumptions on the ansatz, by providing sufficient condition bounds on the total number of steps and the acceptable sampling noise.Finally, we adapt our algorithm to a specific ansatz in Sec. 5 and we benchmark it on a model of the formaldimine molecule H 2 C --NH in Sec. 6. Section 7 presents our conclusions, a discussion of potential application cases for our algorithm and an outlook on possible enhancements.
The core code developed for the numerical benchmarks, which provides a flexible implementation of an orbital-optimized variational quantum ansatz, is made available in a GitHub repository at https: //github.com/emieeel/auto_oo[26].

Conical intersections
Let us consider a molecular electronic structure Hamiltonian H(R) parameterized by the nuclear geometry R in some configuration space R. A conical intersection is a point R × ∈ R where two potential energy surfaces become degenerate, leading to non-perturbatively large non-adiabatic couplings, and thus a breakdown of the Born-Oppenheimer approximation [27,28].Conical intersections extend to a manifold of dimension dim[R]−2, and lead to the two potential energy surfaces taking the form of cones in the remaining two directions x, ẑ.These two potential energy surfaces can be described as eigenstates of the effective Hamiltonian The Pauli terms σ x and σ z form a complete basis for two-dimensional real symmetric matrices.Thus, a single conical intersection cannot be lifted by any real-valued and continuous perturbation of H eff (R); such a perturbation would only shift the value of R × .Moreover, the presence of such an effective Hamiltonian implies that in any direction other than R x or R z the energies must be degenerate.
Simulations involving conical intersections are challenging due to the degeneracy of the two states involved, as the character of both states needs to be considered.Active space methods are often used for organic molecules, which involve selecting the chemical bonds that are formed or broken in the reaction pathway as well as the most significant spectator or correlating orbitals [29].The situation is more complicated when transition metals are involved because the close energetic spacing of d-orbitals typically requires including all five d-orbitals of such a metal into the active space.An additional complication arises if the two crossing states correspond to different atomic configurations: a situation that is not uncommon for the early or late d-(or f-) metals, for which configurations with a different d-(f-) population are energetically close.In such cases, one may need to either work with non-orthogonal orbitals [30] or add an additional d-shell to the active space [31] to qualitatively describe the nature of both states.For cases of practical interest in which one wants to characterize and simulate the internal conversion processes in a complex photo-excited system, the presence of transition metals may easily lead to large active space requirements.These can not be met by classical algorithms and would be highly challenging for quantum algorithms as well.
One way to reduce the complexity of the problem is to first focus on the presence or absence of conical intersections that connect the ground and excited states.The measurement of the Berry phase in chemical systems allows this: without explicitly computing the excited state surface and non-adiabatic couplings, it should be possible to detect whether a loop in the nuclear coordinate space encloses a conical intersection or not.In this manner, one may alleviate the requirements for the active space selection and orbital optimization and quickly establish the region in the potential energy surface that contains an intersection with another surface and needs to be scrutinized further [32].Information about the location of conical intersections is of interest also for ground-state dynamics; the CIs and the Berry phase they induce influence the propagation of nuclear wave packets on the adiabatic ground state surface and thereby affect the branching rates and efficiency of reactions or isomerizations [33].For both types of applications, precise study of dynamics on ground state surfaces as well as characterizing the efficiency of radiationless decay, it is of interest to explore the possibilities offered by quantum algorithms.

Berry phases in real Hamiltonians
The Berry phase Π C is the geometric phase acquired by some eigenstate |Φ(R)⟩ of a system with parameterized Hamiltonian H(R) as it is adiabatically transported around a closed loop in parameter space C ⊂ R. Π C can be defined as the closed line integral of the Berry connection along the loop C The integrand must be imaginary, as In this work, we assume |Φ(R)⟩ is the ground state of H(R).
We need to parameterize the loop C = {R(t), t ∈ [0, 1]} in order to evaluate the integral.Moreover, it is possible to multiply the ground state |Φ(R)⟩ by a t-dependent phase, resulting in a U (1)-gauge a transformation which leaves all physical quantities invariant.We take which allows us to rewrite the Berry phase as If there is a representation for which each Hamiltonian H(R) is real, it is possible to choose eigenstates that have all real components.In this case, we can choose Θ(t) such that |Ψ(t)⟩ has real expansion, which implies the first integrand is real; as ∂ t ⟨Ψ(t)|Ψ(t)⟩ = 0 this must also to be imaginary, therefore ⟨Ψ(t)|∂ t |Ψ(t)⟩ = 0.Under this choice, we can evaluate the Berry phase as a boundary term where the last equality is obtained using the definition in Eq. (3).Furthermore, |Ψ(1)⟩ and |Ψ(0)⟩ are real by construction, which implies Π C can only take two values (modulo 2π): 0 or π.The quantization of Π C implies that it is invariant for topological deformations of C. If C can be contracted to a point, then Π C = 0.A non-trivial Π C = π can only occur when C encircles a degeneracy.One can check with the effective Hamiltonian Eq. (1) that any loop encircling R × has a Berry phase of π.This extends by continuity to any region of R around the CI, as long as C does not enclose a second degeneracy point.Thus, we have a one-to-one correspondence between CIs and the nontrivial Berry phase.

Measuring Berry phase with a variational wavefunction
There have been various proposals in the literature for computing Berry phases using a gate-based quantum device [19,17].In this work, we propose to use a variational algorithm to track |Ψ(t)⟩ when paralleltransported around the loop C, in the spirit of the variational adiabatic method described in Ref. [15,34].We approximate where |ψ(θ)⟩ is a variational ansatz state, and The angle θ * t is well-defined as long as the Hessian ∇ 2 θ E(t, θ) remains positive definite in a neighbourhood of θ * t for all t, ensuring the θ * t is continuous in t and non-degenerate.Although our treatment naturally extends to any variational ansatz that continuously parametrizes normalized states |ψ(θ)⟩ (including classical ansätze like e.g.matrix-product states), we assume the operator U (θ) is implemented by a parameterized quantum circuit (PQC) acting on an initial state |ψ 0 ⟩; this implies that information about the state needs to be extracted from a quantum device through sampling.

Methods
In this section, we detail all the ingredients needed to implement our hybrid algorithm to resolve quantized Berry phases with a variational quantum ansatz.Initially, in Sec.3.1, we discuss how selecting an ansatz that preserves the Hamiltonian's symmetries establishes a natural gauge, leading to the reduction of the Berry phase integral to the boundary term Eq. (5).In Sec.3.2, we introduce our parameter update approach, which employs single Newton-Raphson steps to trace the variational state along a discretization of the loop C. In Sec.3.3 we explain how to employ a basic regularisation technique to handle the potential non-convexity of the cost function and in 3.4 we explain how to measure the final overlap using an ancilla-free Hadamard test.Finally, in Sec.3.5, we provide a full overview of the algorithm.

Fixing the gauge with a real ansatz
As discussed in Sec. 2, the quantization of the Berry phase is granted by the symmetries of the Hamiltonian family H(R), which ensure the existence of a basis for which each H(R) has a real representation.When it comes to electronic structure Hamiltonians, it is always possible to find a real representation for time-reversal symmetric Hamiltonians with integer total spin [35].Moreover, real noninteger-spin Hamiltonians are also found throughout nonrelativistic quantum chemistry.
As our variational ansatz state [in Eq. ( 6)] is defined by a family of unitary operators, it inherits a natural gauge from U (θ).In particular, if U (θ) is written as a product of real rotations in the basis in which H(R) is real, then we force |ψ(θ)⟩ to have real components as well, which fixes a global U (1) phase.This can be obtained by constructing the PQC with a sequence of parameterized unitaries such as generated by antisymmetric operators A j that are real in the chosen representation.(We choose dimensional units such that ∥A j ∥ = 1 without loss of generality, see Appendix A).Examples from electronic structure include real fermionic (de-)excitations, such as unitary singles (A pq = â † p âq − â † q âp ) and doubles (A pqrs = â † p âq â † r âs − â † q âp â † s âr ).Many PQC ansätze commonly proposed for quantum chemistry, such as unitary coupled cluster (UCC) [36,13,37] and quantum-number preserving gate fabrics (NPF) [38], are composed from these elementary rotations.Formally, our ansatz state can then be defined as where U k are the aforementioned parameterized rotations applied in circuit-composition order.In this case, the Berry phase can be estimated by Π C = arg ⟨ψ(θ * t=0 )|ψ(θ * t=1 )⟩ , which corresponds to the boundary term in Eq. (5).This implies that, in the case of a non-trivial Berry phase, the path traced by θ * t will not close up on itself (i.e.θ * 1 ̸ = θ * 0 ), highlighting an important difference between the ansatz parameters θ, which fix the gauge of |ψ(θ)⟩, and the Hamiltonian parameters R, which define a ground state |Φ(R)⟩ up to a U (1) gauge freedom.However, the change in optimal parameters is insufficient to prove the existence of a nontrivial Berry phase; we must both successfully track the minimum θ * t as t traces from 0 to 1, and estimate the final overlap ⟨ψ(θ * 0 )|ψ(θ * 1 )⟩.While the argument (sign) of the overlap will yield the Berry phase, its absolute value is a proxy of success as it certifies the initial and final states are physically equivalent.

Avoiding full optimization via Newton-Raphson steps
As mentioned above, the Berry phase Π C is a discrete quantity, and therefore we only need to estimate it to accuracy < π 2 .In Appendix A, we show that this implies that we can accept an error on the estimate θ1 of the final optimum θ * 1 bounded in one-norm by ∥ θ1 − θ * 1 ∥ 1 < 1.Thus, we are not required to exactly track |ψ(θ * t )⟩ and as a result, the variational energy Eq.(7) does not need to be fully re-optimized at every time-step t.Instead, it suffices to keep the estimate θt of the optimal parameters within the basis of convergence of the true minimum θ * t .To achieve this, we still need an initial optimum as an input, which is obtained by running one full optimization.If possible, the initial point is selected such that optimization is simplest.Then, we propose to use a single step of the Newton-Raphson algorithm at points t ∈ {∆t, 2 ∆t, . . ., 1 − ∆t, 1}.
The Newton-Raphson (NR) algorithm determines the update of the estimate of the minimum θ through the gradient G(t, θ) := ∇ θ E(t, θ) and Hessian H(t, θ) := ∇ 2 θ E(t, θ) of the variational energy Eq.(7).The derivatives can be computed using either finite-difference methods or parameter-shift rules [39]; either method requires sampling the variational energy E(t, θ) at a number of different parameter points θ.Given the estimate θt of the optimum at point t as an initial guess, the NR step with cost E(t + ∆t, θ) prescribes the update θt+∆t = θt + dθ NR t,∆t , with The Newton-Raphson method is well-known to have a finite-sized basin of quadratic convergence, as long as the cost function is strongly convex at the optimum, In other words, the lowest eigenvalue m of the Hessian of the cost function (which we call convexity) at the optimum θ * needs to be positive.Details on the convergence properties of NR are given in App.D. In section 4, we show that this quadratic convergence is fast enough to keep track of the minimum with only a single step for each t-point.

Regularization and backtracking
To ensure that we successfully track the minimum of the cost function θ * t , the estimates θt need to remain in the strongly convex region of optimization space.The existence of such strongly convex region is not always guaranteed since it depends on the ansatz.A common cause of failure of this requirement is exemplified for ansätze with (local) over-parametrization of the state manifold.In fact, if the ansatz has redundant parameters the Hessian of the cost function Eq. (7) will always be singular (m = 0).Then, arbitrarily small perturbations of the cost function can then cause m < 0. When this occurs, the inversion of the Hessian needed for the Newton-Raphson step is ill-defined.
The most direct approach to solve this issue is to select an ansatz with no degeneracies, facilitating a strongly convex cost function at its minima.Nevertheless, this is only possible for very simple problems, and even in this case, quasi-degeneracies can make the convergence region extremely small.Since this is a well-known problem, many alternative solutions have been proposed in the literature.In this subsection, we will explore and implement two of themback-tracking and regularization.
Back-tracking -Small positive eigenvalues of the Hessian can cause the standard Newton-Raphson step to overshoot along the relative parameter eigenmodes.This effectively reduces the size of the neighborhood of the minimum θ * in which quadratic convergence is granted (see Appendix D).Since for positive convexity the direction provided by the Newton-Raphson step is guaranteed to be a descent direction, we can mitigate this overshoot by implementing line-search of the minimum on the segment defined by the NR step.In the common variant of back-tracking, the Newton step is iteratively damped by a constant β ∈ (0, 1).At each iteration, the cost function in the new point is measured, until the cost function is reduced enough (the detailed condition is given in Algorithm 1).While some additional evaluations of the cost function are needed, the (more expensive) gradient and Hessian are only calculated once.Due to the repeated evaluation, one needs to consider extending line-search methods to cost functions evaluated with sampling noise.
Regularization -For realistic ansätze and Hamiltonians, it is difficult to avoid (quasi-) redundancies in some regions of parameter space.In this case, the cost function might not be strongly convex around the minimum, or the convexity might be too small to ensure a sufficiently-large convergence region.Furthermore, even for an ideally-convex cost function, noisy evaluation on a quantum device might result in a distorted Hessian with non-positive eigenvalues.To mitigate this issue, we can use a regularization technique that penalizes the change in parameters along the quasi-redundant directions.We propose to use augmentation of the Hessian to regularize the NR step, obtaining a so-called quasi-Newton optimizer.Hessian augmentation is a common practice in quantum chemistry methods that feature orbital optimization, such as self-consistent field methods [29].If the smallest eigenvalue of the Hessian λ 0 is smaller than a positive threshold convexity m thr , we construct the augmented Hessian as follows where we add a constant ν > |λ 0 |.The augmented Hessian is then positive, and we can realize the NR update as Here, β is the damping constant from back-tracking line search and G is the gradient as in eq.(10).Regularization and back-tracking are typically used in tandem, as quadratic convergence is harder to guarantee when using regularization.The choice of ν is nontrivial: we want it to be large enough to suppress parameter changes along quasi-redundant directions, but we need to avoid exaggerating the damping along relevant directions.Common solutions include choosing ν = ρ|λ 0 | + µ with fixed positive constants ρ and µ, or using a trust-region method [40,41] where the Newton step is constrained to lie within a ball of some radius h (such that ||dθ NR || 2 ≤ h).
The augmented Hessian method does not require further evaluations of the cost function, although the damping of the parameter updates might imply that more t-steps are needed to successfully resolve the Berry phase.

Measuring the final overlap
For a real ansatz, the overlap of the tracked states at t = 0 and t = 1 must be real and it can be rewritten as This quantity can readily be measured by a Hadamard test, implemented as The required number of samples is small, as we only need to resolve whether the sign of the overlap is +1 Implementing the circuit above requires to realize controlled-U (θ), which might increase significantly the depth of the compiled quantum circuit and make the implementation unfeasible on near-term hardware.However, this requirement can be bypassed by using the control-free echo verification technique [42,43], in place of the standard Hadamard test, to sample Eq. ( 14).This technique requires access to a reference state |ψ ref ⟩ orthogonal to |ψ 0 ⟩, which should acquire a known eigenphase φ under the action of the PQC, U (θ)|ψ ref ⟩ = e iφ |ψ ref ⟩.As most of the PQC ansätze used for electronic structure states preserve the total number of electrons (including UCC and NPF), the fully unoccupied state |0...0⟩ can be used as reference.Control-free echo verification circuits only require implementing the non-controlled U (θ), and furthermore provide built-in error mitigation power.

Overview of the algorithm
We are now ready to formalize the proposed algorithm for resolving Berry phases, Algorithm 2. The formalization we present here will allow us to bound the number of steps and the sampling cost in the following Section 4. Given a path C and a number of steps 1/∆t, the algorithm attempts to calculate Π C yielding either Π C = 0, Π C = π, or a FAIL state.Again, in Section 4 we will bound the probability of the FAIL state occurring.Additional features that extend the practicality of the algorithm and mitigate the failure cases are presented in Sec. 5 and later implemented in Sec. 6.
If the algorithm fails, it can be re-run with a larger number of steps N and thus a smaller step size ∆t.A smaller step size decreases additive NR error bound (the error per step scales as ∆t 2 , the total bound thus scales as ∆t).

Error analysis and bounding
In this section, we find analytic upper bounds on the cost of estimating a quantized Berry phase Π C on Algorithm 2: Resolve quantized Berry phase Input: ← define cost function as in Eq. ( 7); Gj ← sample the gradient to precision σ G Gj = ∂E ∂θj (τ + ∆t, θτ ) ; Hjk ← sample the Hessian to precision σ H f ← final overlap as in Eq. ( 14) to precision F ; if f 2 < F then return FAIL and exit; return Π C = arg{f } a fixed curve C using Algorithm 2. To simplify the treatment regularization and back-tracking are not considered: instead, we require each estimate θj ∆t at the j-th step to be within the region of quadratic convergence of the cost function at the next t-step E((j + 1)∆t, θ).We prove that this translates to a guarantee of convergence of the algorithm, under three conditions: 1.At the local minimum θ * t , where ψ(θ * t ) approximates the ground state, the cost function E(t, θ) is strongly convex [as described in Eq. ( 11)]; 2. The number of discretization steps N is sufficiently large; 3. The sampling noise on each of the Hessian and gradient elements (σ H and σ G respectively) is sufficiently small.
The first point entails a requirement on the cost function, defined by the family of Hamiltonians H(R) and the choice of ansatz |ψ(θ)⟩.This requirement is not satisfied if the ansatz state is defined with redundant parameters.We contend that, while strong convexity is a significant assumption, incorporating regularization (or one of the other techniques suggested in the outlook) can alleviate the necessity for such an assumption in practical applications.Our proof provides upper bounds on N and lower bounds on σ H and σ G , which suffice to grant convergence.However, these are not to be considered practical prescriptions, as we do not believe them to be optimal; rather they show which are the relevant factors playing a role in the convergence of the algorithm.As the sampled gradient and Hessian are random variables, the guarantee of convergence for bounded error is to be understood in a probabilistic sense.
We first clarify natural assumptions and notation used in the calculation of the basin of convergence of Newton's method.We require the cost function E(t, θ) to be twice-differentiable by θ, for all t, in a region around the true minima θ * t .We require the Hessian to be Lipschitz continuous across this region, (Here, the Lipschitz constant L can be considered a bound ∥T ∥ ≤ L on the norm of the tensor of third derivatives

Bounding the NR error
We will first calculate a lower bound on ∆t that ensures the error δ θt = θt − θ * t is bounded by a constant for all values of t. (as shown in Appendix A, a bound on ∥δ θ1 ∥ 1 is sufficient to ensure Π C can be accurately resolved.)In this calculation, we will allow for an additive error σ θ on θt due to sampling noise; we will simultaneously calculate an upper bound on σ θ .We sketch the calculation here and defer details to App.D.
Firstly, it can be shown (Theorem 1 in Appendix D) that the Newton-Raphson step Eq.(10) with cost function E(t + ∆t, θ) is guaranteed to converge quadratically [44] to the minimizer θ * t+∆t as long as the initial guess θt is within a ball centred in the minimizer of radius m thr 4L .Quadratic convergence means that the distance of the updated guess from the minimizer will scale as the square of the distance of the initial guess from the minimizer, The right-hand side of this equation can be bounded through the triangle inequality as We can bound the second term in this equation by taking the total t-derivative of the optimality condition G(t, θ * t ) = 0, yielding If at step t we are within the radius given by Theorem 1, the NR step will quadratically converge, suppressing also the error from the previous step, and yielding ∥δ θt ∥ ≤ To account for sampling noise effect, we then consider a small additive error σ θ to θt .Maximising this allowed sampling noise at each step (as we will see, this becomes the bottleneck in our method) then yields the two bounds When both bounds are satisfied, we are guaranteed single steps of Newton's method will maintain convergence around the path C.

Bounding the sampling noise
We now translate the bound on σ θ to bounds on the variance of estimates of each element of the gradient and Hessian (σ 2 G and σ 2 H respectively).This proceeds by simple propagation of variance through Eq. (10).We find where δG and δH are the random variables representing the errors on gradient and Hessian.(a more detailed calculation is given in App.E.) Assuming θ has n p elements, each element of the gradient is i.i.d. with variance σ 2 G , we get As δH is a n p × n p real symmetric matrix, assuming its elements are i.i.d. with variance σ 2 H , we can invoke Wigner's semicircle law [45] to approximate its norm by √ n p σ H , thus Combining these with Eq. (20), and requiring the resulting variance to be small compared to the square allowed additive error σ 2 θ we obtain the bound We can then bound the norm of the NR update as ∥dθ NR t,∆t ∥ ≤ m −1 thr ∥G∥.Splitting the error budget in half we obtain.
Note that these bounds are not tight.For instance, by applying Cauchy-Schwartz inequality to bound ∥H −1 • G∥ ≤ ∥H −1 ∥∥G∥, we overlook the fact that the gradient will change more slowly along a lowereigenvalue eigenmode of the Hessian.We believe further work might allow to define tighter bounds.

Scaling of the total cost
To give an estimate on how many measurements we need to sample gradient and Hessian to sufficient precision, we need to recast the quantities in Eq. ( 25 p ∥H∥ (where ∥H∥ is the spectral norm of the Hamiltonian).The number of measurements to sample the Hessian to precision Eq. (25) are proportional to the inverse of the bound, with proportionality constant M H indicating the number of shots required to sample a single element of the Hessian to unit variance (this depends on details such as the decomposition taken to measure the Hamiltonian, and the specifics of the derivative estimation method).Multiplying this by the number of steps 1 dt [Eq.(19)] gives us the total number of shots required for convergence

Adapting to an orbital-optimized PQC ansatz
To achieve a good representation of the ground state character while minimizing depth and number of evaluations of quantum circuits, we employ a hybrid ansatz composed of classical orbital rotations and a parameterized quantum circuit (PQC) to represent correlations within an active space.The concept of an orbital-optimized variational quantum eigensolver (OO-VQE) is explored in [46,47], as an extension of VQE conceptually similar to a complete active space self-consistent field (CASSCF) calculation.In this section, we introduce the construction of the OO-PQC ansatz and discuss its specific use in our algorithm, where the orbitals need to continuously track a changing active space depending on the nuclear geometry.

An OO-PQC ansatz with geometric continuity
To represent the electronic structure state, we start by choosing an atomic orbital basis, i.e. a discretization of space defined by a set of N non-orthogonal atomic orbitals χ µ (R, x) (functions of the electronic coordinate x ∈ R 3 , where we make explicit the parametric dependence on the nuclear coordinates R); these orbitals define the overlap matrix The atomic orbitals (AOs), along with the overlap matrix, depend on the geometry of the molecule specified by the nuclear coordinates R. (For the sake of simplicity, we limit to considering real AOs.)From these, we could define a set of parameterized orthonormal molecular orbitals (MO) which would allow for the definition of a parameterized active space.The downside of this parametrization is that, to ensure MO orthonormality, we need C AO to satisfy the constraint which depends nontrivially from R. This implies that we cannot trivially use the same C AO for different geometries R.
In order to address this problem, we have opted to use orthonormalized atomic orbitals (OAO) that are derived from the AOs through symmetric Löwdin orthogonalization [48] as reference in the definition of parameterized MOs.The OAOs are defined as Building on these, we can define the MOs as where C = S 1/2 (R) • C AO .The orthonormality constraint Eq. (31) then reduces to requiring C to be orthogonal, and it is independent on R. In summary, Eq. (32) defines a set of orthonormal molecular orbitals parameterized by C, well-defined and continuous for changing R.
To start up our algorithm, the matrix C AO can be initialized by a Hartree-Fock (or any other molecular coefficient matrix, e.g.coming from a small CASSCF calculation) at some initial geometry R(0).From this we recover C = S 1/2 (R(0)) • C AO , which is then treated as a variational parameter of the ansatz.Using the parameterized MO Eq. (32) we construct the electronic structure Hamiltonian H(R, C) in the (parameterized) molecular basis.
Based on the initial Hartree-Fock orbital energies, we split the N orbitals into a core set with N O doublyoccupied orbitals, an active set with N A orbitals, and a virtual set with N V empty orbitals.Although the split of the orbital indices remains constant throughout the algorithm, the orbitals themselves continuously change through their dependence on R and C. The correlations are treated only within an active space of η A electrons in N A orbitals.Tracing out the core and virtual orbitals yields the active-space Hamiltonian H A (R, C).
The correlated active-space state |ψ(θ)⟩ is represented on a quantum device, using a PQC ansatz of the form Eq. ( 9).The cost function then becomes and it can be evaluated by sampling the 1-and 2electron reduced density matrix (RDM) of the state [49].(Other efficient sampling schemes, e.g. based on double factorization [50,51], can be used.)

Measuring boundary terms with the OO-PQC ansatz
When evaluating the final overlap [Eq.(14)] with an orbital-optimized ansatz, we have to consider that the states |ψ(θ 0 )⟩ and |ψ(θ 1 )⟩ are defined on different active space orbitals, determined by the MO matrices C 0 and C 1 respectively.The transformation between the two sets of orbitals, ϕ(R, is represented by the orthogonal matrix If the algorithms successfully tracked the lowestenergy active space state of the system, the Hilbert spaces spanned by the active orbitals defined by C 0 and C 1 should match.(The same is true for the core space and the virtual space.)This implies the matrix C 0→1 will have block structure, with [C 0→1 ] pq ̸ = 0 only if p, q are both in the same set of orbitals (core, active or virtual).The orbital rotations within the core (virtual) subspace will not generate any phase on the state of the system, as all orbitals are doublyoccupied (doubly-unoccupied).The orbital rotation within the active space can then be translated to a unitary transformation on the state by a Bogoliubov transformation with c † p , c p fermionic creation and annihilation operators on the p orbital.The final overlap can then be sampled with a Hadamard test, given an quantum circuit implementing the (eventually controlled) operation G 0→1 .Under Jordan-Wigner encoding, a quantum circuit for G 0→1 can be implemented as a fabric of parameterized fermionic swap gates of depth N A following a QR decomposition of the orbital rotation generator [log(C 0→1 )] pq , also known as a givens rotation fabric [52].These gates preserve the zero-electrons reference state, allowing to employ the ancilla-free echo verification technique mentioned in section 3.4 to measure the final overlap.

Newton-Raphson updates of the OO-PQC ansatz
The proposed OO ansatz has two sets of parameters, C and θ.As the MO matrix C is subject to the constraint Eq. (31), its elements cannot be freely updated with NR.Instead, for each NR update with initial MO matrix C, we reparametrize the MOs with a unitary transformation: C ← Ce −κ , where κ is any antisymmetric matrix.The derivatives of the energy with respect to any element κ pq can be evaluated analytically (see Appendix C).Furthermore, under this parametrization it can be shown that κ pq where p, q are both core indices or both virtual indices are redundant [29] in the definition of the active space orbitals; these N 2 O + N 2 V parameters are set to zero without reducing the expressivity of the ansatz.We call the unraveled set of remaining parameters κ.To implement the Newton-Raphson step, the gradient and Hessian with respect to the combined set of parameters (θ, κ) is computed.In this manner the gradient splits into two components, and the Hessian into three components The PQC parameter derivatives ∇ θ E and ∇ 2 θ E can be evaluated through parameter-shift rule [39,53] or finite difference, by sampling on the quantum device.The derivatives with respect to the OO parameters ∇ κ E and ∇ 2 κ E are linear functions of the 2-electron RDM, whose coefficients can be computed analytically [29].The remaining component, the mixed Hessian ∇ κ ∇ θ E can be similarly evaluated as a linear function of θ-derivatives of the RDM; for this, we can use the same data sampled from the quantum device to evaluate ∇ θ E. We detail the procedure of estimating these Hessian components in App. C. Thus, evaluating the derivatives with respect to the OO parameters does not require extra sampling on the quantum device.

Numerical results
In this section, we demonstrate the application of our method to a small model system: the formaldimine molecule H 2 C --NH, an established model in the context of quantum algorithms for excited states in [46,47,54].This molecule is known to have a conical intersection between the singlet ground state and first excited state potential energy surfaces [55].This CI plays an important role in the photoisomerization process of formaldimine, which in turn can be considered a minimal model for the photoisomerization of the rhodopsin protonated Schiff-base (a key step in the visual cycle process [56,57]).We consider geometries obtained from the equilibrium configuration by varying the direction of the N -H bond, defined by the bending angle α and the dihedral angle ϕ (see Fig. 1d).Varying these angles defines the considered plane in nuclear configuration space R. First, we consider a minimal model of formaldimine (within the minimal basis and a small active space), on which we can test the properties of the algorithm 2.Then, we investigate the effects of sampling noise on these results.Finally, we study a more complex model of the same molecule (with a larger basis set and active space), and show that we can achieve similar results by employing regularization and backtracking to deal with the degeneracies of the ansatz manifold.

Numerical simulation details
For our simulations, we use the PennyLane [58] package with the PyTorch backend to construct the hybrid quantum-classical cost function of Eq. (33) supporting automatic differentiation (AD) with respect to the PQC parameters.To achieve this, we implement the transformation of the one-and two-body AO basis integrals to the parameterized MO basis [Eq.(32)] with AD support.The transformed integrals are projected onto the active space and contracted with the active space one-and two-electron RDM.The RDM elements and their derivatives with respect to the PQC parameters are obtained using PennyLane and its built-in AD scheme based on the parameter shift rule.In summary, gradients and Hessians of the cost function with respect to the PQC parameters are obtained using AD, the orbital gradient and Hessian components are estimated analytically (see Appendix C), and the off-diagonal block of the composite Hessian [Eq.(38)] is retrieved by automatic differentiation of the analytical orbital gradient.We use PySCF [59] to generate the molecular integrals (i.e. the full space one-and two-electron integrals and overlap matrices) in the atomic orbital basis.
The core code developed for this project is made available as a python package in the GitHub repository [26].This code provides a flexible implementation of the orbital-optimized PQC ansatz, which can find many applications in VQAs for chemistry.A tutorial Jupyter notebook showcasing a calculation of Berry phase in the minimal model of Formaldimine is provided in the examples folder in the repository.

Minimal model with an degeneracy-free ansatz
We first demonstrate the application of our algorithm to a minimal model of formaldimine, for which we can approximate the ground state with a simple ansatz with no degeneracies.The molecule is described in a minimal STO-3G basis-set, and we select an active space of η A = 2 electrons in N A = 2 spatial orbitals [i.e.CAS(2,2)].As the orbital optimization already allows (spin-adapted) single excitations within the active space, the only parameterized gate we can include in our PQC ansatz is the double-excitation c0c1) ; this corresponds to the unitary coupled-cluster doubles [UCC(S)D] ansatz, where the singles (S) are not explicitly included because they would be redundant with the orbital optimization.This is enough to describe exactly any active space state compatible with the symmetries of the model, without over-parametrizing the ansatz state.
In Fig. 1 we demonstrate the application of our algorithm to this model.The minimal basis set is small enough that we can run a full configuration interaction (FCI) calculation to exactly resolve the ground and first excited state energies E 0 (R) and E 1 (R).Observing the gap E 1 (R)−E 0 (R) (portrayed in Fig. 1a), we can determine the location of the conical intersection ϕ × = 90 • , α × ≈ 132 • .We then define three loops in the configuration space R, one loop C × containing the CI and two "trivial" loops C 1 , C 2 .These loops are centered around ϕ = 90 • and α = 110 , and all have a radius of 10 • .Fig. 1c shows the progress of the estimate θt (shifted by θ0 ) of the optimal PQC parameter θ * t throughout the N = 25 single-NR-update t-steps.Note that, while we only plot the single PQC parameter relative to the parameterized double excitation, the algorithm updates the MO coefficients Ct as well.We observe that θ1 = θ0 for the trivial loops C 1 , C 2 , while θ1 ̸ = θ0 for the loop C × containing the CI.This is an indication of the effect of the Berry phase, but not the result of the algorithm yet; measuring the overlap Eq. (36) yields the correct Berry   1b.We can observe a small deviation from the optimal energy in the region where the character of the state changes faster (along the line ϕ = ϕ × , α < α × ), but this does not disrupt the tracking of the minimum.Finally, Fig. 1e shows that a number of discretization points N ≥ 9 is needed to correctly resolve Π C× = π, through the evaluation of the overlap [Eq.(36)] ω C× = −1.

Sampling noise
In this section, we explore the robustness of our algorithm with respect to the sampling noise characteristic of VQAs.We directly add a proxy of sampling noise η to each element of the gradient and Hessian.Each η is an independent Gaussian random variable with variance σ 2 ; a different η is added to each element of the gradient of (37) and Hessian of Eq. (38) to get the noisy estimates G and H.In this way, one avoids defining a specific sampling strategy, keeping our results general.
In Fig. 2 we show the energy profile of the three loops whose geometry is represented in Fig. 1a, for one such random realisation of the sampling noise.(The plotted energy expectation is evaluated exactly, noise is only added to the gradient and Hessian used in the NR updates.)These three loops yield the same Berry phase results as the noiseless case.
The probability P success of Algorithm 2 correctly resolving the Berry phase Π C× on the nontrivial loop C × is reported in Fig. 3, as a function of the number of discretization steps N and of the variance of the added noise on each sampled quantity σ 2 .The expected final overlap Eq. ( 36) is −1 for this case, as the loop contains a CI.For each value of N and σ 2 , we simulate 100 noisy runs of the algorithm and we declare as successful the ones that yield a negative final overlap (implying Π C× = π).Finally we average these outcomes to retrieve the succes probabilities P succes .
From these simulations we conclude that sampling noise reduces the accuracy of tracking the ground state and thus increases the probability of obtaining inaccurate energies.Nevertheless, for a moderate amount of sampling noise our algorithm still resolves the Berry phase correctly.We observe that an error on each gradient and Hessian element with variance of σ 2 = 10 −5 (or smaller) does not compromise the resolution of the Berry phase, as long as the number of discretization points is sufficiently large (N > 10, very close to the noiseless case portrayed in Fig. 1e).On the other hand, a large enough sampling error (σ 2 ≥ 5 × 10 −5 ) produces essentially random results (P success ≈ 50%).
The number of shots required to achieve a given accuracy on gradient and Hessian depends on the chosen measurement optimization scheme and the technique used to estimate derivatives.We note that the energy can be expressed as a linear function of one-and two-electron RDMs, where ξ = pq, pqrs ∈ [N A ] are the one-and two-body active-space orbital indices, D(θ) ξ are the RDM elements and h(t) ξ are the one-and two-electron integrals.The RDM elements can be grouped and measured together with optimized shot allocation [49,[60][61][62][63], with the potential for further optimization based on basis selection [64] or on a more advanced factorization of the cost function [51,65].Derivatives are also linear in the RDMs evaluated at different parameter values (see App. C), and they can be reconstructed either through finite-difference methods or using pa-rameter shift rules [39,66].
To provide an upper bound on the number of shots necessary to achieve the required accuracy in our numerical benchmark, we consider an un-optimized measurement scheme which samples each element of the RDMs separately.In this scheme, optimal resource allocation requires each D(θ) ξ to be sampled with a number of shots proportional to |h(t) ξ | [67].This allows to estimate the cost function to accuracy σ with a number of shots upper bounded by 2  1 , with || ⃗ h(t)|| 1 = ξ |h(t) ξ | the total one-norm of the electronic integrals.The Gradient and Hessian with respect to a single parameter are estimated through the parameter shift rule only requiring 3 parameter points.The same RDM samples can be used to estimate all elements of the mixed gradient and Hessian, and we assume that the one-norm of the derivatives of h with respect to the orbital rotations are of the same magnitude as || ⃗ h(t)|| 1 .In our simulation benchmark (on the path C × ), the one-norm of the integral takes values up to max t || ⃗ h(t)|| 1 = 1.5376.Thus, we estimate an upper bound on the total number of shots of the order of ≈ 3 × 2.4σ −2 .

Larger basis and active space
To test convergence for a more realistic case where the cost function is not always strongly convex at its minima, we simulate the algorithm on a more challenging model of formaldimine.The model is constructed employing the cc-pVDZ basis set (43 atomic orbitals), and an active space of four electrons in four spatial orbitals [CAS (4,4)].As a PQC ansatz for the active space state, we use the number-preserving fabric (NPF) ansatz introduced in [38], consisting of a fabric of spin-adapted orbital rotations and double excitations on sets of two spatial orbitals (four spinorbitals).Four layers of this ansatz are enough to recover the exact CASCI ground state energy inside the active space of 4 orbitals, resulting in 20 PQC parameters.This is an overparameterization of the ground state, implying a global redundancy in the ansatz and resulting in a singular hessian at every point.The goal of this numerical demonstration is to show that Algorithm 2 can still recover the Berry phase, in this case using regularization and backtracking.
In Fig. 4 the energies throughout two loops are shown.The location of the conical intersection (α × , ϕ × ) in the larger basis set moves compared the case shown in Fig. 1 (this is to be expected, as the cc-pVDZ and STO-3G models are effectively different); the basis is now too large to attempt an FCI calculation that would resolve the gap exactly.One could instead resort to a State-Averaged CASSCF calculation to resolve the location of the CI, however, the state-average approach might bias the location of the CI.For this demonstration, we manually select two 0.0 0.5 1.0 t -94.00 -93.99 -93.98 -93.97  1).The basis set is too large to run a FCI calculation to locate the CI, and using another approximate method (e.g.state-average CASSCF) might bias the CI location.Instead, we manually choose larger loop geometries and use Algorithm 2 to find the value of ΠC.
loops with a slightly larger radius of 15 • , centered around α = 113 • (C × , red line) and around α = 145 • (C 1 , blue line).These values are chosen based on the location of the CI at the level of theory of large state-average CAS(14,14)SCF calculation, which returns α × ≈ 113 • .We choose ϕ = 90 • , as the CI is forced to lie on the ϕ × = 90 hyperplane due to the Cs reflection point-group symmetry.Due to the single-step Newton-Raphson optimization, we can observe a difference between the energies evaluated at each iteration point of our algorithm (circle markers) and the variational optima (solid lines).The discrepancy is more pronounced after a large change in the state character, such as happens just after t = 0.5 in the topologically-nontrivial loop C × .There, the variational parameters need to change more rapidly to track the state.As long as the discretization step is small enough, the single Newton-Raphson steps ensure the variational state keeps tracking the local minimum.Indeed, we can resolve the correct Berry phase with only N = 50 discretiation points, recovering a final overlap [Eq.(36)] of ω C× = −0.9994(loop containing the CI), and ω C1 = 0.99998 (trivial loop).

Conclusion and outlook
In this work, we introduced a hybrid algorithm to resolve conical intersections through the Berry phase they induce.This is achieved by tracking the ground state with a variational quantum ansatz, along a closed path C in nuclear configuration space.This algorithm only requires approximating the ground state (in contrast to e.g. the state-average VQE [46]) for one nuclear geometry R at a time, reducing the expressivity requirements of the ansatz.The key re-quirement of the algorithm is that the ansatz parameters are changed smoothly, tracking a local minimum of the cost function (7) and ensuring the U (1)gauge (global phase) of the ansatz state remains welldefined.
As the output is quantized (Π C ∈ {0, π}), the result only needs to be estimated to a constant precision, and the algorithm is robust to some amount of error; we considered optimization error (the error on estimates of a variational parameter, due to the approximate minimizer) and sampling noise on the quantities measured on the quantum device.We showed that one can update the variational parameters with a single newton step for each geometry in a discretization of the loop C. We proved analytically that the algorithm is granted to converge for a large enough number of discretization steps N , and small enough additive error, under the assumption that the cost function is strongly convex at its minimum.(We consider sampling noise explicitly, but the robustness results extends to any additive noise, including hardware noise.)We argue this result practically extends to cases where the strong convexity assumption is not satisfied, as long as some regularization technique is employed.
This reasoning is corroborated by numerical demonstrations of CI resolution on a small example system -the formaldimine molecule.Using a minimal description of formaldimine [STO-3G basis, a CAS(2,2), UCCD ansatz], for which we have a strongly convex cost function, we show convergence of Algorithm 2 without using regularization for a sufficiently large N > 11, as we expect from our analytical results.We also demonstrate the effect of sampling noise in this setting, showing that our algorithm is robust to a sizable amount of noise, achieving convergence for N comparable to the noiseless case.Finally, we demonstrate the application of Algorithm 2 with regularization on a more complicated and realistic model of formaldimine [cc-pVDZ basis, CAS(4,4), NPF ansatz].This case shows that, even with a cost function that is never convex, we can employ regularization to resolve the Berry phase correctly.

Paths towards improving convergence
The key step in our algorithm, where most of the cost in terms of quantum resources is concentrated, is the evaluation of the (NR) parameter update.Ensuring the parameter estimates remain within the basin of convergence of the cost function is crucial, and it is the bottleneck in terms of the cost of our algorithm.As shown in Section 4, the size of the basin of convergence depends on the convexity of the cost function at the minimum (11).Overparametrizations (local or global) of the cost function are especially disrupting, as they produce singular Hessians (m = 0) even at optimal points.In this work, we proposed to use regu-larization by Hessian augmentation and back-tracking to solve this problem; this technique is practical and, as shown numerically in Section 6.4, it can produce convergent results for systems with overparametrized cost functions.However, the success of these techniques may depend on the choice of their hyperparameters (α, β, µ, ρ in Algorithm 1), and their application makes the analytical study of the algorithm convergence harder.In this section, we suggest alternative approaches to regularization and resource allocation for the algorithm, which would require further study.
Quantum natural gradients -Quantum natural gradient (QNG) descent is a recently-proposed parameter update technique [68,69], which takes into account the geometry of the ground state manifold.Natural gradient techniques, already long in use in classical machine learning [70], are invariant with respect to reparametrizations of the cost function; more importantly for our case, they nullify the effect of overparametrizations [71].The idea of QNG is to transform gradients with respect to the ansatz parameters into gradients with respect to Quantum Information Geometry.This reparametrization is achieved through the Fubini-Study metric tensor g( θt ) (to be evaluated at each update).A gradient descent step would then become: where η is the learning rate and g + ( θt ) is the pseudoinverse of the metric tensor.Here G(t + ∆t, θt ) is just the usual gradient of the energy as in Algorithm 2.
The resulting QNG step updates the ansatz by a fixed amount in the norm induced by the distance between quantum states, instead of a parameter-space norm, solving the issues connected to overparametrization.Another option would be to use classical natural gradients (NG) [70] defined on the 1-and 2-RDM manifold, which rely on the classical Fisher information matrix.
Adaptive step selection -Choosing a sufficient number of steps N to discretize C is key to the success of our algorithm, as proven in Section 4 and shown in Fig. 1 (bottom right).This is because the minimum θ * t , along with its basin of convergence, changes between subsequent steps by an amount proportional to the step size ∆t = 1/N .In Algorithm 2, we propose to linearly discretize a given parametrization of C for the sake of simplicity.An adaptive choice of ∆t could greatly reduce the cost of the algorithm, letting the steps be larger in the regions where the ground state changes the least, while concentrating more points in the regions where the ground state character sharply shifts.An adaptive step selection technique that preserves the provable convergence could be easily implemented if the gradient and Hessian of the cost function are measured through the 1-and 2-electron RDM, as per Eq.(39).The derivatives with respect to θ are then calculated by chain rule from the deriva-tives of the RDMs, at the current parameter value of θ = θt .This allows to compute energy E(t ′ , θt ), gradient G(t ′ , θt ) and Hessian H(t ′ , θt ) for any value t ′ , without further evaluations of the PQC.The step size can be then chosen as the maximum ∆t such that the Hessian convexity m(t + ∆t, θt ) remains above some positive threshold value.Further research could quantify the improvement that adaptive step selection would bring to our algorithm, and identify a method to implement this for optimized energy derivatives sampling techniques, such as those using double factorization [65].

Potential applications
The algorithm we propose resolves the Berry phase along a given path C; the description of the loop is an input of the algorithm.This loop construction will depend on the details of the considered application, and might involve chemical intuition and the consideration symmetries of the molecule where present.In realistic applications, we conceive our algorithm as a tool that can help to (1) certify CIs proposed by other methods, (2) determine whether a CI plays a role in a certain reaction, and/or (3) locate a point of the CI manifold in parameter space.In either case, an initial proposal of a path C that might contain the CI is necessary.
The case 1 is the most direct application of our algorithm.The loop C is chosen to surround a quasidegeneracy previously identified by an approximate classical method.The result of our algorithm could then confirm or disprove the presence of the CI.In case 2, given a photochemical reaction whose geometry is approximately known, we can use our algorithm on a set of loops to understand wether a CI plays a role in the reaction.These loops can be constructed by variations of the reaction path along perpendicular coordinates, focusing on the modes that influence the orbitals involved in the reaction, thus greatly reducing the search space for the CI.Finally, to locate the CI (case 3) various search approaches can be considered.For example, starting from a large loop C that is known to contain the CI, binary-search can be used to iteratively shrink the loop and locate the CI to the desired precision.The considered loops could be defined on a plane, if there is heuristic information about the direction along which the potential energy surfaces split.In alternative, a mesh of orthogonal loops in a subspace of the nuclear configuration space R can be tested.This, in combination with the data about the ground state energy collected by running the optimization in our algorithm, could also be used to determine the approximate location of the minimal energy crossing point [i.e. the point R MECP on the CI manifold with minimum Further work is needed to explore these problems, develop procedures to solve them and define and test practical application cases in the three categories.

Outlook
The bounds presented in Section 4 have been calculated to provide a guarantee of convergence for our method, which is an atypical feature for variational quantum algorithms.These are not supposed to be tight bounds or resource estimates for a realistic application of our algorithm.Further research is needed to define better bounds.This, along with the choice of a specific method to extract the energy and its derivatives (e.g.RDM sampling and parameter shift rule) could allow estimating the cost of a practical application of this algorithm.Furthermore, a study of the errors due to the ansatz not perfectly reproducing the ground state, and those induced by circuit noise, could help to understand the practical limitations of the algorithm.
The computation of Berry phases is also central when characterizing topological phases of matter [72].For the specific case of non-interacting Hamiltonians with chiral or inversion symmetry, the winding number is analytically computed through the Zak phase [73], which is related to the Berry phase that is accumulated after a closed loop through the Brillouin Zone.For interacting systems, in which one cannot access momentum space, there is a mechanism in which one introduces an external periodic perturbation to the Hamiltonian [74,75].As long as the perturbation does not close the gap and respects the symmetries of the system, the Berry phase can be computed by considering a closed loop in parameter space, similar to what is proposed in this work.As an outlook, one could consider extending the VQE approach to detect topological phases of matter through the computation of the Zak phase.Furthermore, VQA approaches to other topological invariants, such as the Chern number [76] could be considered (possibly inspired by methods related to their experimental detection, such as Thouless pumping [77]).
Finally, classical algorithms to resolve conical intersections from Berry phases inspired by this approach could be designed.If the approximate ground state is represented by a variational classical ansatz that fixes the U (1) gauge, a simple extension of our method could be achievable.For example, this could be achieved with an extension of CASSCF (essentially a CASCI solver on top of an orbital optimization) that implements continuous local optimization of the SCF matrix, to allow enforcing the smoothness constrains that are crucial to keep the gauge of the state fixed.
We first define the 1-and 2-electron reduced density matrix (RDM) in the spin-restricted formalism Γ pqrs (θ) = ⟨ψ(θ)|e pqrs |ψ(θ)⟩, where E pq = σ c † pσ c qσ and e pqrs = στ c † pσ c † rτ c sτ c qσ = E pq E rs − δ qr E ps .Here, p, q, r, s are meant to be general indices (either occupied, active or virtual), where the state |ψ(θ)⟩ is to be intended as padded by two registers of qubits in the |0⟩ ⊗2N V (|1⟩ ⊗2N O ) state for the virtual (occupied) orbitals.In the molecular orbital basis defined by C [orbitals in Eq. (32)], we can write the Hamiltonian as where h pq and g pqrs are the one-and two-electron integrals (with spatial orbital indices p, q, r, s ordered according to the chemists' convention), and they implicitly depend on C through the MOs.The expectation value of the Hamiltonian can then be written as a contraction of the integrals with the RDM, where we made explicit the dependence on C.
To derive analytical orbital rotation derivatives, we closely follow Ref. [29].We start by separating the dependence on κ of the reparametrized cost function E(C • e −κ , θ), by using the equivalent state transformation formalism provided by Thouless theorem [78] where κ = pq κ pq E pq is the operator that generates a unitary on the Hilbert state space equivalent to the orbital rotation.We know that the rotations where p, q are both virtual indices form a redundant subgroup, so we can freeze the corresponding κ pq = 0; the same is true for p, q both core space indices.In other terms, κ pq = 0 if p, q ∈ V or p, q ∈ O, with V and O the sets of virtual and core indices.The remaining elements of κ pq satisfy κ pq = −κ qp .We define a vector of unique non-redundant orbital rotation parameters and we redefine the cost function with respect to this vector, We are interested in the derivative with respect to this vector; we can always switch from the matrix κ to the unraveled vector of unique non-redundant parameters κ, and vice versa.By comparing the Taylor series in κ with the Baker-Campbell-Hausdorff expansion: One can readily verify that the analytical orbital derivatives at κ pq = 0 are given by: where P pq,rs permutes the pair of indices pq with rs.The calculation of the commutators in Eq. ( 76) and (77) can be found in common quantum chemistry textbooks [29], and they all one-or two-body operators; thus their expectation value can be written as a linear form in the RDM (γ, Γ).The gradient evaluates to where F is the generalized Fock matrix, The Hessian evaluates to where we introduced Y pqrs = mn Γ pmrn g qmns + Γ pmnr g qmns + Γ prmn g qsmn (81) and dropped the explicit dependence on θ.For the composite Hessian, we simply take the gradient of Eq. ( 78) with respect to θ, by using the chain rule where Thus, once we have the derivatives of the 1-and 2-RDM to sufficient precision, we can evaluate the orbital gradient, Hessian and composite Hessian analytically, recovering all terms in Eq. (37) and Eq.(38) without any additional quantum cost.

D Bounding the cumulative error due to Newton-Raphson updates
In this section, we prove that using a single Newton-Raphson (NR) parameter update per ∆t-step is sufficient to achieve an error on the estimate of the minimizer scaling as O(∆t 2 ) after any number of ∆t-steps, as long as the cost function is strongly-convex at the minimum and ∆t is small enough.First, we recall sufficient conditions for quadratic convergence of the Newton-Raphson step.We then use these to bound the error of a single NR-step when the cost function is changed from E(t, θ) → E(t + ∆t, θ).We translate this into an upper bound on ∆t which guarantees the error stays bounded throughout the optimization path.Finally, we show that we can allow a sufficiently small additive error on the Newton-Raphson update and retain the bounded error throughout the path.Quadratic convergence of NR -Consider a cost function E(θ) with gradient G j = ∂E ∂θj and Hessian H jk = ∂E ∂θj ∂θ k , and an initial guess of a minimizer θ (0) .The Newton-Raphson step prescribes the update θ (0) → θ NR = θ (0) + dθ NR with dθ NR = H −1 (θ (0) )G(θ (0) ).Theorem 3.5 from Nocedal and Wright [44] gives sufficient conditions under which quadratic convergence of the NR update is guaranteed.We simplify these conditions, and obtain the following Given an initial guess θ (0) which is close enough to the minimum, i.e.
the NR update will converge quadratically towards the minimum with To prove this, we only need to show that a strong convexity condition is satisfied within a r-ball centered in θ * including all close-enough possible initial guesses (r = m 4L ), i.e.
Single NR update with a changing cost function -We now consider a family of cost functions E(t, θ) continuously parameterized by t.Suppose we have an approximation θt of the minimizer θ * and that the gradient of the change is never larger than Ġmax .(while the first assumption imposes the nontrivial condition of strong convexity at the minimum, the second is always granted for cost functions from a continuous family of bounded Hamiltonians).We can then write ∥θ (0) To ensure this initial guess is within the quadratic convergence region of the NR step, we require choosing an α, β ∈ (0, 1] this condition can be written as This allows to apply Theorem 1, and bound the error after a single NR step, Multiple steps -We want to ensure the error on the minimizer estimate remains bounded for t taking subsequent values is [0, ∆t, 2 ∆t, ..., 1], while taking a single NR step at a time.We can do this by imposing the error after each step [Eq.(99)] is not larger than the error on the previous step estimate, If we define γ by σ θ = γβm 4L , we can write which is saturated by β = 4 1−α (1+γ) 2 .We then get which is maximised (while keeping β ≤ 1) by α = max[ 1 2 , 1 − (1+γ) 2

E Bounding the sampling cost
We call σ 2 G and σ 2 H the variances of each element of the gradient and Hessian respectively, due to sampling noise.To compute the error on the parameter updates, we propagate these variances through the definition of the NR update Eq. (10).The first-order differential change (here denoted with δ) of the NR update dθ NR with respect to changes in the gradient and Hessian is where we use the ordered matrix-product notation, with vectors in boldface.When δG and δH are the random variables representing the errors on the gradient and Hessian, the expected mean square error on the NR update defining the norm of the covariance matrix where we used the zero-average property of δG and δH to drop the expectation values of mixed terms.This can further be bounded as Assuming the same variance σ 2 G on each of the n p elements G j of the gradient, we get As the Hessian is a random real symmetric matrix with i.i.d.elements, each with a variance σ H , we can invoke Wigner's semicircle law [45] to bound the spectral norm as Combining these with the strong convexity bound ∥H −1 ∥ ≤ m −1 , we get The calculations in Appendix D conclude that, for each ∆t-step, we can afford an additive error on the NR update of at most σ θ ≤ γ

Figure 1 :
Figure 1: (a) Three loops in the nuclear configuration space of formaldimine; C1 (green), C× (red) and C2 (blue).C× encircles a CI, resulting in a non-trivial Berry phase.In this representation, the loops are discretized by N = 25 points.The color plot indicates the energy gap at the full configuration interaction (FCI) level.(b) Energy and (c) change in PQC parameter around the three loops, with the same color coding as in (a) and the same N = 25.These results refer to a (2,2) active space, and a minimal (STO-3G) basis set description of Formaldimine, with a OO-UCCD ansatz that has a single θ parametrizing the only double excitation.The continuous lines show the true optimum θ * t and the relative energy (obtained by full optimization), while the markers show the progress of the estimate θt from Algorithm 2, in absence of sampling noise.The Hessian stays positive throughout the path, no regularization is needed.(e) Final overlap computed by Algorithm 2 for the red loop containing a CI, for a varying number of total discretization points N .For N < 9, the Hessian is not always positive and regularization is needed to invert the Hessian, but no backtracking is used.(d) Schematic representation of Formaldimine, indicating the parameters used to define the nuclear geometries in this work.

Figure 2 :
Figure 2: Energies throughout the loops in Fig. 1 (discretized by N = 25 steps), in the presence of sampling noise.Each element of the composite gradient Eq. (37) and Hessian Eq. (38) are perturbed by random gaussian noise with variance σ 2 = 5 × 10 −6 .In this instance the Hessian stays positive around the path, so no regularization is needed.The full lines, like in Fig. 1b, indicate the true optimum E(t, θ * t ) obtained by full optimization.

Figure 3 :
Figure 3: Success probability Psucces of the algorithm as a function of the number of discretization points N and the sampling noise variance σ 2 for the loop C× containing a CI.The details of the model and geometry of the loop C× match Fig. 1 (red loop).Success is defined by resolving a final overlap ⟨ψ( θ0)|ψ( θ1)⟩ < 0, which returns ΠC × = π.The success probability is computed over 100 simulated runs.A Psuccess ≈ 50% indicates the algorithm returns random outcomes.Regularization by Hessian augmentation is enabled in these calculations, while no back-tracking is used.

Figure 4 :
Figure 4: Energies throughout two loops, for a more challenging model of Formaldimine, described in the cc-pVDZ basis set and with a (4,4) active space.The (red) blue loop indicates a (non-)trivial Berry phase, centered around (α = 113 • ) α = 145 • and ϕ = 90 • (both), with a radius of 15 • , and discretized by N = 50 points (note these loops are different from Fig.1).The basis set is too large to run a FCI calculation to locate the CI, and using another approximate method (e.g.state-average CASSCF) might bias the CI location.Instead, we manually choose larger loop geometries and use Algorithm 2 to find the value of ΠC.

4 mL√ 2 − 1 .
with γ = Comparing this result to the variance just calculated, we can formulate the requirement on the variance m −2 n p σ t of E(t, θ), with error ∥ θt − θ * t ∥.In each step of our method (Algorithm 2), we shift the cost function E(t, θ) → E(t + ∆t, θ) by ∆t and we use the current minimizer estimate θt as initial guess for the next step; θ While the first term is the (given) error on the estimate, the second can be obtained by taking the total t-derivative of the minimum condition G(t, θ * (t)) = 0, and applying Taylor's theorem ∃τ ∈ [t, t + ∆t] : ∥θ * t+∆t − θ * t ∥ = We now assume that the convexity at the minimum is bounded from below throughout the whole t-path by a constant m ≥ 0, m(t, θ * t ) := ∥H −1 (t, θ * t )∥ −1 > m ∀t ∈ [0, 1], 2 G + σ 2 H ∥dθ NR ∥ 2 ≪ L 2 n p dθ NR 2 < γ 2 16 m 6 L 2 n p ∥ Ġmax ∥ 2 ∆t 2 (115)To recast this bound in terms of variables of the problem, we use the following relations derived in Appendix B: L = max∥T ∥ < n ∥H∥ (the norm of the third derivative tensor T is bounded by n Ġmax ∥∆t < √ n p ∥ dH dt ∥∆t ≈ √ n p ∥H∥ (same infinity norm bound).Furthermore, we assume the convexity is larger than the ground state gap, m > ∆; this holds if the ansatz approximates the ground state well enough, and changes in any ansatz parameter θ k introduce a different excited state.Substituting these relations we obtainThe number of total required shots to sample the Hessian (gradient) for all the N steps will thus scale as σ −2 H ∆t −1 (σ −2 G ∆t −1 ).Picking the maximal ∆t = m 2