Connecting geometry and performance of two-qubit parameterized quantum circuits

Parameterized quantum circuits (PQCs) are a central component of many variational quantum algorithms, yet there is a lack of understanding of how their parameterization impacts algorithm performance. We initiate this discussion by using principal bundles to geometrically characterize two-qubit PQCs. On the base manifold, we use the Mannoury-Fubini-Study metric to find a simple equation relating the Ricci scalar (geometry) and concurrence (entanglement). Calculating the Ricci scalar during a variational quantum eigensolver (VQE) optimization process offers a new perspective on how and why Quantum Natural Gradient outperforms standard gradient descent. We argue that the key to the Quantum Natural Gradient's superior performance is its ability to find regions of high negative curvature early in the optimization process. These regions of high negative curvature appear to be important in accelerating the optimization process.


Introduction
Parameterized quantum circuits (PQCs) are central components of variational quantum algorithms, a class of algorithms well-suited for implementation on near-term quantum devices. In these algorithms, parameters of PQCs are tuned or optimized, using a classical computer, to prepare quantum states on a quantum computer that encode solutions of a problem, e.g. the ground state of a quantum system or a target probability distribution [1,2]. While PQCs can be constructed leveraging physical insights, e.g. unitary coupled-cluster [3][4][5], in applications such as quantum machine learning or proof-of-principle experimental demonstrations of variational quantum algorithms, PQCs follow a more heuristic design. These PQCs comprise repeated layers of a particular low-depth configuration of single-qubit and two-qubit gate operations [6,7]. Despite the rapid developments in variational quantum algorithms, PQCs are neither well understood nor effectively designed. For instance, while "hardware-efficient" circuits may correspond to very low depths, it was shown that many of the parameters (and their corresponding gates) are often unnecessary or redundant [8][9][10]. In our work, to develop a more concrete understanding of the role of parameters in PQCs, we begin with a mathematical description of PQCs.
Formally, parameterized quantum circuits are maps Ψ(θ) between a set of continuous parameters θ and the output statistics of a set of observables on a given system. In recent years, significant progress has been made to better understand PQCs and their connection to algorithm performance, as illustrated in Fig. 1. For instance, Ref. [11] introduced the phenomenon of "barren plateaus", or regions of non-informative gradients in cost function landscapes, resulting from employing PQCs that form approximate 2-designs. Several works extended this work, connecting particular circuit structures to cost function landscape features such as barren plateaus and narrow gorges [12] that hinder algorithm performance. For a better understanding and evaluation of PQCs, Ref. [13] introduced "expressibility" as a quantity to compare among PQCs and rule out circuits with limited capabilities. Since then, works such as Ref. [14] have started correlating expressibility to performance metrics of particular variational quantum algorithms. Additionally, Ref. [15] connected high expressibility to the presence of barren plateaus in cost function landscapes.

Figure 1: Progress on the study of parameterized quantum circuits (PQCs) and their connections to algorithm performance. Past works have investigated how the use of particular circuits leads to features in the cost function landscape that hinder algorithm performance (e.g. barren plateaus). Other works have introduced circuit metrics such as expressibility and started correlating them to algorithm performance. Our work, while limited to two qubits, initiates the discussion to connect the geometry of PQCs to algorithm performance and provides valuable insights into why and how Quantum Natural Gradient accelerates the optimization step of near-term quantum algorithms. (Diagram labels: cost function landscape, expressibility, geometry (of two qubits), and algorithm performance; the asterisk marks our work.)
Among past works connecting PQCs to algorithm performance, Ref. [16] introduced and investigated the use of Quantum Natural Gradient (QNG) descent to accelerate the optimization of parameters, making connections to imaginary time evolution. This modification of gradient descent involves the inverse of a metric that is one fourth of the Quantum Fisher Information. The Fubini-Study metric tensor employed in QNG does not contain information about the objective function (unlike gradients or the Hessian) and yet has been shown to significantly accelerate optimization by accounting for the geometry of the wave function. Since its discovery, the QNG has been a subject of additional study. For example, while it might suffer from the same barren plateau problem [11], there is numerical evidence that, at least for shallow circuits, the variance of the cost function is orders of magnitude larger than when vanilla gradient descent is used [17]. In [18], methods that use the QNG and adaptively choose learning rates derived from the QNG were shown to outperform traditional methods like ADAM and LBFGS in learning quantum states. These performance gains come with the overhead cost of calculating the QNG, but work has been done showing that for quantum simulation there exist efficient algorithms to calculate it [19], and that even in the setting where one is required to calculate it from measurements coming from quantum hardware, the overall cost is asymptotically negligible in the number of iterations and qubits [20]. The QNG has also been extended to the non-unitary case where some depolarizing noise is allowed in the circuit [21]. To gain a deeper understanding of the connection between the geometry of the wave function and algorithm performance, we initiate the discussion by providing a geometric characterization of two-qubit PQCs.
The structure of the paper is as follows: (i) In Sec. 2, we introduce the geometrical formalism we shall use to analyze PQCs. We consider four specific two-qubit PQCs and take some time to carefully characterize the circuits geometrically.
(ii) In Sec. 3, we introduce the notions of concurrence and the scalar curvature and uncover a simple and remarkable relationship between the two concepts.
(iii) In Sec. 4, we capitalize on the relationship between concurrence and curvature in order to provide geometrical insights to the VQE optimization process in the PQCs considered in this work.
(iv) In Sec. 5, we summarize our work, providing future directions of research and open questions.

Geometry, quantum mechanics, and parameterized quantum circuits
A geometric reformulation of quantum mechanics has largely been a mathematical and theoretical pursuit, while the algebraic view of quantum mechanics appears to be more practical for the everyday quantum mechanician. The question that may arise is whether the geometric approach remains a mere mathematical happenstance or can be leveraged in more practical day-to-day settings. For completeness, we provide a quick introduction to the geometric structure of quantum mechanics and lay bare the complications that arise from the projective nature of quantum states. The discussion should explain the geometric connection more clearly, along with what goes wrong if it is not fully incorporated in one's understanding of parameterized quantum circuits. The Kähler structure of quantum mechanics furnishes us with a Riemannian metric and a symplectic structure, encompassed in what is called the Quantum Geometric Tensor (for more in-depth pedagogical introductions to this large and rich area, we refer the readers to [22,23]).

Quick introduction to fiber bundles
Most topological spaces have a complicated geometry for which there is most likely no easy intuitive picture. One way of addressing this problem is to think instead about the local geometric structure of the complicated topological space. The local geometry will be thought of as simply a Cartesian product of lower-dimensional geometries. For the rest of the paper, we assume that the complicated topological space E can be endowed with a manifold structure. The idea of a fiber bundle is to imagine the geometry of E as locally being a Cartesian product of two lower-dimensional manifolds B and F. If the total space E can in fact be thought of as merely B × F, then the fiber bundle is called a trivial bundle. Otherwise, in general, the local Cartesian structure is patched together with some twisting that makes the global geometry look very different from the local geometry. The local picture of E can be thought of as a base space B (which we reach by a projection map π from the total space E) with a second geometry, called the fiber F, attached at each point of the base space. We denote the fiber bundle as
$$F \hookrightarrow E \xrightarrow{\pi} B.$$
For quantum mechanics, we are interested in fiber bundles on which a group G acts. How does the group act on our fiber bundle? For all g ∈ G and for any point p in the fiber over b ∈ B, pg is merely another point in the fiber attached at the base point b. We demand a continuous right action of the group, so that G is in fact a Lie group, and that this action is free and transitive, i.e., respectively, that only the identity element preserves a point in the fiber F and that for any two points f, f′ in the fiber there is a unique group element g such that f′ = fg. What this amounts to is that the group G is in fact isomorphic to the fiber F. When this happens, the fiber bundle is called a principal fiber bundle. This principal fiber bundle is then denoted as
$$G \hookrightarrow E \xrightarrow{\pi} B.$$
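As a concrete instance of the definitions above, the following minimal sketch uses the simplest quantum example, a single qubit: the total space is the unit sphere of normalized amplitudes, the group is U(1), and the base is the Bloch sphere. The function names `bloch_point` and `act_with_phase` are illustrative choices and do not come from the paper.

```python
import cmath
import math

def bloch_point(a, b):
    """Project a normalized qubit state a|0> + b|1> to its base point on S^2."""
    overlap = a.conjugate() * b
    return (2 * overlap.real, 2 * overlap.imag, abs(a) ** 2 - abs(b) ** 2)

def act_with_phase(a, b, phi):
    """Right action of the U(1) element exp(i*phi) on a point of the total space."""
    g = cmath.exp(1j * phi)
    return a * g, b * g

# An arbitrary normalized state and the same state moved along its fiber.
a, b = math.cos(0.3), cmath.exp(0.7j) * math.sin(0.3)
base_before = bloch_point(a, b)
base_after = bloch_point(*act_with_phase(a, b, 1.234))
```

Moving along the fiber changes the amplitudes but leaves the projected base point unchanged, which is the sense in which the fiber carries no physical information for a single qubit.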

Quantum mechanics and geometry
We are often taught early that global phases do not matter. While this is usually papered over since these phases have no physical consequences, it has major consequences from our point of view. Consider a wave function in a Hilbert space, $|\psi\rangle \in \mathcal{H}$. The fact that the global phase does not physically matter means that we are instead considering equivalence classes denoted as $[|\psi\rangle]$. This means that the physical space is in fact a projectivized Hilbert space $P(\mathcal{H})$. Note that we can also think of this as a principal fiber bundle where the group action is given by $U(1)$, so that the principal fiber bundle we have is
$$U(1) \hookrightarrow S(\mathcal{H}) \xrightarrow{\pi} P(\mathcal{H}),$$
where $S(\mathcal{H})$ denotes the normalized states in $\mathcal{H}$. On the way to introducing a metric on the Hilbert space $\mathcal{H}$, we recall that we have a Hermitian inner product,
$$h(\phi, \psi) = \langle\phi|\psi\rangle = G(\phi, \psi) + i F(\phi, \psi), \qquad (4)$$
where $G(\phi, \psi) = \mathrm{Re}(\langle\phi|\psi\rangle)$ and $F(\phi, \psi) = \mathrm{Im}(\langle\phi|\psi\rangle)$. The real part is what produces a Riemannian metric and the imaginary part is what gives us a symplectic structure. To arrive at the metric we need to consider the tangent space of $P(\mathcal{H})$. The tangent space at a point of $P(\mathcal{H})$, $T_{[|\psi\rangle]}P(\mathcal{H})$, is the quotient vector space $T_{|\psi\rangle}\mathcal{H}/\!\sim$, where the equivalence is $|\phi_1\rangle \sim |\phi_2\rangle$ whenever $|\phi_1\rangle - |\phi_2\rangle = a|\psi\rangle$ with $a \in \mathbb{C}$. By choosing the orthogonal complement to the space spanned by $|\psi\rangle$, i.e. $\langle\psi|\phi\rangle = 0$, we remove the ambiguity in the tangent space. To ensure that this condition is always met, we use the projection to the orthogonal complement (for normalized $|\psi\rangle$):
$$|\phi_\perp\rangle = |\phi\rangle - \langle\psi|\phi\rangle\,|\psi\rangle.$$
On the tangent space, remembering that $T_{|\psi\rangle}\mathcal{H} \cong \mathcal{H}$, we see that Eq. (4) may be written in the form
$$h(\phi_{1\perp}, \phi_{2\perp}) = \langle\phi_1|\phi_2\rangle - \langle\phi_1|\psi\rangle\langle\psi|\phi_2\rangle. \qquad (6)$$
This is in fact what is called the Quantum Geometric Tensor in the literature. To see this, consider the wave function $|\psi\rangle$, go to the tangent space with basis $\{|\partial_i\psi\rangle\}$, and plug this into (6) to get
$$Q_{ij} = \langle\partial_i\psi|\partial_j\psi\rangle - \langle\partial_i\psi|\psi\rangle\langle\psi|\partial_j\psi\rangle.$$
Remark: Note the intrinsic nature of the above derivation of the Quantum Geometric Tensor. Usually it is arrived at by considering the transition between a state $|\psi_u\rangle$ and an infinitesimally close-by state $|\psi_{u+\delta u}\rangle$ and then imposing the 'gauge invariance' of quantum mechanics [24], or by noting that the squared differential, $ds^2 = g_{ij}\,dx^i dx^j$, can in quantum mechanics be written as $ds^2 \propto (1 - F^2)$, where $F = |\langle\phi|\psi\rangle|$ [16], to lowest order in the parameters that appear in the wave function. The latter derivations, [24] and [16], while correct, obscure the tensor's intimate origins in the projective nature of quantum mechanics. The real part of the Quantum Geometric Tensor in (6) is the Fubini-Study metric.
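To make the construction above concrete, the sketch below evaluates the Quantum Geometric Tensor numerically for a normalized state, using central finite differences for the tangent vectors. The single-qubit test state, the function names, and the step size `eps` are assumptions made for illustration; they are not taken from the paper.

```python
import cmath
import math

def qubit_state(params):
    """Hypothetical test state cos(t/2)|0> + exp(i*p) sin(t/2)|1>."""
    t, p = params
    return [math.cos(t / 2), cmath.exp(1j * p) * math.sin(t / 2)]

def inner(u, v):
    """Hermitian inner product <u|v>, antilinear in the first slot."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def tangent(state_fn, params, i, eps=1e-6):
    """Central finite-difference estimate of the tangent vector |d_i psi>."""
    up, down = list(params), list(params)
    up[i] += eps
    down[i] -= eps
    return [(x - y) / (2 * eps) for x, y in zip(state_fn(up), state_fn(down))]

def quantum_geometric_tensor(state_fn, params):
    """Q_ij = <d_i psi|d_j psi> - <d_i psi|psi><psi|d_j psi> for normalized psi."""
    psi = state_fn(params)
    d = [tangent(state_fn, params, i) for i in range(len(params))]
    return [[inner(di, dj) - inner(di, psi) * inner(psi, dj) for dj in d]
            for di in d]

Q = quantum_geometric_tensor(qubit_state, [1.0, 0.5])
g = [[q.real for q in row] for row in Q]  # real part: Fubini-Study metric
```

For this test state the real part reproduces the round metric on the Bloch sphere, with components 1/4 and sin²(t)/4 on the diagonal, which gives a quick sanity check of the projection term.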

Fiber bundles and PQCs
A bit of care must be taken in applying the information-geometric view to PQCs. To see why, we carefully define what a PQC is. To do this we follow Ref. [25] and define a variational ansatz or PQC as follows:
Definition 1. A parameterized quantum circuit is the image of the following map:
$$\Psi : \mathbb{R}^m \to P(\mathcal{H}), \qquad \theta \mapsto [|\psi(\theta)\rangle].$$
To get at the metric we go to the tangent space and, as a consequence, define the following push-forward map:
$$\Psi_* : T_p\mathbb{R}^m \to T_{\Psi(p)}P(\mathcal{H}), \qquad v \mapsto v^i\,|\partial_{\theta^i}\psi\rangle,$$
where $v = v^i \partial_{\theta^i}$. This allows us to define the metric on $P(\mathcal{H})$ using our points in $\mathbb{R}^m$, and this is done by using the pull-back by Ψ, back to $T_p\mathbb{R}^m$, i.e.
$$(\Psi^* g)_p(u, v) = g_{\Psi(p)}(\Psi_* u, \Psi_* v),$$
where we define the value of the metric as
$$g_{ij} = \mathrm{Re}\left[\langle\partial_i\psi|\partial_j\psi\rangle - \langle\partial_i\psi|\psi\rangle\langle\psi|\partial_j\psi\rangle\right]. \qquad (11)$$
Here, we have suppressed the dependence on θ for ease of notation; we have implicitly chosen a point on a ray in evaluating the value of the metric. Now the map Ψ need not be injective, so the metric in Eq. (11) is, in general, degenerate. This is in fact what happens with the circuits we study in this work. As a consequence, no unique inverse metric exists. The problem derives from the fact that the parameters in the wave function are not chosen with the projective nature of quantum mechanics in mind, and thus when we ask physical questions, such as expectation values of Hermitian operators, we will in fact live on some constrained surface in $P(\mathcal{H})$; in other words, our tangent space at some point will in general be of smaller dimension than $T_p\mathbb{R}^m$.
Naively, we have the following Quantum Geometric Tensor for our PQC:
$$Q_{ij}(\theta) = \langle\partial_i\psi(\theta)|\partial_j\psi(\theta)\rangle - \langle\partial_i\psi(\theta)|\psi(\theta)\rangle\langle\psi(\theta)|\partial_j\psi(\theta)\rangle. \qquad (12)$$
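The degeneracy described above can be seen in a deliberately small toy case. The sketch below assumes a hypothetical single-qubit circuit RY(θ2)RY(θ1)|0⟩, which is not one of the four circuits studied in the paper; because both parameters enter only through their sum, the map from parameters to states is not injective and the pulled-back metric is singular.

```python
import math

def metric_two_ry(theta1, theta2):
    """Fubini-Study metric of RY(theta2) RY(theta1)|0> (hypothetical toy circuit)."""
    half = (theta1 + theta2) / 2
    psi = [math.cos(half), math.sin(half)]
    # Both parameters enter only through their sum, so the two tangent
    # vectors are identical.
    dpsi = [-0.5 * math.sin(half), 0.5 * math.cos(half)]
    tangents = [dpsi, dpsi]

    def inner(u, v):
        return sum(x * y for x, y in zip(u, v))  # amplitudes are real here

    return [[inner(u, v) - inner(u, psi) * inner(psi, v) for v in tangents]
            for u in tangents]

g = metric_two_ry(0.4, 1.1)
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]  # vanishes: no unique inverse
```

Every entry of the resulting 2×2 metric equals 1/4, so the determinant vanishes and only the combination θ1 + θ2 corresponds to a direction in the projective Hilbert space.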

Two-qubit parameterized quantum circuits and geometry
In this section we study how the PQCs fit in the geometry of P(H) and find the constrained surfaces they live on.This exercise elucidates what part of the geometry the PQCs have access to.
There will be four major circuits we shall consider in studying the information-geometric view of PQCs, namely: 1. the Hardware-Efficient Ansatz (HEA) introduced by [7] and studied in the context of the Fubini-Study metric by [26], 2. the Low Depth Circuit Ansatz (LDCA) introduced in [27], 3. an ansatz introduced in the context of quantum generative adversarial networks (QGANs) [28], and 4. an ansatz used in [29] composed of gates native to the Sycamore chip [30], e.g. the fermionic simulation (fSim) operation. We will refer to this ansatz as the Sycamore HEA (sHEA).
In general, we have the following isomorphisms:
$$P(\mathcal{H}_{n+1}) \cong \mathbb{CP}^n \cong G_{1,n+1}, \qquad (13)$$
where $P(\mathcal{H}_{n+1})$ is the projectivized version of an (n+1)-dimensional Hilbert space, $\mathbb{CP}^n$ is the complex projective space, and $G_{1,n+1}$ is the Grassmannian of dimension n. For two qubits, the string of isomorphisms in Eq. (13) becomes specifically
$$P(\mathcal{H}_4) \cong \mathbb{CP}^3 \cong G_{1,4}.$$
Luckily, for two and incidentally for three qubits, we have a fiber bundle picture, namely the Hopf fibration. For two qubits, the Hopf fibration can be thought of as an SU(2) principal fiber bundle. We now concentrate on $S^7$, the unit sphere of normalized two-qubit states, and think of it as having a Hopf fibration. It is a well-known fact that the fibration has the following character:
$$S^3 \hookrightarrow S^7 \xrightarrow{\pi} S^4.$$
The fiber $S^3$ is isomorphic to SU(2). From a quantum mechanical point of view, $S^4$, which can locally be split into two spheres, i.e. $S^2 \times S^2$, can be given the following interpretation: the first sphere is a quasi-Bloch sphere representing the degrees of freedom one observer has access to for a given two-qubit entangled state, and the second sphere parameterizes the amount of entanglement shared by the two qubits [31].
For each of the major circuits, we calculate the version of the metric that is in general degenerate, a metric that is specifically tailored to the geometry of $S^7$, and provide explicit parametrizations of how the two-qubit circuits sit inside the geometry.
The Hopf base and fiber parameters are calculated as follows [31,32]. For a two-qubit state $|\psi\rangle = \alpha|00\rangle + \beta|01\rangle + \gamma|10\rangle + \delta|11\rangle$, where $\alpha, \beta, \gamma, \delta \in \mathbb{C}$, compute the following:
1. First, calculate the $S^4$ Hopf base parameters $(\theta_A, \phi_A)$ and $(\chi, \xi)$ and the equivalent Cartesian coordinates $(x_0, x_1, x_2, x_3, x_4)$.
In calculating how the PQCs fit inside the fibers using intrinsic co-ordinates, one needs to solve highly non-linear trigonometric equations. We simplify our task by switching to the constrained extrinsic co-ordinates. We take advantage of the following parameterization equivalence: where q is a unit quaternion used to parameterize the fiber and $e^{t\phi_A}$ is a unit quaternion used to parameterize a point on the entanglement sphere.
2. Next, calculate the $S^3$ Hopf fiber parameters, which are represented by the quaternion q.
We compute the above Hopf base and fiber parameters for the circuit blocks in Fig. 3 and present our results in Appendix B.
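The procedure above can be illustrated with a short sketch of the projection from $S^7$ to five Cartesian coordinates on $S^4$. The coordinate convention used here (three coordinates from the reduced state of the first qubit, two from the combination αδ − βγ) is an assumption in the spirit of the cited constructions, and the ordering and signs may differ from the paper's own expressions. The helper `rotate_qubit_b` is likewise a hypothetical name; it applies an SU(2) rotation to the second qubit, which should move the state along the $S^3$ fiber without changing the base point.

```python
import cmath
import math

def hopf_base(alpha, beta, gamma, delta):
    """Map a normalized two-qubit state to five Cartesian coordinates on S^4.

    Assumed convention: (x0, x1, x2) come from the reduced state of qubit A,
    and (x3, x4) from the combination alpha*delta - beta*gamma.
    """
    x0 = abs(alpha) ** 2 + abs(beta) ** 2 - abs(gamma) ** 2 - abs(delta) ** 2
    c1 = 2 * (alpha.conjugate() * gamma + beta.conjugate() * delta)
    c2 = 2 * (alpha * delta - beta * gamma)
    return (x0, c1.real, c1.imag, c2.real, c2.imag)

def rotate_qubit_b(state, a, b):
    """Apply the SU(2) matrix [[a, -conj(b)], [b, conj(a)]] to the second qubit."""
    alpha, beta, gamma, delta = state
    u00, u01, u10, u11 = a, -b.conjugate(), b, a.conjugate()
    return (u00 * alpha + u01 * beta, u10 * alpha + u11 * beta,
            u00 * gamma + u01 * delta, u10 * gamma + u11 * delta)

# An arbitrary normalized two-qubit state (amplitudes of |00>, |01>, |10>, |11>).
raw = (0.3 + 0.1j, 0.5 + 0j, -0.2j, 0.7 + 0j)
norm = math.sqrt(sum(abs(z) ** 2 for z in raw))
state = tuple(z / norm for z in raw)

# SU(2) element with |a|^2 + |b|^2 = 1 acting on qubit B only.
a = math.cos(0.4) * cmath.exp(0.9j)
b = math.sin(0.4) * cmath.exp(-0.2j)
rotated = rotate_qubit_b(state, a, b)

base_original = hopf_base(*state)
base_rotated = hopf_base(*rotated)
```

In this convention the five coordinates have unit norm, the last two have combined length equal to the concurrence, and the base point is unchanged by the local rotation, which matches the interpretation of the fiber as the part of the state that carries no entanglement information.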

Local geometric expressibility
The notion of expressibility for PQCs has been explored in [13]. By thinking of the Hilbert space from the geometric point of view, namely as $S^7$, we can arrive at a picture of local geometric expressibility. From this perspective we ask which points in $S^7$ our PQCs can reach, and which constrained surfaces they live on in terms of some unconstrained co-ordinates. As discussed earlier, the local degrees of freedom accessible to an observer are parameterized by one of the spheres, with intrinsic coordinates $(\theta_A, \phi_A)$ on the base manifold, while the entanglement properties are parameterized by the second sphere, with intrinsic coordinates $(\chi, \xi)$. The sHEA circuit covers the largest surface area of points, while both HEA and LDCA cover points on a two-sphere. Interestingly, although both HEA and LDCA cover points constrained to live on $S^2$, LDCA uses one of its co-ordinates to explore the entanglement sphere. This, in principle, allows LDCA to explore more sub-manifolds of different entanglement in $S^4$.
Next we consider how the different circuits differ in the fiber space. Points in the fiber space appear, from one observer's point of view, as a difference in the gauge, i.e. they do not change the entanglement properties of the quantum state (this can be made more explicit by introducing a connection). To help abstract out the minutiae, we first re-express Eqs. (65), (79), and (92), which represent the parameterization in the fiber space, in a way that makes physical interpretation easier. For a fixed fiber, these functions will in general depend on a subset of the parameters in the quantum circuit as one moves in that fiber. Using the algebraic equivalence between quaternions and the Lie algebra of SU(2), namely: We re-write (65), (79), (92) respectively as follows: In the equations above, the only $\theta_i$ shown are those that can vary in a specific fiber over a point in the base manifold. Expressions for a through g, m, n, and s can be found in Appendix B.

Quantum Natural Gradient descent
We have discussed the possibility that the metric obtained by directly applying Eq. (12) is generally degenerate. In this section, we calculate the "Fubini-Study" metrics (by following Eq. (12)) and see that most are indeed degenerate. Although this is the case, it has been shown in [16] that these degenerate matrices, in practice, lead to improvements in the number of iterations for the optimization process, through approximating to a block-diagonal or diagonal form, adding numerically small values to make the matrices invertible, and/or taking the pseudoinverse. A more recent study, however, notes that (block-)diagonal approximations to the metric tensor may not be necessary, as the cost of computing the metric tensor is asymptotically negligible [20]. Using the standard gradient descent method, the parameter update rule for the parameters in the quantum circuit is
$$\theta_{t+1} = \theta_t - \eta\,\nabla L(\theta_t),$$
where η is the step size or learning rate and L(θ) is the objective function to be minimized.
For the four circuit types considered, we have the following Fubini-Study metrics, obtained by naively applying Eq. (12). Matrix elements that are included in the block-diagonal approximation to the metric tensor are in blue.
where for $g_{\mathrm{sHEA}}$ we have
Note: The notation $\langle O \rangle_i$ denotes the expectation value of the operator $O$ with respect to the wave function at the $i$-th layer of the circuit, while ${}_j\langle O \rangle_i = \langle\psi_j| O |\psi_i\rangle$ and $|\psi_i\rangle$ is the wave function at the $i$-th layer.
Of the four metrics, the ones for the QGAN circuit and sHEA are non-degenerate. From the previous section, we can see why this is true: for the QGAN circuit we have 3 independent parameters in the base space and 3 parameters in the fiber space, but we also have a global constraint that the squared magnitudes of all the amplitudes must add to 1, so that we have a total of 5 independent global parameters. This matches the number of parameters in the quantum circuit. The same can be seen with the sHEA geometry, where we get an extra degree of independence from the base space.
In practice, g may be singular due to redundant parameterization, or inverting g may become computationally challenging with an increasing number of parameters. Thus, in the case of natural gradient descent for training classical neural networks with many parameters, the inverse of the Fisher Information Matrix (FIM) has been approximated as the inverse of a block-diagonal matrix (and further approximated as Kronecker products of inverses of smaller matrices) to reduce the computational overhead [33]. The FIMs of deep linear networks are also often singular due to parameter redundancies. Thus, generalized inverses are used. Ref. [34] showed that for deep linear networks, any choice of generalized inverse was effective in accelerating natural gradient descent.
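The difference between the two update rules discussed in this section can be sketched as follows. The natural-gradient step is written in its standard form, in which the gradient is multiplied by the inverse metric before the step is taken; the restriction to two parameters, the function names, and the small regularization constant `reg` (one of the remedies for singular metrics mentioned above) are assumptions made to keep the example short.

```python
def gradient_descent_step(theta, grad, eta):
    """Standard update: theta - eta * grad."""
    return [t - eta * g for t, g in zip(theta, grad)]

def qng_step(theta, grad, metric, eta, reg=1e-8):
    """Two-parameter natural-gradient step using (metric + reg*I)^(-1)."""
    a, b = metric[0][0] + reg, metric[0][1]
    c, d = metric[1][0], metric[1][1] + reg
    det = a * d - b * c
    # Solve (metric + reg*I) x = grad by Cramer's rule.
    x0 = (d * grad[0] - b * grad[1]) / det
    x1 = (a * grad[1] - c * grad[0]) / det
    return [theta[0] - eta * x0, theta[1] - eta * x1]

theta, grad = [0.0, 0.0], [1.0, 2.0]
plain = gradient_descent_step(theta, grad, eta=0.05)
natural = qng_step(theta, grad, [[0.25, 0.0], [0.0, 0.25]], eta=0.05, reg=0.0)
# A singular metric only becomes usable once the regularization is switched on.
regularized = qng_step(theta, grad, [[0.25, 0.25], [0.25, 0.25]], eta=0.05, reg=1e-3)
```

With a metric of 1/4 times the identity, the natural-gradient step is simply four times longer than the plain step; for a genuinely singular metric the result depends on the chosen regularization, which is the practical cost of working with a degenerate tensor.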

Concurrence and the Ricci Scalar
Unlike its classical analogue, the Quantum Natural Gradient has no obvious connection to the geometry encountered in the optimization process. One might have considered calculating the curvature from the Fubini-Study metric derived from the real part of the Quantum Geometric Tensor, but, as has been previously argued, one does not get a legitimate metric since the metric used in Quantum Natural Gradient is in general degenerate. As a consequence, one could consider regularizing the metric either by taking the pseudo-inverse or by adding a small constant multiplied by the identity matrix ($\epsilon I$), a kind of Tikhonov regularization. The downside to this path is that the value of the curvature depends on the regularization procedure; in fact, for the Tikhonov regularization the curvature is in some cases ill-defined, since the value at a point depends on how this small constant is brought to zero. Nevertheless, in practice, carefully setting the regularization has been shown to reduce the measurement costs of QNG [20].
What we opt for, therefore, is a rather indirect way of seeing the Quantum Natural Gradient geometrically at work. We will calculate another metric, a quaternionic Fubini-Study metric that can be placed on the base manifold of two qubits, $S^4$. Amazingly enough, this metric connects the concurrence (entanglement) of the quantum circuit and the curvature of the base manifold.

Concurrence
In the case of two qubits, the entanglement entropy is the unique measure of entanglement. In work by Hill and Wootters [35], it was noted that the entanglement entropy can be thought of as a function of the concurrence, which is defined in the following manner:
$$C = 2\,|\alpha\delta - \beta\gamma|,$$
where $|\psi\rangle = \alpha|00\rangle + \beta|01\rangle + \gamma|10\rangle + \delta|11\rangle$. The concurrence has been measured in experiments [36,37]. Extensions to mixed density matrices can be considered by calculating the convex roof extension, but in this work we stick to pure states when calculating the measure. We can also formulate the notion of concurrence within the geometric picture thus far outlined. First note that the concurrence can be written in terms of the extrinsic co-ordinates of $S^4$ as: The concurrences for the four circuits in terms of their corresponding circuit parameters are: $\ldots + \sin(\theta_3) - \sin(\theta_1)\sin(\theta_2)\sin\ldots$
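The pure-state definition above, together with the Hill-Wootters relation between concurrence and entanglement entropy, can be sketched in a few lines. The entropy formula used here is the standard binary-entropy expression evaluated at (1 + √(1 − C²))/2; the function names are illustrative and not taken from the paper.

```python
import math

def concurrence(alpha, beta, gamma, delta):
    """Pure-state concurrence of alpha|00> + beta|01> + gamma|10> + delta|11>."""
    return 2 * abs(alpha * delta - beta * gamma)

def entanglement_entropy(c):
    """Entanglement entropy (in bits) as a function of the concurrence c."""
    c = min(max(c, 0.0), 1.0)  # guard against rounding slightly outside [0, 1]
    x = (1 + math.sqrt(1 - c * c)) / 2
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

s = 2 ** -0.5
bell_concurrence = concurrence(s, 0, 0, s)        # maximally entangled
product_concurrence = concurrence(1, 0, 0, 0)     # unentangled
# A state supported on |01> and |10> with weights 0.47 and 0.53.
near_bell = concurrence(0, math.sqrt(0.47), math.sqrt(0.53), 0)
```

A state supported on |01⟩ and |10⟩ with nearly equal weights therefore has a concurrence close to one, even though it is not exactly a Bell state.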

Ricci Scalar
In Riemannian geometry, the scalar curvature, also known as the Ricci scalar, is an invariant that characterizes the curvature of a Riemannian manifold. It may be defined in terms of the metric tensor g as follows: Let $g_{ab}$ denote the components of g, and let $g^{ab}$ denote the components of its inverse $g^{-1}$. The scalar curvature is defined as
$$R = g^{ab}\left(\Gamma^c_{ab,c} - \Gamma^c_{ac,b} + \Gamma^d_{ab}\Gamma^c_{cd} - \Gamma^d_{ac}\Gamma^c_{bd}\right), \qquad (35)$$
where
$$\Gamma^c_{ab} = \tfrac{1}{2}\, g^{cd}\left(g_{da,b} + g_{db,a} - g_{ab,d}\right)$$
are the Christoffel symbols of the second kind. In the equations above, note that we have used Einstein's summation convention and that the commas in the subscripts indicate a partial derivative: for example, $\Gamma^c_{ab,d} = \partial_d \Gamma^c_{ab}$. For more details on the scalar curvature, we refer the reader to [38].
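As a short worked example of the definition above, consider the round two-sphere of radius r. This example is a standard textbook case added for illustration and is not one of the circuit metrics of the paper.

```latex
% Round 2-sphere of radius r with coordinates (\theta, \phi):
g_{ab} = \begin{pmatrix} r^2 & 0 \\ 0 & r^2 \sin^2\theta \end{pmatrix},
\qquad
g^{ab} = \begin{pmatrix} r^{-2} & 0 \\ 0 & (r \sin\theta)^{-2} \end{pmatrix}.

% Non-vanishing Christoffel symbols:
\Gamma^{\theta}_{\phi\phi} = -\sin\theta\cos\theta,
\qquad
\Gamma^{\phi}_{\theta\phi} = \Gamma^{\phi}_{\phi\theta} = \cot\theta.

% Contracting the bracket in the definition of R index by index gives
R_{\theta\theta} = 1, \qquad R_{\phi\phi} = \sin^2\theta,

% so that
R = g^{\theta\theta} R_{\theta\theta} + g^{\phi\phi} R_{\phi\phi}
  = \frac{1}{r^2} + \frac{1}{r^2} = \frac{2}{r^2}.
```

The curvature is positive and constant, which provides a reference point for the sign conventions: regions of negative scalar curvature are those where the geometry is locally saddle-like rather than sphere-like.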
Apart from its importance in differential geometry and in cosmology, the Ricci scalar is slowly finding applications in hitherto unexpected places, for example in the study of phase transitions in quantum many-body systems [39][40][41]. In classical machine learning, a formulation of neural nets in the context of Riemannian geometry has been explored [42], while specific uses of Riemannian curvature have been explored in [43,44].
Using the isomorphism $\mathbb{HP}^1 \cong S^4$, where $\mathbb{HP}^1$ is the quaternionic projective space, we can incorporate the concurrence as part of the calculation of the Mannoury-Fubini-Study metric [32] as follows: where $w = |C|e^{i\chi}$ (defined in Sec. 2.3), and $0 < \Phi \le 2\pi$, $0 < \Theta \le \pi$ are the usual polar co-ordinates for $B^3$ (the three-dimensional ball). We may make the amusing observation that this is the same metric as the Euclidean Schwarzschild metric for fixed time.
The metrics for the four circuits are therefore where: for the QGAN ansatz and where for the sHEA ansatz.
The scalar curvatures $R_A$ (see Eq. (35)) of all the PQC metrics $g_A$ (Eqs. (38)-(41)), for A = HEA, LDCA, QGAN, sHEA, are calculated to be: Note that the Ricci scalar can be calculated by just knowing the metric. Since the metric in our case is just a function of the concurrence of a general two-qubit circuit, Eq. (45) is completely general for the two-qubit case. We note that the formula is a simple rational function of the concurrence, and from the formula we can see that singularities on the base manifold correspond to maximally entangled states with a concurrence value of one.
The scalar curvatures of each of the circuits, written in terms of their circuit parameters, are: These scalar curvatures are additionally plotted in Fig. 5. For HEA and LDCA, the Ricci scalars are each a function of two circuit parameters, i.e. only two parameters influence the entanglement in the system. On the other hand, the Ricci scalars of QGAN and sHEA are functions of three and four parameters, respectively. To visualize how parameter values impact the curvature, values of $\theta_1$ and $\theta_2$ are scanned over the range $[0, 2\pi]$ while values of other parameters, if they appear in the expression of the Ricci scalar, are fixed at particular values.
With QGAN, by tuning the value of $\theta_5$ from 0 to $\pi/2$, we observe a gradual emergence of "wells" of negative curvature. With sHEA, the landscape depends significantly on the values of $\theta_3$ and $\theta_4$. For instance, when $\theta_3 = \pi$, there are wells of negative curvature similar to those from QGAN's landscape. However, when $\theta_3$ is decreased to approximately $\pi/2$, the wells are replaced by "valleys" of negative curvature. Comparing these landscapes provides some insight into the (relative) entangling capabilities of the circuit blocks and the ease of generating highly entangled states by tuning the circuit parameters. Circuits corresponding to curvature landscapes with extensive or large regions of negative curvature (i.e. high concurrence), such as LDCA, are able to more readily generate states with high entanglement compared to circuits whose landscapes have limited regions of negative curvature, such as QGAN. Additionally, it appears easier to generate highly entangled states with LDCA than with circuits such as QGAN and sHEA, in which one must tune multiple circuit parameters, several of which need to be set near specific values to generate states with high entanglement.
Why might one be interested in calculating the Ricci scalar? After all, on the face of it, it seems to contain the same amount of information as the concurrence. The answer lies in the goal of this study, namely understanding PQCs from a geometric point of view and their optimization performance with regard to the Quantum Natural Gradient. We have the intuition that the Quantum Natural Gradient somehow incorporates the geometric information of the projective Hilbert space in the optimization procedure in order to improve the speed of convergence. The question is: can we map out the geometry of the quantum circuits and see geometrically what the Quantum Natural Gradient is doing differently? This is the question we take up and explore in the next section.

Insights into VQE performance through scalar curvature: a two-qubit study
In this section, we numerically demonstrate the connection between the geometry and performance of two-qubit parameterized quantum circuits. To quantify the performance, we consider a toy problem instance of the Variational Quantum Eigensolver (VQE) algorithm [45] which, despite its simplicity, shows how the scalar curvature of a parameterized quantum circuit for two qubits may help inform the effectiveness of an ansatz prior to execution of the algorithm. Namely, the task at hand is to estimate the ground state energy of molecular hydrogen at a particular bond length, employing each of the two-qubit circuits. We consider the two-qubit Hamiltonian for molecular hydrogen from an early VQE experimental study [46]: where $\nu = \{\nu_1, \nu_2, \ldots, \nu_6\}$ corresponds to the Hamiltonian coefficients. For this problem, the ground state wave function is of the form $|\psi\rangle = \alpha|01\rangle + \beta|10\rangle$, where the wave function coefficients α and β are functions of $r_{\mathrm{H-H}}$, the inter-atomic distance between the two hydrogen atoms. At $r_{\mathrm{H-H}} = 3.19$ Å, the ground state wave function is a highly entangled state of the form $\alpha|01\rangle + \beta|10\rangle$ for $\alpha, \beta \in \mathbb{C}$, where $|\alpha|^2 \approx 0.47$ and $|\beta|^2 \approx 0.53$. In this case, the ability to generate (highly) entangled states using a parameterized quantum circuit is important for representing the solution state. We executed wave function simulations for the VQE calculations. For the optimization in VQE, we consider the standard gradient descent optimizer as well as the Quantum Natural Gradient optimizer using the block-diagonal and diagonal matrix approximations for the Fubini-Study metric tensor [16]. We fix the step size of each type of gradient descent to 0.05 and terminate each optimization run based on a convergence threshold of $10^{-6}$. Using this toy problem, we discuss two main observations:
1. Particular circuit structures lead to precise and/or accurate solutions. Using 50 independent optimizations, we show that circuits like LDCA initialize states at high concurrences and consistently and rapidly converge to the solution. Others have a wider spread in the final energies and their accuracies. We argue that further insight about these results can be gained from the scalar curvature.
2. We investigate the role of the fiber space by considering the QGAN circuit as a case study. This circuit, as constructed, is unable to reach the ground state but is able to do so once local rotation gates are appended. By calculating the curvature, we are also able to see how the Quantum Natural Gradient is able to take advantage of the geometry while standard gradient descent does not.
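The optimization protocol described above (fixed step size of 0.05, termination at a threshold of 10⁻⁶) can be sketched on a deliberately reduced model. Everything specific here is an assumption for illustration: the one-parameter ansatz cos(θ)|01⟩ + sin(θ)|10⟩, the three matrix elements `E01`, `E10`, and `COUPLING` (hypothetical values, not the hydrogen coefficients of Ref. [46]), and the choice to apply the threshold to the change in energy between successive steps.

```python
import math

# Hypothetical matrix elements of a Hamiltonian restricted to span{|01>, |10>}.
E01, E10, COUPLING = -0.2, 0.1, -0.4

def energy(theta):
    """Energy of the ansatz state cos(theta)|01> + sin(theta)|10>."""
    c, s = math.cos(theta), math.sin(theta)
    return E01 * c * c + E10 * s * s + 2 * COUPLING * s * c

def gradient(theta):
    """Analytic derivative of energy(theta) with respect to theta."""
    return (E10 - E01) * math.sin(2 * theta) + 2 * COUPLING * math.cos(2 * theta)

def run_gradient_descent(theta, step_size=0.05, tol=1e-6, max_steps=10_000):
    """Plain gradient descent, stopped when the energy change drops below tol."""
    previous = energy(theta)
    for step in range(1, max_steps + 1):
        theta -= step_size * gradient(theta)
        current = energy(theta)
        if abs(previous - current) < tol:
            return theta, current, step
        previous = current
    return theta, previous, max_steps

theta_final, energy_final, steps_used = run_gradient_descent(theta=0.1)
# Exact lowest eigenvalue of the 2x2 block, for comparison.
exact = (E01 + E10) / 2 - math.hypot((E01 - E10) / 2, COUPLING)
```

In this reduced model the metric is constant, so a natural-gradient step would only rescale the step size; the differences between optimizers discussed in this section arise for the multi-parameter circuits, where the metric varies across parameter space.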

Curvature landscape and optimization
We first investigate the connection between the curvature landscape and VQE optimization. Fig. 6 tracks the energy error, concurrence, and scalar curvature of optimization paths using the four circuit blocks. For each circuit, 50 independent optimization trials were performed using random parameter initialization. For three of the four circuits, namely HEA, LDCA, and sHEA, using Quantum Natural Gradients enabled the optimizations to converge to errors below the chemical accuracy threshold ($\approx 10^{-3}$ Ha) within 200 descent steps. Referring back to Fig. 5, these three circuits correspond to curvature landscapes with extensive regions of negative curvature or high concurrence. For sHEA, $\theta_3$ and $\theta_4$ were tuned over the course of each optimization such that highly entangled states became accessible (e.g. $\theta_3 \approx \pi/2$).
In particular, optimization runs for LDCA started at higher values of concurrence on average and rapidly converged to accurate ground state energies within around 20 descent steps for QNG methods.In addition, the standard deviation over 50 runs is smaller than those corresponding to other circuits.
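The optimization protocol described above (fixed step size 0.05, termination at a convergence threshold of $10^{-6}$) can be sketched for standard gradient descent as follows. This is not the simulation used in the paper: the Hamiltonian is a hypothetical toy operator rather than the molecular hydrogen Hamiltonian of Ref. [46], the one-parameter ansatz is our own, and we assume the threshold applies to the energy change between consecutive steps.

```python
import numpy as np

# Hypothetical toy Hamiltonian (NOT the H2 Hamiltonian of Ref. [46]):
# H = -(XX + YY)/2, which couples |01> and |10>.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
H = -0.5 * (np.kron(X, X) + np.kron(Y, Y))

def ansatz(theta):
    """Illustrative one-parameter state cos(theta/2)|01> + sin(theta/2)|10>."""
    return np.array([0, np.cos(theta / 2), np.sin(theta / 2), 0], dtype=complex)

def energy(theta):
    psi = ansatz(theta)
    return float(np.real(np.vdot(psi, H @ psi)))

def gradient(theta, eps=1e-6):
    """Central finite-difference derivative of the energy."""
    return (energy(theta + eps) - energy(theta - eps)) / (2 * eps)

def gradient_descent(theta0, step=0.05, tol=1e-6, max_steps=10_000):
    """Plain gradient descent; stops when the energy change falls below tol (assumed criterion)."""
    theta, e_old = theta0, energy(theta0)
    for n in range(1, max_steps + 1):
        theta -= step * gradient(theta)
        e_new = energy(theta)
        if abs(e_old - e_new) < tol:
            return theta, e_new, n
        e_old = e_new
    return theta, e_old, max_steps

theta_opt, e_opt, n_steps = gradient_descent(theta0=0.3)
print(theta_opt, e_opt, n_steps)
```

For this toy problem the energy is $E(\theta) = -\sin\theta$, so the run converges towards $\theta = \pi/2$, where the state is maximally entangled and $E = -1$.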
Additional insights into the optimization procedure can be gathered by looking at the curvature landscapes and also the scalar curvatures reached during the optimization process.
(i) For the circuits that reach the ground state, we observe more regions of negative curvature; these regions in the Hilbert space are useful in accelerating the optimizations. Having more of these regions seems to correlate with the performance of the optimization process. Consider the curvature landscape of LDCA, in which we observe small, repeating hills of positive curvature that are surrounded by regions of negative curvature. This implies that LDCA may have greater access to highly entangled states that are additionally constrained to a subspace spanned by $\{|01\rangle, |10\rangle\}$, both of which led to superior performance in the VQE algorithm instance. On the other hand, for the QGAN curvature landscapes, we see that the curvature is positive for the majority of parameter settings.
(ii) Because the Quantum Natural Gradient is tailored to the wave function geometry, we see that it is able to reach these areas of negative curvature faster than standard gradient descent. Once it reaches these regions, the optimization accelerates early in the run, which reduces the total number of iterations. In all the cases we examined, the standard gradient descent takes longer to find these regions.
(iii) Lastly, we briefly comment on the relative costs of the Quantum Natural Gradient, which depend on the number of non-zero elements in the metric tensor or its approximations. From Eqs. 23-26, we see that the diagonal or block-diagonal approximation of the metric tensor for LDCA captures all of the non-zero elements. This is in contrast to the other circuits, which have at least two unique elements that are not captured by the approximations to the metric tensor. To make better use of QNG, these other circuits may require computations of elements that are not captured by the block-diagonal or diagonal approximations. For example, in Fig. 6d, we ran the VQE optimizations using the dense or full metric tensor for sHEA, a case in which the block-diagonal approximation does not capture many of the non-zero elements. Shown using purple lines, the optimization improves significantly in efficiency, though at the cost of more function calls at each QNG descent step to compute the full metric tensor. On the other hand, LDCA only requires one non-zero element of the metric tensor to be computed at each QNG descent step. This appears to imply that certain circuits, by the way that they are parameterized, are better suited for QNG methods than others.
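The cost trade-off in point (iii) can be illustrated with a small sketch of a single QNG update under the full, block-diagonal, and diagonal treatments of the metric tensor. The metric values, the block layout, and the function names below are hypothetical and chosen only to show how an off-block element changes the step; using the pseudo-inverse is an assumption that mirrors common QNG implementations.

```python
import numpy as np

def approximate_metric(g, mode="full", blocks=None):
    """Return the full, block-diagonal, or diagonal part of a metric tensor g."""
    g = np.asarray(g, dtype=float)
    if mode == "full":
        return g.copy()
    if mode == "diag":
        return np.diag(np.diag(g))
    if mode == "block-diag":
        approx = np.zeros_like(g)
        for idx in blocks:  # blocks: list of parameter-index lists, e.g. one per layer
            ix = np.ix_(idx, idx)
            approx[ix] = g[ix]
        return approx
    raise ValueError(f"unknown mode: {mode}")

def qng_step(theta, grad, g, step=0.05, mode="full", blocks=None):
    """One update theta - step * g^+ grad, with g^+ the pseudo-inverse of the (approximate) metric."""
    g_used = approximate_metric(g, mode, blocks)
    return np.asarray(theta, dtype=float) - step * np.linalg.pinv(g_used) @ np.asarray(grad, dtype=float)

# Hypothetical 3-parameter metric with one element outside the blocks (illustrative values only).
g = np.array([[0.25, 0.00, 0.10],
              [0.00, 0.25, 0.00],
              [0.10, 0.00, 0.25]])
theta = np.zeros(3)
grad = np.array([1.0, 0.0, 0.0])

full = qng_step(theta, grad, g, mode="full")
block = qng_step(theta, grad, g, mode="block-diag", blocks=[[0, 1], [2]])
diag = qng_step(theta, grad, g, mode="diag")
print(full, block, diag)
```

In this example the block-diagonal and diagonal approximations give the same step because the only off-diagonal element lies outside the chosen blocks, whereas the full metric also moves the third parameter. When the approximations already capture every non-zero element, as reported for LDCA, the cheaper variants lose nothing.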

Non-trivial role of adding single-qubit gates
As observed in Fig. 6b, QGAN consistently fails to reach the ground state. QGAN, as constructed, is unable to reach states corresponding to high entanglement, or equivalently negative curvature. Often, in similar situations in which the circuit seems insufficient, one adds a set of gates or a circuit layer in the hopes of providing greater flexibility to reach the solution state. Thus, we tried augmenting the QGAN circuit with local rotations, adding RX then RZ single-qubit gates to each of the two qubits. This adds four new parameters to the circuit. Appending local rotations should not impact the concurrence and thus the curvature. It was surprising to observe, however, that adding local rotations greatly improved the optimization, as shown in Fig. 7a. While the average value of concurrence at the start of the optimization using the augmented QGAN was lower than that corresponding to the original QGAN circuit, the augmented QGAN circuit was able to estimate the ground state energy with high accuracy using quantum natural gradient methods. The puzzle is then the following: while the concurrence expression in Eq. (32) should not change for the augmented QGAN (since single-qubit gates should not increase entanglement), the additional parameterized single-qubit gates somehow appear to have aided in guiding the other circuit parameters to values such that highly entangled states are accessible (shown in Fig. 7c). In other words, without increasing the entanglement in the circuit, we are able to turn an optimization procedure that does not reach an entangled state into one that does. We suspect that this is where another part of the geometry plays a part, namely the fibers. Locally, the geometry is $S^4 \times S^3$, but this is not true globally, which means that how we move through the fibers has a non-trivial impact on the optimization process. This case also illustrates the role geometry plays in the optimization process, because the standard gradient descent still fails to find the ground state, at least within a reasonable number of optimization steps. From the point of view of just the optimization process, we simply see that the concurrence fails to reach the target value and the energy fails to reach low enough values. Looking at the scalar curvature, however, we see that the Quantum Natural Gradient descent finds pockets of high negative curvature, which allow the optimization path to move more easily. In other words, as we change parts of the available geometry, the Quantum Natural Gradient is better able to leverage that change.
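The claim that appended local rotations cannot change the concurrence can be checked numerically. The sketch below applies RX followed by RZ to each qubit of a random two-qubit state and compares the concurrence before and after. The gate conventions RX(t) = exp(-i t X/2) and RZ(t) = exp(-i t Z/2) are assumed here (the paper's own definitions are in its Appendix A), and the random state and angles are arbitrary.

```python
import numpy as np

def rx(t):
    """RX(t) = exp(-i t X / 2) (assumed convention)."""
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def rz(t):
    """RZ(t) = exp(-i t Z / 2) (assumed convention)."""
    return np.array([[np.exp(-1j * t / 2), 0],
                     [0, np.exp(1j * t / 2)]])

def concurrence(state):
    """Concurrence 2|ad - bc| of a two-qubit pure state (a, b, c, d)."""
    a, b, c, d = state
    return 2.0 * abs(a * d - b * c) / np.vdot(state, state).real

rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

t = rng.uniform(0, 2 * np.pi, size=4)
# RX then RZ on each qubit, i.e. the operator (RZ RX) tensor (RZ RX).
local = np.kron(rz(t[1]) @ rx(t[0]), rz(t[3]) @ rx(t[2]))
psi_rotated = local @ psi

c_before = concurrence(psi)
c_after = concurrence(psi_rotated)
print(c_before, c_after)
```

The two values agree to numerical precision, consistent with the statement that any improvement from the augmented QGAN must come from how the extra parameters steer the optimization rather than from added entangling power.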

Concluding remarks
Often, investigations or uses of PQCs are limited to considering their inputs (parameters) and outputs (resulting quantum states or observables). Our work provided an investigation into the inner workings of two-qubit PQCs, which we argue are the simplest instances of PQCs and yet can still provide valuable insights for better understanding these circuits. We first defined PQCs from a geometric perspective. With the help of two-qubit geometry specifically, we explicitly worked out the intrinsic coordinates for four examples of two-qubit PQC blocks, at least for the base manifold, which is important for characterizing entanglement in the circuit. As a consequence of our ability to parameterize our circuits in terms of two-qubit geometry, we introduced a notion of local geometric expressibility, which describes how much of the two-qubit geometry can be explored by a two-qubit PQC.
In trying to understand the connection between the geometry of the projective Hilbert space and how it is used by the Quantum Natural Gradient in the optimization process, we used the Mannoury-Fubini-Study metric as a way to calculate the curvature of the base manifold, $S^4$. This provided a simple and remarkable connection between the curvature of the base manifold and the amount of entanglement in the ansatz.
With this connection, we were able to establish a bridge from the amount of entanglement and the quality of the optimization procedure to the geometry of the projective Hilbert space. This allowed us to notice a correlation between the ability of the ansatz to find regions of high negative curvature early in the optimization process and the acceleration of that process. We strongly suspect that this connection is not merely a correlation but represents a chain of causation, for which we give two pieces of evidence:
(i) The performance of the QGAN circuit, which could find neither the entangled ground state nor the product ground state, was enhanced by creating the augmented QGAN ansatz, i.e. by adding single-qubit rotations. The optimization process was able to reach regions of high negative curvature that were not accessible before, after which the circuit was able to reach the entangled ground state. Furthermore, in the numerical simulations for which the ground state is a product state and hence lies in a region of positive curvature (Appendix D), the augmented QGAN performed better by initially finding regions that are less positive than those found by the original QGAN.
(ii) Secondly, by inspecting the performance of standard gradient descent, we see that for the circuits that did find the ground state, the regions of high negative curvature were found later in the optimization process than was the case for QNG. Indeed, we believe the behavior of the standard gradient descent for the augmented QGAN is highly suggestive. The standard gradient descent with the augmented QGAN ansatz is not able to find these regions of high negative curvature and is not able to reach the ground state, for a circuit that we know can find the ground state. At the other extreme, we can consider the performance of the dense QNG for sHEA, which finds regions of negative curvature the earliest; this correlates with finding the ground state much more efficiently than the other approximations of QNG and the standard gradient descent.
In summary, the geometric perspective, through the use of the scalar curvature, has provided us with a tool to explain, for example, why LDCA outperforms other circuits, beyond what can be explained by tracking energies or entanglement alone. Overall, we show that the number of parameters of a given PQC does not necessarily correspond to high circuit flexibility or capability; it matters how the circuit is parameterized. In connection with this point, we observed how single-qubit gates can significantly impact the circuit performance, i.e. do more than just provide extra parameters. The single-qubit gates allow us to explore the fibers of the geometry in such ways as to find regions of negative or less positive curvature. This effect is ultimately possible because the geometry is not just a Cartesian product of the base manifold and the fibers but is twisted in a more complicated way. The geometric perspective also provides insights into how and why the QNG is better than the standard gradient descent: the QNG can find the regions of negative curvature and, if there are none, it opts for regions with less positive curvature early in the optimization process. We stress that the geometric perspective is necessary in order to understand the QNG, since the metric used by the QNG is determined by the geometry of the quantum states and not by the cost function.
While we provided an extensive study on two-qubit PQCs, there remain several puzzles or open questions, which we outline in the following subsections for future work.

Decomposition Problem
Refs. [16,26] numerically showed the potential for Quantum Natural Gradients in optimizing parameterized quantum circuits in the context of VQE. A main challenge will be to scale up this method for larger and deeper quantum circuits. In an effort to formulate a hopefully simpler scenario that may be scalable, we describe the following "decomposition problem": Suppose we have a four-qubit circuit in which the first and second qubits are entangled using one of the PQC building blocks, e.g. the LDCA block, while the third and fourth qubits are also entangled using a two-qubit block. Overall, we have two two-qubit subsystems, for each of which we know the metric tensor; call them $g^{(12)}$ and $g^{(34)}$. However, suppose we add a static/non-parametric entangler (e.g. CNOT) to the second and third qubits. How is the metric tensor of the four-qubit system, $g^{(1234)}$, constructed, compared to the structures of $g^{(12)}$ and $g^{(34)}$? The idea is that, rather than analyzing the $n$-qubit geometry from the ground up in order to understand PQCs for $n$ qubits, we may be able to use the simpler geometry of two qubits to bootstrap our way to understanding PQCs with higher qubit numbers.

Three-qubit case
For this work, we crucially relied on the fact that the geometry of two qubits could easily be understood since it is simply the Hopf fibration $S^3 \to S^7 \xrightarrow{\pi} S^4$. As a consequence of the Hopf invariant one theorem [47], there is one more Hopf fibration that can be studied in great detail, namely the Hopf fibration $S^7 \to S^{15} \xrightarrow{\pi} S^8$. In this case, the use of octonions should be helpful in parameterizing the geometry.

B.4 Sycamore hardware-efficient ansatz (sHEA)
The output state of the Sycamore hardware-efficient ansatz (sHEA) depicted in Fig. 3(d) is

The angle parameters of the $S^4$ Hopf base, $(\theta_A, \phi_A)$ and $(\chi, \xi)$, and the equivalent Cartesian coordinates $(x_0, x_1, x_2, x_3, x_4)$ are calculated to be

$$x_0 = \frac{1}{2}\left(\cos(\theta_2)\left(\cos(\theta_3) - 1\right) + \cos(\theta_1)\left(\cos(\theta_3) + 1\right)\right) \qquad (99)$$

The quaternion $q_\pm = \langle c_\pm | \psi \rangle_H$ is calculated to be

C Geometric origins of the quantum natural gradient
From the geometric point of view, how might one arrive at the concept of the quantum natural gradient? The idea is simply to find out how the gradient operation changes in different curved geometries. For quantum mechanics, we need to consider the geometry of the projective Hilbert space $\mathcal{P}(\mathcal{H})$.
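As a concrete instance of how the metric reshapes the gradient, the sketch below numerically estimates the Fubini-Study metric of a single-qubit state and uses its inverse to convert partial derivatives into a natural gradient. The single-qubit parameterization, the finite-difference scheme, and the function names are our own illustrative choices, not the construction used in the paper; for this state the metric is known to be diag(1/4, sin²(θ)/4), which the code checks against.

```python
import numpy as np

def state(params):
    """Single-qubit state cos(t/2)|0> + exp(i p) sin(t/2)|1> (illustrative choice)."""
    t, p = params
    return np.array([np.cos(t / 2), np.exp(1j * p) * np.sin(t / 2)])

def fubini_study_metric(psi_fn, params, eps=1e-5):
    """g_ij = Re(<d_i psi|d_j psi> - <d_i psi|psi><psi|d_j psi>), via central differences."""
    params = np.asarray(params, dtype=float)
    psi = psi_fn(params)
    derivs = []
    for i in range(len(params)):
        shift = np.zeros_like(params)
        shift[i] = eps
        derivs.append((psi_fn(params + shift) - psi_fn(params - shift)) / (2 * eps))
    n = len(params)
    g = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            term = (np.vdot(derivs[i], derivs[j])
                    - np.vdot(derivs[i], psi) * np.vdot(psi, derivs[j]))
            g[i, j] = term.real
    return g

def natural_gradient(grad, g):
    """Raise the index of the partial derivatives: (grad f)^i = sum_j g^{ij} d_j f."""
    return np.linalg.pinv(g) @ np.asarray(grad, dtype=float)

theta, phi = 1.0, 0.3
g = fubini_study_metric(state, [theta, phi])
expected = np.diag([0.25, 0.25 * np.sin(theta) ** 2])
print(g)
print(natural_gradient([0.0, 1.0], g))
```

Near the poles of the Bloch sphere, where sin(θ) is small, the same partial derivative with respect to φ corresponds to a much larger natural-gradient component, which is exactly the coordinate dependence that a Euclidean gradient ignores.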
Using the correspondence between a vector space $V$ and its dual $V^*$ for finite-dimensional spaces, we can pair the co-vector $df$ with the vector $\nabla f$, i.e.

$$df(w) = \langle \nabla f, w \rangle, \qquad (116)$$

where $\langle \cdot, \cdot \rangle$ is a chosen bilinear form. Now consider the following calculation:

$$\langle v, w \rangle = \sum_{ij} v^i \langle e_i, e_j \rangle w^j = \nu(w),$$

where $v, w \in V$, $\nu \in V^*$, and $\{e_i\}$ is a basis for $V$. We can expand $\nu$ in a basis $\{\sigma^i\}$ for $V^*$ so that $\nu = \sum_i v_i \sigma^i$ with $v_i = \nu(e_i) = \langle v, e_i \rangle$. From the fact that $v^i = \sum_j g^{ij} v_j = \sum_j g^{ij} \langle v, e_j \rangle$, we have

$$v = \sum_{ij} g^{ij} \langle v, e_j \rangle \, e_i. \qquad (117)$$

By the discussion above, we have that

$$\nabla f = \sum_{ij} g^{ij} \langle \nabla f, e_j \rangle \, e_i. \qquad (118)$$

Considering (116), we have that $df(w) = \langle \nabla f, w \rangle = w(f) = \sum_j w^j \frac{\partial f}{\partial x^j}$, so that combining knowledge from (117) and (118) we have $\langle \nabla f, e_j \rangle = e_j(f) = \frac{\partial f}{\partial x^j}$ and thus arrive at

$$\nabla f = \sum_{ij} g^{ij} \frac{\partial f}{\partial x^j} \, e_i.$$

Now, in Euclidean geometry $df$ and $\nabla f$ have the same coordinates, but in general they do not. For $\mathcal{P}(\mathcal{H})$ we pick the metric to be the Fubini-Study metric and thus arrive at the following update rule for the parameters:

$$\theta^i \leftarrow \theta^i - \eta \sum_j g^{ij}(\theta) \frac{\partial f}{\partial \theta^j},$$

where $\eta$ is the step size and $g^{ij}$ denotes the inverse of the Fubini-Study metric tensor.

D Further insights into VQE performance through scalar curvature: unentangled ground state
In this section, we consider another regime of the same VQE problem from Sec. 4. At $r_{\mathrm{H-H}} = 0.2$ Å, the ground state wave function is a product state with $|\beta|^2 \approx 1$. We repeat the simulation procedure, running 50 independent VQE optimizations for each of the four two-qubit circuits, as shown in Fig. 8. We observe similar results for $r_{\mathrm{H-H}} = 0.2$ Å as those for $r_{\mathrm{H-H}} = 3.19$ Å; using LDCA, the optimizations are both precise and accurate. While the concurrence values start high, they rapidly decrease to near 0. This corresponds to starting in a negative curvature region but quickly moving up to a hill of positive curvature. QGAN, in its original structure, is again insufficient for converging to the ground state energy. However, with added local rotations (Fig. 8e), the optimizations reach sufficient accuracy.

Figure 2: Fiber bundle perspective of (a) a cylinder and (b) a Möbius strip, in which the fibers are attached with a global twist. The lines in both pictures represent the fibers, while the circle represents the base manifold. The fiber bundle perspective helps us think locally of complicated manifolds in terms of lower-dimensional manifolds.

Figure 3: Two-qubit circuit blocks considered in this work. Dashed lines indicate different circuit layers or moments. Gate definitions are provided in Appendix A.

Figure 5: Scalar curvature "landscapes" of the four circuit blocks: (a) HEA, (b) LDCA, (c) QGAN, and (d) sHEA. In the case of (a) and (b), the scalar curvature (denoted using $R$) is a function of only two parameters: $\theta_1$ and $\theta_2$ for HEA and $\theta_3$ and $\theta_5$ for LDCA. For the other two circuit blocks, in which $R$ is a function of more than two circuit parameters, $R$ is plotted as a function of $\theta_1$ and $\theta_2$, while scanning specific values of the remaining parameters to produce snapshots of the landscapes. We limit the range of $R$ to $[-5, 10]$ for ease of visualization.

Figure 6: VQE results over 50 random initial points using the four two-qubit ansatze for estimating the ground state energy of molecular hydrogen at $r_{\mathrm{H-H}} = 3.19$ Å. The ground state at this bond length is highly entangled. Solid lines show the average values of energy error, concurrence, and scalar curvature.

Figure 7: (a) Energy error, concurrence, and curvature tracked over 50 independent optimization trials of QGAN augmented with local rotations (RX and RZ applied to each qubit). (b) Curvature landscape corresponding to the final/optimized parameter value of $\theta_5$ of a particular optimization run using the original QGAN circuit. (c) Curvature landscape corresponding to the final parameter value of $\theta_5$ of a particular optimization run using the augmented QGAN circuit.

Figure 8: VQE results over 50 trials using the four two-qubit ansatze for molecular hydrogen at $r_{\mathrm{H-H}} = 0.2$ Å. Each ansatz is optimized using gradient descent and quantum natural gradient (QNG) with the block-diagonal and diagonal approximations to the metric tensor. Lines in each plot indicate the average quantity, and shaded regions indicate quantities within one standard deviation.

Table 2: PQC Geometry Summary Table

Table 3: Abbreviations and symbols

Accepted in Quantum 2022-08-03. Published under CC-BY 4.0.