Fluid fermionic fragments for optimizing quantum measurements of electronic Hamiltonians in the variational quantum eigensolver

Measuring the expectation value of the molecular electronic Hamiltonian is one of the challenging parts of the variational quantum eigensolver. A widely used strategy is to express the Hamiltonian as a sum of measurable fragments using fermionic operator algebra. Such fragments have an advantage of conserving molecular symmetries that can be used for error mitigation. The number of measurements required to obtain the Hamiltonian expectation value is proportional to a sum of fragment variances. Here, we introduce a new method for lowering the fragments' variances by exploiting flexibility in the fragments' form. Due to idempotency of the occupation number operators, some parts of two-electron fragments can be turned into one-electron fragments, which then can be partially collected in a purely one-electron fragment. This repartitioning does not affect the expectation value of the Hamiltonian but has non-vanishing contributions to the variance of each fragment. The proposed method finds the optimal repartitioning by employing variances estimated using a classically efficient proxy for the quantum wavefunction. Numerical tests on several molecules show that repartitioning of one-electron terms lowers the number of measurements by more than an order of magnitude.

VQE circumvents the hardware limitations of today's noisy intermediate-scale quantum (NISQ) devices [6] by exploiting both quantum and classical computers: A quantum computer prepares a parameterized trial state |ψ θ and measures its energy E θ = ψ θ |Ĥ|ψ θ , while a classical computer suggests new θ parameters to minimize E θ (for brevity, we omit θ from now). However, measuring the expectation value E is not trivial because digital quantum computers can only measure polynomial functions of Pauli-ẑ operators.
To find the fragments,Ĥ α , starting from the Hamiltonian in the second-quantized form, one can employ either the low-rank (LR) decomposition [7][8][9] or the full-rank (FR) optimization [10,11]. Both methods are only concerned with the two-electron fragments becauseÛ 0 for the one-electron fragment (Ĥ 0 ) can be found easily as it simply corresponds to a unitary matrix that diagonalizes the one-electron tensor, h pq [14]. The difference between LR and FR fragments is the rank of the resulting λ (α) pq [in Eq. (2)]. In the LR decomposition, λ (α) pq is an outer product of some vector η (α) q ) and, therefore, has rank 1. In contrast, in FR optimization, λ (α) pq can be a full-rank hermitian matrix. Increased flexibility of FR fragments was exploited in Ref. 10 to lower the number of measurements required to obtain E up to error when eachĤ α is measured independently [15]: where Var ψ (Ĥ α ) = ψ|Ĥ 2 α |ψ − ψ|Ĥ α |ψ 2 is the variance ofĤ α , and m α is the fraction of the total measurements allocated toĤ α . Developing techniques for measuring E with a low M ( ) is especially important for VQE because a recent analysis [16] showed that the advantage of VQE over state-of-the-art classical algorithms is limited due to the large M ( ).
The qubit-space methods with the lowest M ( ) take advantage of the flexibility in the fragments offered by the realization that someP j can belong to multipleĤ β . The coefficients ofP j in different H β , c (β) j , can be varied without changing the total expectation value ofĤ q as long as c (β) j sum to c j in the qubit Hamiltonian [25]. In addition, even P j not present inĤ q can be introduced into mul-tipleĤ β provided that corresponding c (β) j sum to zero [26]. A significant reduction in M ( ) was achieved by optimizing c (β) j using approximate variances obtained by employing a classically efficient wavefunction, |φ , to estimate Var φ (Ĥ β ). The idea of increasing the number ofP j measured simultaneously in a single fragment has been successfully employed also in the recently developed classical-shadow-based techniques [18][19][20][21][22][23] to yield lower M ( ) values. An alternative class of promising qubit-space approaches with M ( )'s competitive with those in some of the Hamiltonian partitioning schemes lowers M ( ) by optimizing positive operator-valued measures (POVMs) [29][30][31].
In this work, we present an extension to fermionic fragment techniques that further increases their flexibility in reducing the number of measurements. The new approach generalizes the technique of repartitioning of some fragments used in the qubit space. It extends the repartitioning idea from commuting to non-commuting operators. Another motivation for developing fermionic measurement schemes is their advantage compared to the qubit-space counterparts in conserving molecular symmetries (e.g., electronic number and spin operators). These symmetries can be used for error mitigation techniques, which are essential for advancing quantum computing schemes on near-term devices [32,33].

Fluid fermionic fragments
Here, we present a new approach that exploits two properties of fermionic operators to minimize the number of measurements in Eq. (4). First, any linear combination of one-electron hermitian operators can be brought to the factorized form p , and p are some real coefficients, andÛ α ,Û are orbital rotations. Second, the occupation number operators are idempotent, i.e., n 2 p =n p . Using then p idempotency, eachĤ α in Eq. (2) with α > 0 can be re-written as a sum of oneand two-electron parts: This expression reveals the freedom that one can extract any amount of the one-electron part from everyĤ α and add it toĤ 0 , thereby repartitioning the one-and two-electron Hamiltonians. Thus, the repartitioned fragments, which we will refer to as fluid fermionic fragments (F 3 ), arê Even after the modification, eachĤ α for α > 0 remains measurable becauseÛ αĤαÛ † α still maps onto a polynomial function of Pauli-ẑ after qubitfermion transformations. ForĤ 0 , newÛ 0 and λ p can easily be found by simply diagonalizing h pq = pq is an N × N matrix representation ofÛ α [10,11,14]. MeasuringĤ α instead ofĤ α gives the same E because repartitioning does not change the operator sum:

Optimization of the number of measurements
In the following, we will present the approach for optimally repartitioning the one-and twoelectron fragments to lower M ( ) (initial fragments are obtained as described in Appendix A). Since M ( ) depends on the fragment variances evaluated with the quantum wavefunction |ψ , which is classically difficult, we minimize the approximation to M ( ) computed with Var φ (Ĥ α ): in this work, the configuration interaction singles and doubles (CISD) wavefunction was used as the classically efficient proxy for the quantum wavefunction (|φ ). The variances of the fragments after the repartition [Ĥ α in Eqs. (9) and (10)] are obtained as =Û † αn pÛα for notational simplicity. To minimize Eq. (11) with respect to c (α) p and m α , we perform two-step iterative optimization following Refs. 25 and 26: 1) c (α) p are optimized with fixed m α and 2) m α are updated to the optimal allocation according to Eq. (22) with fixed c (α) p . As this iterative procedure is a linearization heuristic for the non-linear problem of simultaneously optimizing c (α) p and m α , the obtained solution does not necessarily correspond to the minimum. Yet, in practice, we could typically reach convergence within ∼ 5 iterations to find solutions with low gradients. To optimize the c (α) p variables with fixed m α , one can solve the system of linear equations: The final m α obtained at the end of the iterative procedure suggests the optimal allocation of the total budget for eachĤ α . The suggested M m α measurements for each fragmentĤ α is not an integer, but rounding M m α to the nearest integer should only have a negligible effect on the measurement error because M 10 6 in practice. We will refer to the algorithm proposed in this work as the fluid fermionic fragments (F 3 ) method.

Reducing the number of optimization variables
The computational cost for optimizing c  14) has an approximately cubic scaling with N c , the cost of F 3 can be lowered significantly by reducing N c . Therefore, we propose several restrictions on c (α) q to lower their number.
Using spin symmetry of the electronic Hamiltonian written in a spin-restricted spin-orbital basis, we achieve a twofold reduction in the number of c 2i,2i for i = 1, . . . , N/2 in the initialĤ α fragments obtained by considering the smallerg ijkl tensor over spatial orbitals (see Appendix A for definitions). Because λ 2i,2i , we impose that the same amount of λ i , thereby reducing the number of optimization variables by half. In the resulting F 3 -Full method, N c = N f N/2, and the system of equations is simplified to This restricts us to repartitioning of only a fraction of the entire one-electron part of a two-electronĤ α fragment [i.e., a fraction of To this end, we restrict c (α) p as a scalar multiple of λ pp . As a result, N c = N f , and the system (14) simplifies down to p . We will refer to this reduced version of F 3 as F 3 -R1.
Yet another reduction of variables can be done and is motivated by the relationship between [Var ψ (Ĥ α )] 1/2 appearing in the expression for the total measurement number with optimal m α [M opt ( ) in Appendix A] and the L 1 norm of a coefficient vector for a linear combination of unitaries (LCU) decomposition:Ĥ α = j a (α) jV j + d α1 , whereV j are some unitaries [34]. Maximum [Var ψ (Ĥ α )] 1/2 for any |ψ occurs when |ψ = (|max α + |min α )/ √ 2, where |max α (|min α ) is the eigenstate ofĤ α with the highest (lowest) eigenvalue, E Thus, one can useĤ α with a low LCU L 1 norm as a heuristic approach to lowering M ( ).
One way to reduce the L 1 norm for a collection ofĤ α while maintaining their measurability is substituting everyn p operator in the two-electron H α fragments with reflections:r p = 1 − 2n p (satisfyingr 2 p = 1 andr † p =r p ) [7,[34][35][36]. Becausê r p maps onto an all-ẑ Pauli product under all standard transformations (e.g., it maps toẑ p under the Jordan-Wigner transformation) [13], the fragment remains measurable even after the substitution:Ĥ To ensure that N f α=0Ĥ α =Ĥ, the one-electron H 0 term must also be modified aŝ pq . Inspired by this connection to F 3 and the reduction in the upper bound for the fragment variances achieved by the substitution ofr p , we heuristically propose F 3 -R2 that restricts the optimization variables as c pq . Note that this choice is equivalent to substitutingr p only for a fraction of the two-electron fragment (c (α)Ĥ α ) while leaving the rest [(1 − c (α) )Ĥ α ] as functions ofn p . The c (α) variables in F 3 -R2 are optimized by solving the system of equations identical to Eq. (16)

Results and discussions
We compare M ( ) in the different versions of F 3 applied to either LR or FR fragments with M ( ) in the initial LR and GFRO fragments (see Appendix A for definitions). In addition, the performance of F 3 is compared with the best qubit-space techniques that have the lowest M ( ): the iterative coefficient splitting (ICS) [25] and shared Pauli products (SPP) [26] methods. In particular, it was shown in Ref. 25  The presented 2 M ( ) value is equivalent to the number of measurements in millions required to obtain E with 10 −3 a.u. accuracy (see Appendix B). While the optimization in F 3 is performed using the covariances approximated with the CISD wavefunction, |φ , once the optimal fragments and measurement allocation are obtained, the 2 M ( ) values are then evaluated according to Eq. (4) using the exact covariances computed with the quantum wavefunction, |ψ .
[Note that the use of approximate covariances instead of the exact covariances in F 3 had an insignificant influence on 2 M ( ). In all systems,   Table 1 shows that between the different versions of F 3 , the most flexible full version yields the lowest M ( ) in all examples: on average, M ( ) in F 3 -Full is a factor of 11 lower than that in F 3 -R1 and a factor of 1.2 lower than that in F 3 -R2. However, as shown in Table 2, the increased flexibility also results in F 3 -Full having N/2-fold more optimization variables than the other versions. While F 3 -R1 and F 3 -R2 have the same N c , F 3 -R2 has a much lower M ( ) for many of the molecules; for the others, it has a similar M ( ) to that in F 3 -R1. The success of F 3 -R2 can be heuristically justified since it is designed to lower the LCU L 1 norm, which is an upper bound for fragment variances, as discussed in Sec. 2.3. Because F 3 -R2 can achieve a lower M ( ) with the identical computational cost as F 3 -R1, there is no reason to employ F 3 -R1 instead of F 3 -R2. Therefore, we omit F 3 -R1 in the following discussion.
The comparison of M ( ) in F 3 with that in the initial set of LR or FR fragments demonstrates the success of the proposed method. On average, M ( ) in F 3 -Full is a factor of 35 lower than that in the initial fragments, and M ( ) in F 3 -R2 is a factor of 24 lower than that in the initial fragments. Since M ( ) in GFRO is always lower than that in LR, M ( ) in F 3 is also typically lower when FR fragments are used as the initial fragments. However, finding the initial GFRO fragments involves an iterative nonlinear optimization procedure which is computationally more expensive than the LR decomposition. Moreover, while N f in LR is upper bounded by (N 2 /8 + N/4), N f in GFRO is usually higher (for the presented molecules, GFRO has, on average, three times more fragments). Since the number of optimization variables is directly proportional to N f (N c = N f N/2 in F 3 -Full and N c = N f in F 3 -R2), employing F 3 on the FR fragments requires more computational effort than employing it on the LR fragments. Table 2, N c scales as N x , where x = 2.8 (for F 3 -LR-Full), x = 1.8 (for F 3 -LR-R2), x = 3.7 (for F 3 -FR-Full), and x = 2.7 (for F 3 -FR-R2). Because the bottleneck in the F 3 method is solving the system of linear equations [i.e., solving Eq. (15) or (16)], which has ∼ N 3 c scaling, the classical optimization costs in the different versions of F 3 have approximately N 8.5 , N 5.5 , N 11 , and N 8.2 scaling. Therefore, F 3 -LR-R2, with its classical computational cost scaling as ∼ N 5.5 with system size (N ), would be particularly useful for larger systems.

For systems in
Comparison of F 3 with state-of-the-art qubitspace techniques (ICS and SPP) shows that for all molecules except BeH 2 , F 3 has lower M ( ). In particular, even the computationally most efficient combination of applying F 3 -R2 to the LR fragments yields a lower M ( ) than that in ICS, whereas this most efficient combination yields a similar M ( ) to that in SPP. Furthermore, M ( ) in the best fermionic-space technique (F 3 -FR-Full) is, on average, a factor of 1.6 smaller than M ( ) in the best qubit-space technique (SPP). The bottleneck in the optimization procedures in all three methods (F 3 , ICS, and SPP) is solving a system of linear equations. Therefore, one can assess the required computational cost by examining the number of optimization variables. Table 2 shows that even the most computationally costly combination, F 3 -FR-Full has a much lower N c than that in either ICS or SPP. The classical computational costs of ICS and SPP have approximately N 12 and N 15 scaling, which are worse even than that in F 3 -FR-Full, with the worst scaling among different versions of F 3 .

Conclusion
This work proposes a new method that achieves a significant reduction in the number of measurements by taking advantage of the possibility of extracting fractions of one-electron parts from the two-electron fermionic fragments and combining them with the purely one-electron fragment. This repartitioning keeps the fragments measurable and conserves the Hamiltonian expectation value. On the other hand, the number of measurements due to its dependence on variances of fragments can be lowered by this repartitioning. The proposed algorithm finds the repartitioning that minimizes the number of measurements and achieves a severalfold reduction in the number of measurements compared to those for initially generated fragments.
Even though the number of measurements in the previously proposed methods suggested that the qubit-space techniques are superior to their fermionic counterparts, by employing the fluid fermionic fragments method, we were able to achieve the number of measurements lower even than those in the best qubit-space techniques (ICS [25] and SPP [26]). Furthermore, compared to these techniques, the method presented here was shown to have much fewer optimization variables. In particular, the number of optimization variables in the computationally most efficient version of the proposed algorithm scales sub-quadratically with the number of spatial orbitals, thereby making the algorithm applicable for larger systems. In addition, while the errors due to the non-unit quantum gate fidelities were not considered in this work, fermionic fragments conserve molecular symmetries and their measurements can be corrected by error mitigation methods employing these symmetries [38][39][40]. Therefore, the advantage of the proposed fluid fermionic fragments method over the qubitspace techniques should be even greater when quantum hardware errors are taken into account.
While this work focused on the measurements in VQE for ground-state energy estimation, some promising extensions of VQE, including those for excited-state calculations [41][42][43], require measuring expectation values of additional effective Hamiltonians containing three-and four-electron terms. A common approach for lowering the number of measurements required to obtain these additional expectation values relies on improving the efficiency of simultaneously estimating all three-and four-body reduced density matrices [44,45]. Alternatively, as a potential extension of this work, one could develop a more targeted algorithm that finds optimal fragments for only a few necessary effective Hamiltonians. It would be informative to examine whether this more targeted approach can further reduce the required number of measurements.
Lastly, although we employed the fluid fermionic fragments method to optimize quantum measurements, it can also be used to optimize Hamiltonian fragments for other purposes. In the fault-tolerant paradigm, one can solve the electronic structure problem by time evolving the initial guess state (with a good overlap with the ground state ofĤ), then applying the quantum phase estimation algorithm [46]. A commonly employed technique for the time evolution is Trotterization [47,48]. Because the Hamiltonian fragments required for Trotterization are equivalent to the measurable fragments considered in this work, the fluid fermionic fragments method can also be applied in the context of Trotterization.
The only necessary modification is that instead of minimizing the number of measurements required in a VQE step, one would employ the fluid fermionic fragments method to minimize the Trotter error [48,49].

Appendix A: Initial Hamiltonian fragments
The LR decomposition is less ambiguous compared to its FR counterpart, and we use an LR decomposition procedure described in Ref. 9. Among different implementations of the FR decomposition, we employ the "greedy" FR optimization (GFRO), since it has the lowest M ( ) [10]. Greedy algorithms typically have low M ( ), and their success can be attributed to the sum of square roots appearing in obtained by choosing the optimal measurement allocation, that minimizes M ( ).
For a fixed sum of Var ψ (Ĥ α ), the sum of square roots in Eq. (21) is lower if the variances are distributed unevenly, and greedy approaches tend to yield an uneven distribution of Var ψ (Ĥ α ).
The computational effort of both LR and FR decompositions is reduced significantly by work-ing with a smaller two-electron tensor over spatial orbitals,g ijkl , instead of the tensor over spinorbitals, g pqrs .  (23) where σ and σ specify the spin-z projection, whileh ij andg ijkl are one-and two-electronic integrals: (24) and where φ i ( r) is the ith one-particle electronic basis function in the position representation, and the charge and position of the Ith nucleus are denoted by Z I and r I .
To optimize the computational cost for LR and FR decompositions, we work withg ijkl and then subsequently convert the resultingλ 2i−1,2j = 0 (27) for i, j = 1, . . . , N/2. The LR decomposition is particularly efficient because it finds the rank-1 matricesλ by diagonalizing the two-electron tensor g ij,kl considered as a matrix with each dimension spanned by a pair of basis indices. This diagonalization gives a theoretical limit on N f ≤ N o (N o +1)/2, where N o = N/2, and the less-than sign originates from a truncation of the expansion by removing terms for low magnitude eigenvalues (see Ref. 8 for further details on LR decomposition). Having a low N f is beneficial for the fluid fermionic fragments method because the number of optimization variables (c (α) p ) is directly proportional to N f . The classical computational cost of LR has ∼ N 6 o scaling due to the cost of diagonalization, whereas the precise scaling for GFRO cannot be determined since it is a heuristic algorithm. In the examples considered in this work, the classical cost of GFRO was comparable to that of the F 3 optimization. Typically, the LR decomposition requires less computational effort than GFRO, but M ( ) in GFRO is lower owing to more flexibleλ (α) ij [10,11]. The αth GFRO fragment,Ĥ α , is found by minimizing the L 1 norm of theG (α+1) tensor in whereG (1) ijkl =g ijkl . In each iteration, the L 1 norm ofG (α+1) is minimized over the space of {λ (α) ij ,θ (α) } variables parameterizing the αth fragment, . The iteration terminates when the L 1 norm of G α+1 is below a given threshold (1 · 10 −5 is used in this work).
Appendix B: Number of measurements required to obtain the Hamiltonian expectation value In every method considered in Sec. 3, the expectation value of the Hamiltonian, E = ψ|Ĥ|ψ , is obtained as the sum of expectation values of the separately measuredĤ α fragments [see Eq. (3)]. Therefore, the most straightforward estimator for E (denoted byH) is the sum of estimators for individual fragments, i.e.,H = αH α . EachH α is the average result of M α repeated quantum measurements ofĤ α : where H α,i is the ith outcome of measuringĤ α . Because each fragment is measured independently, the covariances between the fragments, Cov(H α ,H β ), are zero, and therefore the variance ofH is simply the sum of variances ofH α : Var(H) = α Var(H α ). (31) Assuming that each M α is large, we can invoke the central limit theorem to evaluate Var(H α ) using the quantum operator variances, i.e., Var(H α ) = Var ψ (Ĥ α )/M α . Therefore, the Hamiltonian estimator variance can be evaluated as To obtain the expression for M ( ), i.e., the total number of required measurements to obtain E with ≡ [Var(H)] 1/2 accuracy, we introduce fractional measurement allocation, m α = M α /M (with α m α = 1), then rearrange Eq. (32) as