Quantum algorithm for persistent Betti numbers and topological data analysis

Topological data analysis (TDA) is an emergent field of data analysis. The critical step of TDA is computing the persistent Betti numbers. Existing classical algorithms for TDA are limited if we want to learn from high-dimensional topological features because the number of high-dimensional simplices grows exponentially in the size of the data. In the context of quantum computation, it has been previously shown that there exists an efficient quantum algorithm for estimating the Betti numbers even in high dimensions. However, the Betti numbers are less general than the persistent Betti numbers, and there have been no quantum algorithms that can estimate the persistent Betti numbers of arbitrary dimensions. This paper shows the first quantum algorithm that can estimate the (normalized) persistent Betti numbers of arbitrary dimensions. Our algorithm is efficient for simplicial complexes such as the Vietoris-Rips complex and demonstrates exponential speedup over the known classical algorithms.


Introduction
Processing large data has become a highly demanding task in modern society. Topological data analysis (TDA) is an emergent field of information processing based on topological and geometric methods [13,40]. In particular, TDA based on persistent homology [6,14,15,32,42] has attracted considerable attention recently. TDA can reveal the "shape of data" through multi-scale analysis of topological features. A crucial property of persistent homology, which is favorable in practical situations, is its stability against noise in how the data is sampled [8]. TDA has found numerous applications, including protein structure [41], quantum many-body dynamics [37], the cosmic web [33], and string theory [9], to name a few. Recently, combinations of persistent homology with supervised or unsupervised machine learning methods have also been actively studied [34].
In the procedure of TDA, data is first converted into a nested sequence of topological objects. Such a topological object is called a simplicial complex, and a nested sequence of simplicial complexes is called a filtration. We can then use persistent homology to study how topological invariants, such as the number of connected components, holes, voids, and their high-dimensional counterparts, change within the filtration. Typically, topological invariants that persist longer are considered to be more "significant" than those that persist shorter. The behavior of topological invariants is summarized as diagrams or functions such as the persistent Barcodes [17] and the persistent Landscapes [5]. Therefore, a typical procedure of TDA consists of first constructing a family of topological objects from the given data and then computing the persistence of topological invariants, i.e., computing the persistent Betti numbers. (For the formal definition of persistent Betti numbers, see Section 2.)
There are challenges in classical TDA that arise from the combinatorial nature of the topological treatment of data. Assume that the input data is composed of n points. Then, the number of possible q-dimensional elements (i.e., q-simplices) that can contribute to topology is $\binom{n}{q+1}$. (We say a simplex is q-dimensional if it is composed of q + 1 vertices.) The known classical algorithms for persistent homology run in time polynomial in the number of simplices [30,42]. They therefore run in time polynomial in n for constant dimensions, because the number of constant-dimensional simplices is at most polynomial in n. On the other hand, there can be exponentially many simplices for super-constant dimensions. As a result, performing a high-dimensional analysis of TDA is a difficult task for classical computers. Many studies try to overcome this issue, for example by reducing the size of simplicial complexes while keeping their homology unchanged [2]. However, such techniques are not enough to overcome the exponential growth of simplices in high dimensions.
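The growth of the simplex count can be made concrete with a short calculation. The following is a minimal sketch, not part of the original analysis; the function name `max_simplices` and the choice n = 40 are illustrative assumptions.

```python
from math import comb


def max_simplices(n: int, q: int) -> int:
    """Upper bound on the number of q-simplices over n vertices: C(n, q+1)."""
    return comb(n, q + 1)


n = 40  # illustrative number of data points (an assumption)
# Constant dimension (q = 2): the count is polynomial in n.
constant_dim = max_simplices(n, 2)
# Dimension growing with n (q + 1 = n / 2): the count is exponential in n.
high_dim = max_simplices(n, n // 2 - 1)
print(constant_dim, high_dim)
```

For a fixed q the bound behaves like n^(q+1), whereas for q + 1 = n/2 it grows roughly like 2^n up to polynomial factors, which is the regime where the classical algorithms become impractical.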
Existing quantum approaches for TDA. Data learning tasks are also actively studied in the field of quantum computation. To investigate quantum computational approaches for TDA, Lloyd, Garnerone, and Zanardi [27] proposed a quantum algorithm (the LGZ algorithm) for estimating the (normalized) Betti numbers. The q-th Betti number represents the number of "q-dimensional holes" of a simplicial complex. By estimating the Betti numbers for the simplicial complexes that arise from each step of the filtration, the LGZ algorithm can be used to learn the topological structures of data. A function that represents how the Betti numbers vary within the filtration is called a Betti function or a Betti curve [6]. Therefore, the LGZ algorithm can approximately recover the Betti curve of arbitrary dimensions. The LGZ algorithm demonstrates an exponential speedup over the best known classical algorithm for estimating the high-dimensional Betti numbers. A proof-of-principle experiment of the LGZ algorithm has also been performed [24].
Estimating the Betti numbers for each element of the filtration is not enough to fully perform TDA based on persistent homology. To perform such a full TDA, it is crucial to estimate the persistent Betti numbers. The q-th (t, s)-persistent Betti number is the number of q-dimensional holes that exist at scale parameter t and are still alive at s. (For a formal definition, see Section 2.) Because the (t, t)-persistent Betti number is just the usual Betti number, estimating the persistent Betti numbers is a more general task than estimating the Betti numbers. It is shown in [29,31] that the LGZ algorithm can be used to estimate the persistent Betti numbers of the zeroth dimension but fails in higher dimensions in general. The zeroth dimension is not a regime where quantum computation offers a benefit, because the classical algorithms run in polynomial time for constant dimensions. Therefore, constructing a quantum algorithm for persistent Betti numbers of arbitrary dimensions has been a significant open problem for quantum TDA.
Our result. We present the first quantum algorithm for estimating the persistent Betti numbers of arbitrary dimensions. We provide the error and complexity analysis and the conditions for the algorithm to succeed. The formal statement is given in Theorem 7 in Section 4. We resolve the open problem of estimating the persistent Betti numbers by estimating the spectral density of the ground states of a certain positive-semidefinite Hermitian operator using quantum computation. This positive-semidefinite operator is called the persistent combinatorial Laplacian [30,39]. (The persistent combinatorial Laplacian is different from the combinatorial Laplacian, which is used in the LGZ algorithm.) Estimating the persistent Betti numbers allows one to recover topological summaries of data such as the persistent Barcodes.
Comparison with the classical TDA. The best known classical algorithm for computing the persistent Betti numbers of a simplicial pair K → L [30] runs in time $O((n_q^L)^{\omega})$ for ω < 2.373, where $n_q^L$ is the number of q-simplices of L (assuming $n_q^L = O(n_{q+1}^L)$, which holds in many practical cases such as for Vietoris-Rips or Čech complexes). For an approximation task, the best-known classical algorithms for estimating the spectral densities of matrices [7,11,26] run in time linear in the number of non-zero elements of the matrices. The known classical algorithm for computing the matrix representation of the persistent Laplacian [30] takes $O((n_q^L)^{\omega} + n_{q+1}^L)$ time. Therefore, there are no known classical algorithms for computing or approximating the persistent Betti numbers efficiently when $n_q^L$ is exponential in the number of vertices n.
We present our quantum algorithm using the membership oracle for simplicial complexes. Therefore, it is necessary that we can efficiently construct the membership function so that we can efficiently run the quantum algorithm. Indeed, we can efficiently implement the membership functions for the construction of simplicial complexes such as the Vietoris-Rips complex and the lazy witness complex, as we see in Appendix B. Therefore, our quantum algorithm demonstrates exponential speedup over the best-known classical algorithm for instances of problems that satisfy the promises of Theorem 7.
Nevertheless, the drawback of our result is that we can only estimate the normalized version of the persistent Betti numbers. It is not known whether the estimation of such normalized persistent Betti number is useful or not. It is an important future direction to understand whether there are practical instances of problems that satisfy the conditions of Theorem 7 and whether the estimation of the normalized persistent Betti number is useful or not.
Related Work. The LGZ algorithm [27] estimates the Betti numbers using the quantum phase estimation algorithm. Because Betti numbers are special cases of persistent Betti numbers, a slight modification of our algorithm immediately gives a quantum algorithm for estimating the normalized Betti numbers based on the block-encoding and QSVT. The gate complexity of implementing a quantum algorithm which outputs 1 with a probability $\tilde{p}_1$ that is within additive error $\epsilon$ of the normalized Betti number is $\tilde{O}(\mathrm{poly}(n)\log(1/\epsilon))$, which is an exponential improvement in terms of the error parameter compared to the LGZ algorithm. (Nonetheless, this exponential improvement does not actually improve the complexity in terms of error due to the sampling error.) Recently, a NISQ algorithm for the Betti numbers was presented in [38]. Their algorithm is based on the following observations: (a) a boundary map can be represented as a sum of Pauli operators; (b) instead of using Grover's quantum search to implement the uniform mixture of the q-simplex states, quantum rejection sampling can be used; (c) stochastic rank estimation can be used to estimate the Betti numbers instead of the quantum phase estimation. Based on these observations, their quantum algorithm achieves a depth complexity that is linear in n at the cost of the success probability $1/d_q^K$ of the rejection sampling. It is an open problem to construct a quantum algorithm for the persistent Betti numbers in a way that is preferable for NISQ devices.
Organization. The rest of the paper is organized as follows. In Section 2, we give the preliminaries on persistent homology and TDA. In Section 3, we review the block-encoding and quantum singular value transformation. In Section 4, we provide a quantum algorithm for estimating the persistent Betti numbers, which is our main result. In Section 5 and Section 6, we present the implementation of some unitaries which are required to conclude the proof of the main result.

Preliminaries on persistent homology
In this section, we introduce persistent homology for simplicial complexes. We first introduce the necessary terminology: the simplicial complex, the filtration, the Betti numbers, and the persistent Betti numbers. Then we introduce the recent result of [30] that the persistent Betti number can be calculated as the nullity of the so-called persistent Laplacian operator (Theorem 1) and that the persistent Laplacian can be implemented using the Schur complement (Theorem 2). Table 1 gives an overview of the notation.
An abstract simplicial complex K over a finite ordered set V of vertices is a collection of subsets of V such that for any σ ∈ K, if τ ⊆ σ then τ ∈ K. An element σ ∈ K is called a q-simplex if |σ| = q + 1. For example, a 0-simplex is called a vertex, and a 1-simplex is called a line. We denote the set of q-simplices of K by $S_q^K$. An oriented simplex is a simplex in K with a fixed ordering of vertices inherited from the ordering of V. We denote the oriented simplex of σ ∈ K as $\bar{\sigma}$. Let $\bar{S}_q^K := \{\bar{\sigma} : \sigma \in S_q^K\}$ be the set of oriented q-simplices of K. The q-th chain group $C_q^K$ of K is the vector space over $\mathbb{R}$ with basis $\bar{S}_q^K$. (We work with chain groups with coefficients in $\mathbb{R}$, which is suitable for our quantum algorithm that relies on Hodge theory and the persistent Laplacian. The classical algorithms for persistent homology of [42] work for field coefficients such as $\mathbb{Z}_2$. The Betti numbers of homology groups over $\mathbb{R}$ do not capture the torsion [25]. Nonetheless, this seems to be enough for many of the situations in TDA.) Let $n_q^K := \dim(C_q^K) = |S_q^K|$ be the dimension of $C_q^K$.
The boundary map $\partial_q^K : C_q^K \to C_{q-1}^K$ is defined as
$\partial_q^K([v_0, \dots, v_q]) := \sum_{i=0}^{q} (-1)^i [v_0, \dots, \hat{v}_i, \dots, v_q]$,
where the hat indicates that the vertex $v_i$ is omitted. Then, the q-th homology group of K is
$H_q^K := \ker(\partial_q^K) / \operatorname{im}(\partial_{q+1}^K)$,
and the q-th Betti number of K is
$\beta_q^K := \dim H_q^K$.
The q-th combinatorial Laplacian $\Delta_q^K : C_q^K \to C_q^K$ is defined as follows:
$\Delta_q^K := \underbrace{\partial_{q+1}^K (\partial_{q+1}^K)^*}_{\Delta_{q,\mathrm{up}}^K} + \underbrace{(\partial_q^K)^* \partial_q^K}_{\Delta_{q,\mathrm{down}}^K}$,
where $(\partial_q^K)^* : C_{q-1}^K \to C_q^K$ is the adjoint of $\partial_q^K$ under the inner product $\langle \cdot, \cdot \rangle_{C_q^K}$ such that $\bar{S}_q^K$ is an orthonormal basis of $C_q^K$. We have introduced $\Delta_{q,\mathrm{up}}^K$ and $\Delta_{q,\mathrm{down}}^K$ for convenience of notation. The Betti number can be calculated as the nullity of the combinatorial Laplacian [16]:
$\beta_q^K = \dim \ker(\Delta_q^K)$.
This relationship comes from the following elementary result of Hodge theory:

Claim 1 ([25])
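Let $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$, and assume $AB = 0$. Then we have $\dim(\ker(A)/\operatorname{im}(B)) = \dim \ker(A^{T}A + BB^{T})$.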
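The relationship between the Betti number and the nullity of the combinatorial Laplacian can be checked numerically on a small complex. The sketch below is illustrative only: the hollow and filled triangles, the edge ordering, and the helper names `nullity` and `betti_1` are assumptions made for this example.

```python
import numpy as np


def nullity(sym, tol=1e-9):
    """Number of (numerically) zero eigenvalues of a symmetric matrix."""
    return int(np.sum(np.abs(np.linalg.eigvalsh(sym)) < tol))


# Boundary matrix of the edges [0,1], [0,2], [1,2] (columns) over vertices 0, 1, 2 (rows).
B1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]], dtype=float)
# Boundary of the triangle [0,1,2] = [1,2] - [0,2] + [0,1], in the edge basis above.
B2_filled = np.array([[1], [-1], [1]], dtype=float)


def betti_1(b1, b2=None):
    """beta_1 as the nullity of Delta_1 = B2 B2^T + B1^T B1 (B2 absent: no 2-simplices)."""
    n_edges = b1.shape[1]
    up = np.zeros((n_edges, n_edges)) if b2 is None else b2 @ b2.T
    return nullity(up + b1.T @ b1)


beta1_hollow = betti_1(B1)             # hollow triangle: one 1-dimensional hole
beta1_filled = betti_1(B1, B2_filled)  # filled triangle: the hole is gone

# Cross-check with dim ker(A) - rank(B) for A = B1 and B = B2.
kernel_dim = B1.shape[1] - np.linalg.matrix_rank(B1)
check_hollow = kernel_dim - 0  # no 2-simplices, so rank(B2) = 0
check_filled = kernel_dim - np.linalg.matrix_rank(B2_filled)
```

Both routes give the same value for each complex, as expected when the boundary matrices satisfy B1 B2 = 0.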

Filtration and Persistent homology
A sequence of nested subcomplexes $\mathcal{F} = \{K_1 \subseteq K_2 \subseteq \cdots \subseteq K_m\}$ is called a filtration. Similarly, we say a pair of simplicial complexes K and L is a simplicial pair if K and L are simplicial complexes over the same ordered set V such that K ⊆ L.
We denote a simplicial pair by K → L. For a simplicial pair K → L, the inclusion induces a map between the q-th homology groups of K and L, $H_q^K \to H_q^L$. The q-th persistent homology group $H_q^{K,L}$ is defined as the image of this map. The q-th persistent Betti number $\beta_q^{K,L}$ is defined as the rank of this map. The persistent homology group consists of the homology classes of K that are still alive at L. That is, $H_q^{K,L} \cong \ker(\partial_q^K) / (\operatorname{im}(\partial_{q+1}^L) \cap \ker(\partial_q^K))$. Note that the difference of the q-th Betti numbers of K and L, $\beta_q^L - \beta_q^K$, does not reveal the same information as $\beta_q^{K,L}$, because $\beta_q^{K,L}$ is the number of topological invariants that exist at K and are still alive at L.

Persistent Laplacian
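The persistent Laplacian was first introduced in [39]. Later, the properties and implementations of the persistent Laplacian were further studied in [30]. Here, we introduce the q-th persistent Laplacian for a simplicial pair K → L. Consider the subspace of $C_q^L$ given by $C_q^{L,K} := \{c \in C_q^L : \partial_q^L(c) \in C_{q-1}^K\}$. Then, let $\partial_q^{L,K}$ be the restriction of the boundary operator $\partial_q^L$ to $C_q^{L,K}$, so that the image of $\partial_q^{L,K}$ is contained in $C_{q-1}^K$. Then, the q-th persistent Laplacian $\Delta_q^{K,L} : C_q^K \to C_q^K$ is defined as
$\Delta_q^{K,L} := \underbrace{\partial_{q+1}^{L,K} (\partial_{q+1}^{L,K})^*}_{\Delta_{q,\mathrm{up}}^{K,L}} + \underbrace{(\partial_q^K)^* \partial_q^K}_{\Delta_{q,\mathrm{down}}^K}$.
It is shown in [30] that the nullity of the q-th persistent Laplacian equals the q-th persistent Betti number (Theorem 1):
$\beta_q^{K,L} = \dim \ker(\Delta_q^{K,L})$.
We provide a proof in Appendix A for completeness.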
In addition to the persistent Betti numbers from the nullity of the persistent Laplacian, it is suggested in [39] that the spectral property of the persistent Laplacian beyond the zero eigenvalues would provide useful information about the data.
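Matrix representation. We can give a matrix representation $B_q^K$ of $\partial_q^K$ with respect to the canonical bases of $C_q^K$ and $C_{q-1}^K$. Choosing an orthonormal basis of $C_{q+1}^{L,K}$, let $B_{q+1}^{L,K}$ be the corresponding matrix representation of $\partial_{q+1}^{L,K}$. Then the matrix representation of the persistent Laplacian is
$\Delta_q^{K,L} = B_{q+1}^{L,K} (B_{q+1}^{L,K})^{T} + (B_q^K)^{T} B_q^K$.
In [30], the authors have given two algorithms for computing the matrix representation of $\Delta_q^{K,L}$. We adopt the second algorithm, which uses the Schur complement. The Schur complement is defined as follows. For a matrix $M \in \mathbb{R}^{n \times n}$ and a non-empty proper subset $I \subsetneq [n]$, the (generalized) Schur complement of $M(I, I)$ in M is $M/M(I, I) := M([n] \setminus I, [n] \setminus I) - M([n] \setminus I, I)\, M(I, I)^{+}\, M(I, [n] \setminus I)$, where $M(I, J)$ denotes the submatrix of M with rows indexed by I and columns indexed by J, and $^{+}$ denotes the Moore-Penrose pseudo-inverse. It is shown in [30] that $\Delta_{q,\mathrm{up}}^{K,L}$ can be implemented using the Schur complement: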

Theorem 2 ([30])
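Let K → L be a simplicial pair. Assume that $n_q^K < n_q^L$ and let $I_K^L := [n_q^L] \setminus [n_q^K]$. Then $\Delta_{q,\mathrm{up}}^{K,L} = \Delta_{q,\mathrm{up}}^{L} / \Delta_{q,\mathrm{up}}^{L}(I_K^L, I_K^L)$.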

Remark 1 Finding the orthonormal basis of $C_{q+1}^{L,K}$ is non-trivial, and therefore building $B_{q+1}^{L,K}$ is also non-trivial. (The basis can be found by matrix reduction; see [30] for detail.) For the quantum algorithm we present in this paper, we do not need to classically compute the matrix representation of $\Delta_q^{K,L}$. Instead, we build the block-encoding of $\Delta_q^{K,L}$ directly, using the quantum membership functions together with the method of [30] that uses the Schur complement.
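The Schur-complement route to the persistent Laplacian can be illustrated classically on a toy simplicial pair. The following is a minimal sketch under stated assumptions: the Schur complement is taken with the Moore-Penrose pseudo-inverse (the form $\Delta_1 - \Delta_2 \Delta_4^{+} \Delta_3$ used in Section 6), the q-simplices of K are ordered before the remaining q-simplices of L, and the complexes (a hollow square K inside a partially or fully filled square L) and helper names are hypothetical examples.

```python
import numpy as np


def nullity(sym, tol=1e-9):
    """Number of (numerically) zero eigenvalues of a symmetric matrix."""
    return int(np.sum(np.abs(np.linalg.eigvalsh(sym)) < tol))


def persistent_laplacian(b_q_K, b_qp1_L, n_K):
    """Delta_q^{K,L} from boundary matrices.

    Rows of b_qp1_L index the q-simplices of L, with the n_K simplices of K first.
    """
    up_L = b_qp1_L @ b_qp1_L.T
    keep, drop = slice(0, n_K), slice(n_K, None)
    # Schur complement of the block indexed by the q-simplices in L but not in K.
    schur = up_L[keep, keep] - up_L[keep, drop] @ np.linalg.pinv(up_L[drop, drop]) @ up_L[drop, keep]
    return schur + b_q_K.T @ b_q_K


# K: hollow square with edges [0,1], [1,2], [2,3], [0,3] (columns) over vertices 0..3 (rows).
B1_K = np.array([[-1,  0,  0, -1],
                 [ 1, -1,  0,  0],
                 [ 0,  1, -1,  0],
                 [ 0,  0,  1,  1]], dtype=float)
# L adds the diagonal edge [0,2] as a fifth edge, plus one or two triangles.
t012 = [1, 1, 0, 0, -1]   # boundary of [0,1,2] = [1,2] - [0,2] + [0,1]
t023 = [0, 0, 1, -1, 1]   # boundary of [0,2,3] = [2,3] - [0,3] + [0,2]
B2_L_full = np.array([t012, t023], dtype=float).T   # both triangles: square filled
B2_L_half = np.array([t012], dtype=float).T         # only one triangle filled

beta_full = nullity(persistent_laplacian(B1_K, B2_L_full, n_K=4))
beta_half = nullity(persistent_laplacian(B1_K, B2_L_half, n_K=4))
```

In this example the 1-cycle of K dies when both triangles are present and survives when only one is, so the two nullities differ even though K is the same.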

Correspondence with Quantum computation
We can use the Hilbert space of n qubits to represent a chain complex of a simplicial complex over n vertices. We associate a q-simplex with an n-bit string of Hamming weight q + 1, where the positions of the 1s correspond to the indices of the vertices of the simplex. We denote the corresponding state by $|\sigma_q\rangle$, where $\sigma_q$ is an n-bit string with Hamming weight q + 1. We denote by $W_q$ the $\binom{n}{q+1}$-dimensional Hilbert space in $\mathbb{C}^{2^n}$ spanned by computational basis states of Hamming weight q + 1, and by $\mathcal{H}_q^K \subseteq W_q$ the subspace spanned by $\{|\sigma_q\rangle : \sigma_q \in \bar{S}_q^K\}$. A superposition of the quantum states $\{|\sigma_q\rangle : \sigma_q \in \bar{S}_q^K\}$ is the quantum state representation of a q-chain. The boundary operator maps as $\partial_q^K : \mathcal{H}_q^K \to \mathcal{H}_{q-1}^K$. A summary of notations is given in Table 1.
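The correspondence between simplices and bit strings can be sketched classically as follows. This is an illustrative sketch; the vertex indexing from 0 (leftmost bit) and the helper names `simplex_to_bits` and `hamming_weight_states` are assumptions of this example.

```python
from itertools import combinations
from math import comb


def simplex_to_bits(simplex, n):
    """Encode a simplex (iterable of vertex indices in 0..n-1) as an n-bit string."""
    vertices = set(simplex)
    return "".join("1" if v in vertices else "0" for v in range(n))


def hamming_weight_states(n, q):
    """All n-bit strings of Hamming weight q + 1, i.e. the basis labels of W_q."""
    return [simplex_to_bits(s, n) for s in combinations(range(n), q + 1)]


edge = simplex_to_bits((0, 2), n=4)      # the 1-simplex {v0, v2}
basis_w1 = hamming_weight_states(4, 1)   # labels of all possible 1-simplices on 4 vertices
```

The length of `basis_w1` equals `comb(4, 2)`, matching the dimension of the Hamming weight 2 subspace for n = 4.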
Table 1: Summary of notations. The table lists: the number of vertices; the set of oriented q-simplices in K; the q-th chain group of K with coefficients in $\mathbb{R}$; the canonical basis of $C_q^K$; and $W_q$, the subspace of $\mathbb{C}^{2^n}$ spanned by Hamming weight q + 1 states.

Block-encoding and quantum singular value transformation
In this section, we review the block-encoding of quantum operators and the quantum singular value transformation (QSVT). We refer the reader to [18,28] for more detail. We use $\|\cdot\|$ for the spectral norm and $\|\cdot\|_{\diamond}$ for the diamond norm. The diamond norm is defined as $\|\Lambda\|_{\diamond} := \max_{\rho} \|(\Lambda \otimes I)(\rho)\|_1$, where $\|\cdot\|_1$ is the trace norm. We also use the notation $\tilde{O}(\cdot)$, which omits $\mathrm{poly}(\log n)$ and $\mathrm{poly}(\log\log(1/\epsilon))$ factors.

Block-encoding and QSVT
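Let $A = \tilde{\Pi} U \Pi$, where $\tilde{\Pi}, \Pi$ are orthogonal projectors and U is a unitary. We call such U a projected unitary encoding of A. In particular, a block-encoding of a quantum operator is defined as follows:
Definition 2 (Block-encoding [18]) Suppose that A is an s-qubit operator, $\alpha, \epsilon \in \mathbb{R}_{+}$, and $a \in \mathbb{N}$. We say that the (s + a)-qubit unitary U is an $(\alpha, a, \epsilon)$-block-encoding of A, if
$\| A - \alpha (\langle 0|^{\otimes a} \otimes I^{\otimes s})\, U\, (|0\rangle^{\otimes a} \otimes I^{\otimes s}) \| \le \epsilon$.
Here, I is the 1-qubit identity operator.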
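Such an encoding is called a block-encoding because U approximates the block matrix
$U \approx \begin{pmatrix} A/\alpha & \cdot \\ \cdot & \cdot \end{pmatrix}$.
It is shown in [18] how to implement a unitary which realizes a polynomial transformation of the singular values of the block-encoded matrices. Let $A = \tilde{\Pi} U \Pi$, $d := \mathrm{rank}(\Pi)$, $\tilde{d} := \mathrm{rank}(\tilde{\Pi})$ and $d_{\min} := \min(d, \tilde{d})$. Then by the singular value decomposition, there exist orthonormal bases $\{|\psi_i\rangle\}_{i \in [d]}$ of the image of $\Pi$ and $\{|\tilde{\psi}_i\rangle\}_{i \in [\tilde{d}]}$ of the image of $\tilde{\Pi}$ such that
$A = \sum_{i=1}^{d_{\min}} a_i |\tilde{\psi}_i\rangle\langle\psi_i|$,
where $a_1 \ge a_2 \ge \cdots \ge a_{d_{\min}}$. Then, the quantum singular value transformation (QSVT) of A is defined for an odd or even polynomial function $f : \mathbb{R} \to \mathbb{C}$ as
$f^{(SV)}(A) := \sum_{i=1}^{d_{\min}} f(a_i) |\tilde{\psi}_i\rangle\langle\psi_i|$ if f is odd, and $f^{(SV)}(A) := \sum_{i=1}^{d} f(a_i) |\psi_i\rangle\langle\psi_i|$ if f is even, where $a_i := 0$ for $i > d_{\min}$.
The following is shown in [18]. This is a slightly modified version of Corollary 18 of [18].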
and other single qubit gates, where $C_{\Pi}\mathrm{NOT}$ is defined as $C_{\Pi}\mathrm{NOT} := X \otimes \Pi + I \otimes (I - \Pi)$. We give a proof in Appendix A. The corollary follows when U is a block-encoding of A.
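To make the notion of a block-encoding concrete, the following numerical sketch builds an exact block-encoding with a single ancilla qubit for a small Hermitian matrix and checks that the top-left block reproduces A/α. It is a minimal illustration under stated assumptions: A is Hermitian with ‖A‖ ≤ α, the ancilla is the first tensor factor, and the function name `block_encode_hermitian` is hypothetical. It is not the construction used for the boundary operators in this paper.

```python
import numpy as np


def block_encode_hermitian(a, alpha):
    """Return a unitary whose top-left block is a / alpha (requires ||a|| <= alpha)."""
    scaled = a / alpha
    evals, evecs = np.linalg.eigh(scaled)
    # sqrt(I - scaled^2), computed in the eigenbasis; it commutes with `scaled`.
    root = evecs @ np.diag(np.sqrt(np.clip(1.0 - evals**2, 0.0, None))) @ evecs.conj().T
    return np.block([[scaled, root], [root, -scaled]])


A = np.array([[2.0, 1.0], [1.0, 2.0]])  # eigenvalues 1 and 3
alpha = 4.0
U = block_encode_hermitian(A, alpha)

dim = A.shape[0]
# Isometry |0> (x) I that selects the ancilla-|0> block.
embed = np.kron(np.array([[1.0], [0.0]]), np.eye(dim))
top_left = embed.T @ U @ embed
encoding_error = np.linalg.norm(A - alpha * top_left, ord=2)
```

Here `encoding_error` plays the role of ε in Definition 2, so this U is a (4, 1, 0)-block-encoding of A up to floating-point error.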

QSVT with some useful polynomials
In this paper, we use QSVT with three different polynomials. First, we introduce the fixed-point amplitude amplification using polynomial approximations of the sign function (Lemma 25 of [18]):
Theorem 4 (Fixed-point amplitude amplification [18]) Let U be a unitary and $\Pi$ be an orthogonal projector such that $\Pi U |\psi_0\rangle = a |\psi_G\rangle$, and $a > \delta > 0$. There is a unitary circuit $\tilde{U}$ such that $\| |\psi_G\rangle - \tilde{U} |\psi_0\rangle \| \le \epsilon$, which uses a single ancilla qubit and consists of $O(\log(1/\epsilon)/\delta)$ applications of $U$, $U^{\dagger}$, $C_{\Pi}\mathrm{NOT}$, $C_{|\psi_0\rangle\langle\psi_0|}\mathrm{NOT}$ and single qubit gates.
We use this theorem for the state preparation of our algorithm in Section 5. Next, we introduce the implementation of the Moore-Penrose pseudo-inverse. Suppose $\tilde{\Pi} U \Pi = A$ and $A = W \Sigma V^{\dagger}$ is a singular value decomposition, where $\Sigma$ is a diagonal matrix with non-negative and non-increasing entries. Then the pseudo-inverse of A is defined as $A^{+} := V \Sigma^{+} W^{\dagger}$, where $\Sigma^{+}$ contains the inverses of the diagonal elements of $\Sigma$ except for 0, which remains 0. We can implement the pseudo-inverse using the polynomial approximation of 1/x [18]:
Theorem 5 (Implementing the Moore-Penrose pseudo-inverse) Suppose that A is an n-qubit positive-semidefinite Hermitian operator and U is an $(\alpha, a, 0)$-block-encoding of A. Assume that the smallest non-zero eigenvalue of A is $\lambda_{\min}$ and let $0 < 1/\kappa < \lambda_{\min}/\alpha$.
This is a slight modification of Theorem 41 of [18] for the block-encoding of a positive-semidefinite operator. A proof is given in Appendix A. We also use QSVT with a polynomial approximation of the rectangle function in Section 6.2 in order to implement the projector.
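The definition of the pseudo-inverse through the singular value decomposition can be checked with a few lines of linear algebra. This is a classical sketch of the definition only, not of the QSVT-based implementation; the function name `pseudo_inverse`, the tolerance, and the example matrix are assumptions.

```python
import numpy as np


def pseudo_inverse(a, tol=1e-12):
    """A^+ = V Sigma^+ W^dagger for a square matrix A = W Sigma V^dagger."""
    w, sigma, vh = np.linalg.svd(a)
    # Invert the non-zero singular values; zeros stay zero.
    sigma_plus = np.array([1.0 / s if s > tol else 0.0 for s in sigma])
    return vh.conj().T @ np.diag(sigma_plus) @ w.conj().T


# A singular positive-semidefinite matrix (eigenvalues 0, 3, 3).
A = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
A_plus = pseudo_inverse(A)
```

The smallest non-zero singular value (here 3) is what controls the conditioning of the inversion, which is the role played by $\lambda_{\min}$ and κ in Theorem 5.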

Block-measurement
Finally, we review the block-measurement, which was introduced in [35]. Using an approximate block-encoding of $\Pi$, we can implement a quantum channel that is close to the following map:
$|0\rangle \otimes |\psi\rangle \mapsto |1\rangle \otimes \Pi |\psi\rangle + |0\rangle \otimes (I - \Pi) |\psi\rangle$. (4)
Let $U_{\Pi}$ be the exact block-encoding of $\Pi$ with m ancilla qubits. Then, consider the following circuit V.
Figure 1: The quantum circuit for block-measurement.
The CNOT gate refers to $X \otimes |0^m\rangle\langle 0^m| + I \otimes (I_m - |0^m\rangle\langle 0^m|)$, where X is the Pauli-X matrix. It can be seen that V satisfies
$(I \otimes \langle 0^m| \otimes I_n)\, V\, (I \otimes |0^m\rangle \otimes I_n) = X \otimes \Pi + I \otimes (I_n - \Pi)$.
This means that V is a block-encoding of $V_{\Pi} := X \otimes \Pi + I \otimes (I_n - \Pi)$ if $U_{\Pi}$ is a block-encoding of $\Pi$ with no error. Let $\Lambda_{\Pi}$ denote the quantum channel realized by V, and let $\Lambda_0(\rho)$ denote the quantum channel of (4). Consider the case where what we can implement is not an exact block-encoding of $\Pi$ but an approximate one. Let $\tilde{V}$ and $\tilde{\Lambda}_{\Pi}$ be the quantum circuit and channel which are implemented in the same way as V and $\Lambda_{\Pi}$, using $\tilde{U}_{\Pi}$ instead of $U_{\Pi}$. The block-measurement theorem gives an upper bound on the diamond norm between $\tilde{\Lambda}_{\Pi}$ and the ideal quantum channel $\Lambda_0$:
Theorem 6 (Block-measurement [35]) Let $\Pi$ be a projector, $V_{\Pi} = X \otimes \Pi + I \otimes (I_n - \Pi)$ be a unitary and A be a Hermitian matrix satisfying $\|\Pi - A^2\| \le \epsilon$. Also, A has a block-encoding $\tilde{U}_{\Pi}$. Then the channel $\tilde{\Lambda}_{\Pi}$ approximates the quantum channel $\Lambda_0(\rho)$ in diamond norm:

Quantum algorithm for additive error estimation of normalized Persistent Betti numbers
This section presents a quantum algorithm for estimating the persistent Betti numbers of arbitrary dimensions for a simplicial pair K → L. We assume we have access to quantum membership oracles $O_q^K$ and $O_{q+1}^L$ that return whether a simplex is contained in K and L, respectively. For any $\sigma \in \{0, 1\}^n$ and $a \in \{0, 1\}$, they act as
$O_q^K |\sigma\rangle |a\rangle = |\sigma\rangle |a \oplus f_q^K(\sigma)\rangle$, $O_{q+1}^L |\sigma\rangle |a\rangle = |\sigma\rangle |a \oplus f_{q+1}^L(\sigma)\rangle$,
where the membership function $f_q^K(\sigma)$ equals 1 if σ represents a q-simplex of K and 0 otherwise, and similarly for $f_{q+1}^L$. For a simplicial complex with n vertices, the number of all possible simplices is exponential in n. However, we can efficiently implement such a membership function for some constructions of the filtration that are commonly used in TDA, as we see in Appendix B. For example, the most straightforward way to implement the membership function for the Vietoris-Rips complex is to use QRAM access to the n × n adjacency matrix that represents the connectivity of the points. Alternatively, because the adjacency matrix has only n × n entries, we can efficiently implement the membership function for the Vietoris-Rips complex with a quantum circuit that computes the adjacency on demand.
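The reason the Vietoris-Rips membership function is cheap can be seen from its classical counterpart: a bit string represents a q-simplex exactly when it has Hamming weight q + 1 and all pairs of selected vertices are adjacent. The sketch below is a classical illustration, not the quantum oracle of Appendix B; the convention that two points are adjacent when their distance is at most `scale`, and the names `rips_adjacency` and `rips_membership`, are assumptions of this example.

```python
from itertools import combinations
from math import dist


def rips_adjacency(points, scale):
    """n x n adjacency matrix: distinct points are adjacent if within distance `scale`."""
    n = len(points)
    return [[i != j and dist(points[i], points[j]) <= scale for j in range(n)]
            for i in range(n)]


def rips_membership(bits, adjacency, q):
    """Return 1 if the n-bit string `bits` encodes a q-simplex of the Rips complex, else 0."""
    vertices = [i for i, b in enumerate(bits) if b == "1"]
    if len(vertices) != q + 1:
        return 0
    # Only O(q^2) adjacency look-ups are needed per query.
    return int(all(adjacency[u][v] for u, v in combinations(vertices, 2)))


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
adj_small = rips_adjacency(square, scale=1.1)   # sides only
adj_large = rips_adjacency(square, scale=1.5)   # sides and diagonals
```

Each query touches at most $\binom{q+1}{2}$ entries of the adjacency matrix, which is polynomial in n even when the number of simplices is exponential.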
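Our main result is as follows:
Theorem 7 Let K → L be a simplicial pair that satisfies the following promises:
(P1) K is q-simplex dense: $d_q^K = n_q^K / \binom{n}{q+1} \in \Omega(1/\mathrm{poly}(n))$,
(P2) $\Delta_{q,\mathrm{up}}^{L}(I_K^L, I_K^L)$ has an inverse-polynomial gap: $\gamma_{\min}^q \in \Omega(1/\mathrm{poly}(n))$,
(P3) the q-th persistent Laplacian $\Delta_q^{K,L}$ has an inverse-polynomial gap: $\lambda_{\min}^q \in \Omega(1/\mathrm{poly}(n))$.
Then, given access to the membership oracles $O_q^K$ and $O_{q+1}^L$, there is a quantum algorithm that outputs 1 with probability $\tilde{p}_1$ s.t. $|\tilde{p}_1 - \beta_q^{K,L}/n_q^K| \le \epsilon$ for any $\epsilon > 0$, using $\tilde{O}(\mathrm{poly}(n) \log^2(1/\epsilon))$ uses of $O_q^K$, $O_{q+1}^L$ and other quantum gates.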
As a consequence, we obtain the following corollary to estimate the persistent Betti numbers using the Chernoff-Hoeffding inequality. Without the promises (P1)∼(P3), the complexity of our quantum algorithm can be stated as follows:
In the following, we give the proof of Theorem 7 and Theorem 8. The proof consists of three parts. First, we propose the quantum algorithm in Section 4.1, Section 4.2, and Section 4.3. (The constructions of some unitaries are given in Section 5 and Section 6.) Second, we give the error analysis in Section 4.4. Finally, we give the complexity analysis in Section 4.5.
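The role of the Chernoff-Hoeffding inequality is to convert repeated one-bit outputs of the algorithm into an estimate of the output probability. The sketch below assumes the standard two-sided Hoeffding bound, under which N ≥ ln(2/η)/(2ε²) samples suffice for additive error ε with failure probability at most η; the constants of the paper's own corollary are not reproduced here, and the function names are hypothetical.

```python
from math import ceil, log


def hoeffding_samples(eps, eta):
    """Number of i.i.d. one-bit samples so that |mean - p| <= eps with prob. >= 1 - eta."""
    return ceil(log(2.0 / eta) / (2.0 * eps**2))


def estimate_persistent_betti(outcomes, n_q_K):
    """Rescale the empirical frequency of outcome 1 by the number of q-simplices of K."""
    p_hat = sum(outcomes) / len(outcomes)
    return p_hat * n_q_K


shots = hoeffding_samples(eps=0.1, eta=0.05)
toy_estimate = estimate_persistent_betti([1, 0, 0, 1], n_q_K=10)  # made-up outcomes
```

Note that rescaling by $n_q^K$ turns an additive error ε on the normalized quantity into an error $\epsilon\, n_q^K$ on the unnormalized persistent Betti number, which is why only the normalized version is estimated efficiently.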

Description of the quantum algorithm
The quantum circuit used in our algorithm is described in Fig. 2. In Fig. 2, $\tilde{U}_s$ is a unitary for state preparation and $\tilde{U}_{\Pi}$ is a unitary that is a block-encoding of the projection onto the low-energy subspace of the persistent Laplacian. The m qubits ($m \in O(\log n)$) are ancilla qubits for the block-encoding. The measurement procedure surrounded by the dashed box is the "block-measurement" procedure [35] which we have reviewed in Section 3.3.

State PreparationŨ s
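The state preparation unitary $\tilde{U}_s$ is the approximation of the following unitary $U_s$:
$U_s |0^{n+1}\rangle = \frac{1}{\sqrt{n_q^K}} \sum_{i \in [n_q^K]} |\sigma_q^K(i)\rangle |1\rangle$,
which is the uniform superposition over the q-simplices of K. In Section 5, we construct a unitary $\tilde{U}_s$ s.t.
$\| U_s |0^{n+1}\rangle - \tilde{U}_s |0^{n+1}\rangle \| \le \epsilon_{\mathrm{sign}}$
using $O\left(q n^2 \sqrt{1/d_q^K} \log(1/\epsilon_{\mathrm{sign}})\right)$ gates and $O\left(\sqrt{1/d_q^K} \log(1/\epsilon_{\mathrm{sign}})\right)$ uses of $O_q^K$. Here, $d_q^K := n_q^K / \binom{n}{q+1}$. We denote $\tilde{U}_s |0^{n+1}\rangle := |\tilde{\psi}\rangle$. We show that by measuring the state prepared by $\tilde{U}_s$ in the computational basis as in Fig. 2, we can approximately sample from the uniform mixture over the eigenvectors of $\Delta_q^{K,L}$ (see footnote 3). By measuring the first n qubits of $U_s |0^{n+1}\rangle$, the state collapses to some $|\sigma_q^K(i)\rangle$ approximately uniformly at random. Repeating the quantum circuit of Fig. 2 and taking samples multiple times from the collapsed state is equivalent to sampling from a state described by a density matrix $\tilde{\rho}_q^K$ which is close to
$\rho_q^K := \frac{1}{n_q^K} \sum_{i \in [n_q^K]} |\sigma_q^K(i)\rangle\langle\sigma_q^K(i)|$.
$\rho_q^K$ is also the uniform mixture of the eigenvectors of $\Delta_q^{K,L}$, i.e.,
$\rho_q^K = \frac{1}{n_q^K} \sum_{j \in [n_q^K]} |\psi_j\rangle\langle\psi_j|$,
where we denote the eigenvectors of $\Delta_q^{K,L}$ as $|\psi_1\rangle, \dots, |\psi_{n_q^K}\rangle$. This is because, for all i, $|\sigma_q(i)\rangle$ can be written as $|\sigma_q(i)\rangle = \sum_j U_{i,j} |\psi_j\rangle$ for some unknown unitary U, and therefore $\sum_i |\sigma_q(i)\rangle\langle\sigma_q(i)| = \sum_j |\psi_j\rangle\langle\psi_j|$.
We give an upper bound on $\|\rho_q^K - \tilde{\rho}_q^K\|_1$, which we use in the error analysis later. Using purification, $\tilde{\rho}_q^K$ can be written as the reduced state obtained by applying $U_{\mathrm{copy}}$ to $|\tilde{\psi}\rangle$ together with an additional n-qubit register and tracing out that last register, where $U_{\mathrm{copy}}$ is a unitary that copies the first n-qubit register to the last n-qubit register, and the subscript s denotes the last n-qubit register. $\rho_q^K$ can be similarly written with $U_s |0^{n+1}\rangle$ in place of $|\tilde{\psi}\rangle$.
Footnote 3: In the first version of the preprint of this paper [21], we copied the first register to another to obtain $\frac{1}{\sqrt{n_q^K}} \sum_{i \in [n_q^K]} |\sigma_q(i)\rangle \otimes |1\rangle \otimes |\sigma_q(i)\rangle$, and then traced out the latter registers to approximately obtain $\rho_q^K$. It was pointed out by an anonymous reviewer that this copy and discarding operation can be omitted by measuring in the computational basis and repeating the algorithm a number of times. The copy and discarding operation is required if we coherently estimate the amplitude using, for example, amplitude estimation [4].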

Block-measurement with respect toŨ Π
In Section 6, we provide the implementation of $\tilde{U}_{\Pi}$, which is a $(1, m, \epsilon_{\Pi})$-block-encoding of $\Pi := \sum_{j : \lambda_j = 0} |\psi_j\rangle\langle\psi_j|$, assuming that the smallest non-zero eigenvalue of $\Delta_q^{K,L}$ is $\lambda_{\min}^q$. We also assume that the smallest non-zero eigenvalue of the submatrix $\Delta_{q,\mathrm{up}}^{L}(I_K^L, I_K^L)$ is $\gamma_{\min}^q$. The number of oracles and gates used in the construction of $\tilde{U}_{\Pi}$ is derived in Section 6. Here, $\lambda_q$ and $\gamma_q$ are real numbers s.t. $0 < \lambda_q < \lambda_{\min}^q$ and $0 < \gamma_q < \gamma_{\min}^q$, and $\epsilon_{\mathrm{rect}}, \epsilon_{\mathrm{inv}} > 0$ are the real numbers that determine $\epsilon_{\Pi}$ (see Section 6.2).
We perform block-measurement with this $\tilde{U}_{\Pi}$. Let $\Lambda_{\Pi}$ be the quantum channel realized by the ideal block-encoding $U_{\Pi}$ of $\Pi$ with no error. From Theorem 6, the quantum channel $\tilde{\Lambda}_{\Pi}$ realized by the block-measurement with $\tilde{U}_{\Pi}$ is close to the ideal channel in diamond norm, with an error controlled by $\epsilon_{\Pi}$. We use this bound in the error analysis in the next subsection.

Error analysis
It can be seen that for the ideal state preparation and the ideal block-measurement, the probability of outputting 1 is $p_1 = \mathrm{Tr}[\Pi \rho_q^K] = \beta_q^{K,L} / n_q^K$, where, in the circuit of Fig. 2, we denote the m-qubit register as A and the other registers as B, and V is the block-measurement circuit of Fig. 1 with $U_{\Pi}$. The actual probability $\tilde{p}_1$ that the measurement outcome is 1 in the final measurement of Fig. 2 is obtained in the same way with $\tilde{\rho}_q^K$ and $\tilde{V}$, where $\tilde{V}$ is the block-measurement circuit of Fig. 1 with $\tilde{U}_{\Pi}$. The difference between the ideal probability and the actual probability is bounded using (6) and (7).
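The ideal output probability can be reproduced classically for a small operator: with the maximally mixed state on the space of q-simplices of K and the projector onto the zero eigenspace, the trace equals the nullity divided by the dimension. The sketch below is an illustration under assumptions; the 3 × 3 matrix is the down Laplacian of a hollow triangle chosen as a toy stand-in for $\Delta_q^{K,L}$, and the function names are hypothetical.

```python
import numpy as np


def kernel_projector(laplacian, tol=1e-9):
    """Projector onto the zero eigenspace of a Hermitian positive-semidefinite matrix."""
    evals, evecs = np.linalg.eigh(laplacian)
    zero_vecs = evecs[:, np.abs(evals) < tol]
    return zero_vecs @ zero_vecs.conj().T


def ideal_output_probability(laplacian):
    """Tr[Pi rho] for rho the uniform mixture over the computational basis states."""
    n = laplacian.shape[0]
    rho = np.eye(n) / n
    return float(np.real(np.trace(kernel_projector(laplacian) @ rho)))


# Toy Laplacian with eigenvalues 0, 3, 3 (one zero mode on three basis states).
toy_laplacian = np.array([[ 2.0, 1.0, -1.0],
                          [ 1.0, 2.0,  1.0],
                          [-1.0, 1.0,  2.0]])
p1 = ideal_output_probability(toy_laplacian)
# Per-basis-state acceptance probabilities <i|Pi|i>; their average is the same number.
acceptance = np.real(np.diag(kernel_projector(toy_laplacian)))
```

The average of `acceptance` over the basis states coincides with `p1`, which mirrors the fact that measuring in the computational basis and repeating gives the same statistics as the uniform mixture.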

Analysis of the number of gates and the use of the oracles.
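The total number of gates and uses of oracles is the sum of the costs of the state preparation (Section 4.2) and of the block-measurement with $\tilde{U}_{\Pi}$ (Section 4.3). If we want to make $|p_1 - \tilde{p}_1| \le \epsilon$, we can take $\epsilon_{\Pi} \in O(\epsilon)$, $\epsilon_{\mathrm{sign}} \in O(\epsilon)$, $\epsilon_{\mathrm{rect}} \in O(\epsilon)$, and $\epsilon_{\mathrm{inv}}$ small enough that $\epsilon_{\Pi} \in O(\epsilon)$ holds. Therefore, if $d_q^K = n_q^K / \binom{n}{q+1} \in \Omega(1/\mathrm{poly}(n))$, $\gamma_{\min}^q \in \Omega(1/\mathrm{poly}(n))$ and $\lambda_{\min}^q \in \Omega(1/\mathrm{poly}(n))$, the quantum circuit uses $\tilde{O}(\mathrm{poly}(n) \log^2(1/\epsilon))$ uses of $O_q^K$, $O_{q+1}^L$ and other quantum gates.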

Implementation of the state preparation unitary
In this section, we provide the implementation of the state preparation unitary $\tilde{U}_s$ of eq. (5). In order to implement $\tilde{U}_s$, we use a unitary $P_q$ s.t.
$P_q |0^n\rangle = \frac{1}{\sqrt{\binom{n}{q+1}}} \sum_{x \in W_q} |x\rangle$, (8)
where $W_q$ is the set of Hamming weight q + 1 n-bit strings. In [19], the combinatorial number system [36] is used to implement such a unitary. The uniform superposition of the Hamming weight k states is also known as the Dicke state [12]. Instead of following the construction of [19] using the combinatorial number system, we can use the construction of [3] for the Dicke state. Then, we can implement $P_q$ with O(qn) gates and O(n) depth without using any ancilla qubits. (The gate complexity of [19] is similar, but it additionally requires $\tilde{O}(q n^2)$ time of computation for preparing a look-up table.) Then, by adding a qubit and applying the membership oracle $O_q^K$ to the quantum state of (8), we get
$\frac{1}{\sqrt{\binom{n}{q+1}}} \left( \sum_{x \in S_q^K} |x\rangle |1\rangle + \sum_{x \in W_q \setminus S_q^K} |x\rangle |0\rangle \right)$,
in which the component flagged by $|1\rangle$ has norm $\sqrt{d_q^K}$ with $d_q^K = n_q^K / \binom{n}{q+1}$. We call $d_q^K$ the q-simplex density of K. Let $\tilde{\Pi} = I_n \otimes |1\rangle\langle 1|$ and $\Pi = |0^{n+1}\rangle\langle 0^{n+1}|$. Applying Theorem 4 with these projectors, $\tilde{U}_s$ is built from $P_q$, $O_q^K$, $C_{\tilde{\Pi}}\mathrm{NOT}$, $C_{\Pi}\mathrm{NOT}$, and single qubit gates. Each of the $C_{\Pi}\mathrm{NOT}$ gates, which are generalized Toffoli gates, can be decomposed into O(n) elementary gates with an ancilla qubit [23]. Therefore, $\tilde{U}_s$ can be implemented with the number of gates and oracle uses stated in Section 4.2.

Implementation of the block-encoding of the projector $\tilde{U}_{\Pi}$
In this section, we give a construction of $\tilde{U}_{\Pi}$, which is a $(1, m, \epsilon_{\Pi})$-block-encoding of $\Pi$. We first show the implementation of the block-encoding of the persistent Laplacian in Section 6.1. Then we show the construction of a $(1, m, \epsilon_{\Pi})$-block-encoding of $\Pi$ in Section 6.2 using the block-encoding of $\Delta_q^{K,L}$. The implementation of $\tilde{U}_{\Pi}$ includes the inversion of matrices. More precisely, our algorithm includes the implementation of the Schur complement of a matrix, which in turn contains the implementation of the Moore-Penrose pseudo-inverse. The implementation of the Schur complement may be of independent interest. The fact that our algorithm includes matrix inversion implies that it shares characteristics of both the HHL algorithm [20] and the LGZ algorithm.
Implementation of the block-encoding of $\Delta_q^{K}$
We provide the procedure for implementing the block-encoding of $\Delta_q^{K}$ in the following.
Let $\tilde{n} := 2^{\lceil \log n \rceil}$ and $\tilde{q} := 2^{\lceil \log(q+1) \rceil}$. In Appendix C, we construct a unitary $U_q^K$ which is an $(\tilde{n}\tilde{q}, a, 0)$-block-encoding of $\partial_q^K$. $U_q^K$ is implemented using $\tilde{O}(n^2)$ gates and O(1) uses of $O_q^K$. Here $a = O(\log n)$. It follows that $(U_q^K)^{\dagger}$ is an $(\tilde{n}\tilde{q}, a, 0)$-block-encoding of $(\partial_q^K)^*$, because the matrix representation of $(\partial_q^K)^*$ is the transpose of the matrix representation of $\partial_q^K$, whose elements are real. We can implement the block-encoding of $\Delta_{q,\mathrm{down}}^K$ using the following lemma:

Lemma 1 (Product of block-encoded matrices (Lemma 53 of [18])) If U is an $(\alpha, a, \delta)$-block-encoding of an s-qubit operator A, and V is a $(\beta, b, \epsilon)$-block-encoding of an s-qubit operator B, then $(I_b \otimes U)(I_a \otimes V)$ is an $(\alpha\beta, a + b, \alpha\epsilon + \beta\delta)$-block-encoding of AB.
Using this lemma, we can implement a unitary $V_{q,\mathrm{down}}^K$ which is an $(\tilde{n}^2 \tilde{q}^2, 2a, 0)$-block-encoding of $\Delta_{q,\mathrm{down}}^K$. From the construction above, $V_{q,\mathrm{down}}^K$ can be implemented with $\tilde{O}(n^2)$ gates and O(1) uses of $O_q^K$.
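Implementation of the submatrices of $\Delta_{q,\mathrm{up}}^{L}$ and its Schur complement
Let us define the submatrices
$\Delta_1 := \Delta_{q,\mathrm{up}}^{L}([n_q^K], [n_q^K])$, $\Delta_2 := \Delta_{q,\mathrm{up}}^{L}([n_q^K], I_K^L)$, $\Delta_3 := \Delta_{q,\mathrm{up}}^{L}(I_K^L, [n_q^K])$, $\Delta_4 := \Delta_{q,\mathrm{up}}^{L}(I_K^L, I_K^L)$
for convenience. By Theorem 2, $\Delta_{q,\mathrm{up}}^{K,L} = \Delta_1 - \Delta_2 \Delta_4^{+} \Delta_3$. Let us also define $\tilde{q}' := 2^{\lceil \log(q+2) \rceil}$. We can similarly implement a unitary $V_{q,\mathrm{up}}^{L}$ which is an $(\tilde{n}^2 \tilde{q}'^2, 2b, 0)$-block-encoding of $\Delta_{q,\mathrm{up}}^{L} : \mathcal{H}_q^L \to \mathcal{H}_q^L$, where $b = O(\log n)$, using $\tilde{O}(n^2)$ gates and O(1) uses of $O_{q+1}^L$. In order to implement the Schur complement $\Delta_{q,\mathrm{up}}^{L} / \Delta_{q,\mathrm{up}}^{L}(I_K^L, I_K^L)$, we need to access the submatrices of $\Delta_{q,\mathrm{up}}^{L}$. The membership oracle $O_q^K : |x\rangle|0\rangle \to |x\rangle|f_q^K(x)\rangle$ can be used to access those submatrices. Let $\tilde{O}_q^K : |x\rangle|0\rangle \to |x\rangle|f_q^K(x) \oplus 1\rangle$. Then, let us introduce unitaries $V_1$, $V_2$, $V_3$ and $V_4$, which apply $O_q^K$ or $\tilde{O}_q^K$ before and after $V_{q,\mathrm{up}}^{L}$. By applying $\tilde{O}_q^K$ and postselecting the ancilla register to $|0\rangle$, we can restrict the space to the one spanned by $\{|\sigma_q^K(i)\rangle\}_{i \in [n_q^K]}$. Similarly, by applying $O_q^K$ and postselecting the ancilla register to $|0\rangle$, we can restrict the space to the one that is not spanned by $\{|\sigma_q^K(i)\rangle\}_{i \in [n_q^K]}$. Therefore, $V_1$, $V_2$, $V_3$ and $V_4$ are $(\tilde{n}^2 \tilde{q}'^2, 2b + 2, 0)$-block-encodings of $\Delta_1$, $\Delta_2$, $\Delta_3$ and $\Delta_4$, respectively. $V_1 \sim V_4$ can be implemented using $\tilde{O}(n^2)$ gates and O(1) uses of $O_q^K$ and $O_{q+1}^L$. Let $\gamma_{\min}^q$ be the smallest non-zero eigenvalue of $\Delta_4$ and let $\kappa = \tilde{n}^2 \tilde{q}'^2 / \gamma_q$ for $\gamma_q$ s.t. $0 < \gamma_q < \gamma_{\min}^q$. Then, using Theorem 5, we can implement a $(2/\gamma_q, 2b + 3, \epsilon_{\mathrm{inv}})$-block-encoding of $\Delta_4^{+}$. We can implement a unitary $V_{+}$ which is a $(2\tilde{n}^4 \tilde{q}'^4 / \gamma_q, 6b + 7, \tilde{n}^4 \tilde{q}'^4 \epsilon_{\mathrm{inv}})$-block-encoding of $\Delta_2 \Delta_4^{+} \Delta_3$ using Lemma 1. Note that we do not need to know the values of $n_q^K$ and $n_q^L$ to run our algorithm, because we can implement the block-encodings of the submatrices by $V_1$, $V_2$, $V_3$ and $V_4$ using the membership oracles. The membership oracles are used to coherently implement the restrictions to the subspaces corresponding to $I_K^L$, bypassing the need to classically learn $I_K^L$.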

Linear combination of block-encoding unitaries
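Finally, we linearly combine the block-encodings (BE) of $\Delta_1$, $\Delta_2\Delta_4^+\Delta_3$ and $\Delta^K_{q,\mathrm{down}}$. We implement such a linear combination in a way similar to Lemma 52 of [18]. What we have prepared so far are block-encoding unitaries of these three operators; for convenience of notation, we denote them by $U_0$, $U_1$, $U_2$ and their normalization factors by $\alpha_0$, $\alpha_1$, $\alpha_2$. What we implement is a block-encoding of $\Delta^{K,L}_q$. Let $\beta = \alpha_0 + \alpha_1 + \alpha_2$ and let $P_R$, $P_L$ be state preparation unitaries for the coefficients of the linear combination. Suppose $W$ is a unitary such that for all $j \in \{0, 1, 2\}$ and any state $|\psi\rangle$, $W|j\rangle|\psi\rangle = |j\rangle U_j|\psi\rangle$. Then the unitary $\tilde{W}$, obtained by conjugating $W$ with the state preparation unitaries as in Lemma 52 of [18], is a $(\beta, 6b+9, \tilde{n}^4\tilde{q}^4\epsilon_{\mathrm{inv}})$-block-encoding of $\Delta^{K,L}_q$, and $\beta \in O(\tilde{q}^4\tilde{n}^4/\gamma_q)$.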
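The linear combination can be checked numerically on small matrices. The sketch below is an illustration under several assumptions: the three operators are replaced by random real symmetric matrices, each is block-encoded with a single ancilla qubit by the textbook dilation $[[A, S], [S, -A]]$ with $S = \sqrt{I - A^2}$, and the signs $(+, -, +)$ are chosen to mimic the combination $\Delta_1 - \Delta_2\Delta_4^+\Delta_3 + \Delta^K_{q,\mathrm{down}}$. The function names `block_encode` and `lcu` are hypothetical. Only the first columns of $P_L$ and $P_R$ are needed, so the code works with those vectors directly.

```python
import numpy as np

def block_encode(A):
    """Unitary [[A, S], [S, -A]] with S = sqrt(I - A^2); A Hermitian, ||A|| <= 1."""
    w, v = np.linalg.eigh(A)
    S = (v * np.sqrt(np.clip(1.0 - w**2, 0.0, None))) @ v.conj().T
    return np.block([[A, S], [S, -A]])

def lcu(ops, alphas, signs):
    """Return (beta, W, beta * top-left block of (P_L^dag x I) W (P_R x I))."""
    m, k = ops[0].shape[0], len(ops)
    beta = float(sum(alphas))
    n_idx = 4                        # two index qubits; unused indices are padding
    d = np.zeros(n_idx)              # first column of P_R
    d[:k] = np.sqrt(np.array(alphas) / beta)
    c = d.copy()                     # first column of P_L carries the signs
    c[:k] *= np.array(signs)
    W = np.eye(n_idx * 2 * m)        # select unitary: sum_j |j><j| x U_j
    for j, (A, alpha) in enumerate(zip(ops, alphas)):
        W[j*2*m:(j+1)*2*m, j*2*m:(j+1)*2*m] = block_encode(A / alpha)
    e0 = np.array([1.0, 0.0])        # ancilla qubit of each U_j in state |0>
    left = np.kron(np.kron(c, e0)[None, :], np.eye(m))
    right = np.kron(np.kron(d, e0)[:, None], np.eye(m))
    return beta, W, beta * (left @ W @ right)

rng = np.random.default_rng(0)

def rand_sym(m):
    X = rng.normal(size=(m, m))
    return (X + X.T) / 2

ops = [rand_sym(2) for _ in range(3)]
alphas = [np.linalg.norm(A, 2) for A in ops]   # spectral norms as normalizations
signs = [1, -1, 1]
beta, W, combined = lcu(ops, alphas, signs)
target = ops[0] - ops[1] + ops[2]
print(np.allclose(combined, target))
```

The rescaled top-left block reproduces the signed sum, and the normalization is $\beta = \alpha_0 + \alpha_1 + \alpha_2$, which is why the largest $\alpha_j$ dominates the cost of the combined block-encoding.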

Implementing the block-encoding of the projector
Using the block-encoding of $\Delta^{K,L}_q$, we implement the block-encoding of the projector onto the zero-energy space of $\Delta^{K,L}_q$: $\Pi = \sum_{j : \lambda_j = 0} |\psi_j\rangle\langle\psi_j|$. Let $\lambda^q_{\min}$ be the smallest nonzero eigenvalue of $\Delta^{K,L}_q$. In [18], a polynomial approximation of the rectangle function with parameters $t$ and $\delta$ is shown. We take $t = \frac{\lambda^q_{\min}}{2\beta}$ and $\delta = \frac{\lambda_q}{2\beta}$ for $\lambda_q$ that satisfies $0 < \lambda_q < \lambda^q_{\min}$. Because the block-encoding of $\Delta^{K,L}_q$ is not exact, we need to introduce the following lemma on the robustness of singular value transformation:

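Lemma 3 (Lemma 22 of [18]) If $P \in \mathbb{C}[x]$ is a degree-$d$ polynomial satisfying the following conditions:
• $P$ has parity-$(d \bmod 2)$,
• $\forall x \in [-1, 1] : |P(x)| \le 1$,
• $\forall x \in (-\infty, -1] \cup [1, \infty) : |P(x)| \ge 1$,
• if $d$ is even, then $\forall x \in \mathbb{R} : P(ix)P^*(ix) \ge 1$,
and moreover, $A, \tilde{A} \in \mathbb{C}^{\tilde{n} \times n}$ are matrices of operator norm at most 1, then we have that
$$\left\| P^{(SV)}(A) - P^{(SV)}(\tilde{A}) \right\| \le 4d \sqrt{\| A - \tilde{A} \|}.$$

Using Lemma 2 and Corollary 1, we can implement a unitary $\tilde{U}_\Pi$ which is a QSVT of $\Delta^{K,L}_q/\beta$ with respect to the polynomial approximation of the rectangle function. Using Lemma 3, it can be seen that $\tilde{U}_\Pi$ is a $(1, m := 6b+10, \epsilon_\Pi := \epsilon_{\mathrm{rect}} + 4d\sqrt{\tilde{n}^4\tilde{q}^4\epsilon_{\mathrm{inv}}/\beta})$-block-encoding of $\Pi$. By the construction in the proof of Theorem 3, $\tilde{U}_\Pi$ uses $O(d)$ calls to $\tilde{W}$ and $\tilde{W}^\dagger$ together with $O(d)$ controlled-projector NOT gates and single-qubit gates.

A Proofs of Section 2 and 3

Proof of Theorem 1. Choose any orthonormal basis of $C^{L,K}_{q+1}$ and let $B^{L,K}_{q+1}$ be the corresponding matrix representation of $\partial^{L,K}_{q+1}$. The statement then follows using Claim 1.

Proof of Theorem 3. This is a slight modification of Corollary 18 of [18]. Using Corollary 10 of [18], we can find, with an $O(\mathrm{poly}(d, \log(1/\delta)))$-time classical algorithm, a degree-$d$ polynomial $P$ on $[-1, 1]$ that satisfies the conditions of Lemma 3 (in particular, $P$ has parity-$(d \bmod 2)$). By Theorem 17 of [18], we can also find a parameter $\Phi = \{\phi_0, \phi_1, \dots, \phi_d\} \in \mathbb{R}^{d+1}$ in classical $O(\mathrm{poly}(d, \log(1/\delta)))$ time such that the parametrized circuit $U_\Phi$ implements the singular value transformation by $P$. Here, $U_\Phi$ is a unitary built by alternating $U$ and $U^\dagger$ with rotations of the form $e^{i\phi(2\Pi - I)}$. Each rotation $e^{i\phi(2\Pi - I)}$ can be implemented using a single ancilla qubit and $C_\Pi\mathrm{NOT}$. We can linearly combine $U_\Phi$ and $U_{-\Phi}$ by applying Hadamard gates to the ancilla qubit and postselecting it to $|0\rangle$. We define $U_{\Phi+}$ as the resulting circuit, for odd $d$ and even $d$ respectively.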
Let $\Pi' = \tilde{\Pi}$ for odd $d$ and let $\Pi' = \Pi$ for even $d$. Then $P^{(SV)}(\tilde{\Pi} U \Pi) = \Pi' U_\Phi \Pi$ and $P^{*(SV)}(\tilde{\Pi} U \Pi) = \Pi' U_{-\Phi} \Pi$. It is clear from the construction that $U_{\Phi+}$ can be implemented with $O(d)$ uses of $U$, $U^\dagger$, $C_\Pi\mathrm{NOT}$, $C_{\tilde{\Pi}}\mathrm{NOT}$ and single-qubit gates.
Proof of Corollary 1. This follows from the construction of $U_{\Phi+}$ in the proof of Theorem 3.
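Proof of Theorem 5. This is a slight modification of Theorem 41 of [18] for the block-encoding of a positive semidefinite operator. Let $\epsilon' = \frac{\alpha\epsilon}{2}$. It is shown in [18,28] that there exists an odd real polynomial $f(x)$ that $\frac{\epsilon'}{2\kappa}$-approximates the function $\frac{1}{2\kappa x}$ for all $x \in [-1, 1] \setminus (-\frac{1}{\kappa}, \frac{1}{\kappa})$. By Corollary 1, we can construct $U_{\Phi+}$ that is a QSVT with a polynomial $P(x)$ that $\frac{\epsilon'}{2\kappa}$-approximates $f(x)$, using an $O(\mathrm{poly}(\kappa \log(\kappa/\epsilon')))$-time classical algorithm. Then, because $\lambda_{\min}/\alpha > 1/\kappa$, all nonzero eigenvalues of the block-encoded operator lie in the region where the approximation holds, and $U_{\Phi+}$ gives the claimed block-encoding.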

B Explicit construction of the membership oracles
To implement our quantum algorithm efficiently, we must be able to construct the membership functions. Although the number of possible $q$-simplices can be exponential in the size of the data, we can implement the membership function if we can efficiently verify whether a simplex is contained in the complex at a given point of the filtration. There are many ways of constructing simplicial complexes from given data, such as point clouds, digital images, or networks [32]. The $q$-skeleton of a simplicial complex $K$ is the union of $K_p$ for $p \in \{0, 1, \dots, q\}$, where $K_p$ is the collection of the $p$-simplices of $K$. We can construct the membership functions efficiently for any class of simplicial complexes that is entirely determined by its 1-skeleton, such as the flag complex, clique complex, Vietoris-Rips complex, and lazy witness complex. For other simplicial complexes, such as the Čech complex and the alpha complex, we are not sure whether the membership functions can be implemented efficiently. In the following, we introduce some constructions of the membership functions.
Vietoris-Rips complex. The Vietoris-Rips (VR) complex is very commonly used in TDA. It is defined as follows. Let $S$ be a set of $n$ points in $\mathbb{R}^d$. Then, the VR complex with parameter $\varepsilon$, denoted by $R_\varepsilon(S)$, is the set of all $\sigma \subseteq S$ such that the largest distance between any two of its points is at most $2\varepsilon$. We can construct a VR complex by first computing the graph that connects every pair of vertices whose distance is at most $2\varepsilon$. The VR complex is then the clique complex of that graph.
In order to check whether a set of vertices is contained in the VR complex, we just need to check that all of the pairwise distances of the vertices are at most $2\varepsilon$, which can be done in $O(n^2)$ time in the worst case.
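A classical version of this membership check can be sketched as follows. The function names `in_vietoris_rips` and `membership`, the bitstring encoding of a simplex as an $n$-bit tuple, and the example point cloud are assumptions made for illustration; the quantum oracle would evaluate the same predicate reversibly.

```python
import itertools
import math

def in_vietoris_rips(points, simplex, eps):
    """True iff every pair of vertices in `simplex` is within distance 2 * eps."""
    return all(math.dist(points[i], points[j]) <= 2 * eps
               for i, j in itertools.combinations(simplex, 2))

def membership(points, x, q, eps):
    """Classical analogue of f^K_q(x) for K = R_eps(S).

    `x` is an n-bit tuple marking the chosen vertices; the result is 1 only if
    x selects exactly q + 1 vertices that form a simplex of the VR complex.
    """
    simplex = [i for i, bit in enumerate(x) if bit]
    return int(len(simplex) == q + 1 and in_vietoris_rips(points, simplex, eps))

# Hypothetical point cloud: three nearby points and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
eps = 0.75  # pairs at distance <= 1.5 are connected

print(membership(points, (1, 1, 1, 0), q=2, eps=eps))  # triangle on points 0, 1, 2
print(membership(points, (1, 0, 0, 1), q=1, eps=eps))  # edge between points 0 and 3
```

The check inspects at most $\binom{q+1}{2} \le \binom{n}{2}$ pairs, which is the $O(n^2)$ worst case mentioned above.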
It is also possible to implement the membership function for the Vietoris-Rips filtration of a network input [1]. There, an undirected weighted graph is given as input, and the filtration is built according to a threshold parameter on the edge weights. The membership verification can be done similarly, using the weighted adjacency matrix of the graph and the threshold parameter.
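For a weighted graph, the same check can be sketched directly on the weight matrix. This is an illustration only: the function name `in_graph_rips`, the convention that a missing edge has weight infinity, and the example weights are assumptions, not taken from [1].

```python
import itertools
import math

def in_graph_rips(weights, simplex, threshold):
    """True iff every pair of vertices in `simplex` is joined by an edge
    of weight at most `threshold` (missing edges have weight math.inf)."""
    return all(weights[i][j] <= threshold
               for i, j in itertools.combinations(simplex, 2))

INF = math.inf
# Hypothetical symmetric weight matrix of an undirected graph on 4 vertices.
weights = [
    [0.0, 1.0, 2.0, INF],
    [1.0, 0.0, 1.5, 3.0],
    [2.0, 1.5, 0.0, INF],
    [INF, 3.0, INF, 0.0],
]

# Two filtration levels K (threshold 1.5) and L (threshold 2.0), with K inside L.
in_K = in_graph_rips(weights, (0, 1, 2), 1.5)
in_L = in_graph_rips(weights, (0, 1, 2), 2.0)
print(in_K, in_L)
```

Because the predicate is monotone in the threshold, every simplex accepted at the smaller threshold is also accepted at the larger one, which gives the nested pair of complexes needed for persistence.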

Witness complex.
A small subset of the point cloud is used as a set of 'landmarks' to reduce the number of simplices. Let $S$ be the point set and let $L \subseteq S$ be the landmark set. Then, the number of possible simplices of the witness complex is $2^{O(|L|)}$. A point $s \in S$ is called a weak $\varepsilon$-witness for $\sigma$ iff $d(s, a) \le d(s, b) + \varepsilon$ for all $a \in \sigma$ and $b \in L \setminus \sigma$, where $d(x, y)$ is the distance between points $x$ and $y$. The weak witness complex at scale $\varepsilon$ is the simplicial complex with vertex set $L$ such that every face $\sigma \subset L$ of the complex has a weak $\varepsilon$-witness in $S$. The lazy weak witness complex at scale $\varepsilon$ has the same 1-skeleton as the weak witness complex and is determined by it. We can efficiently verify membership for the lazy weak witness complex by checking whether every pair of points of the candidate simplex has a weak $\varepsilon$-witness.
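The pairwise check can be sketched classically as below. This is a minimal illustration under stated assumptions: the witness condition is taken to be $d(s, a) \le d(s, b) + \varepsilon$ (a common convention for weak $\varepsilon$-witnesses), a simplex is accepted when each of its vertex pairs has such a witness, and the names `has_weak_witness` and `in_lazy_witness` as well as the example points are hypothetical.

```python
import itertools
import math

def has_weak_witness(points, landmarks, edge, eps):
    """True iff some point s satisfies d(s, a) <= d(s, c) + eps
    for both endpoints a of `edge` and every other landmark c."""
    others = [c for c in landmarks if c not in edge]
    for s in points:
        if all(math.dist(s, points[a]) <= math.dist(s, points[c]) + eps
               for a in edge for c in others):
            return True
    return False

def in_lazy_witness(points, landmarks, simplex, eps):
    """Accept `simplex` (a tuple of landmark indices) iff every pair has a witness."""
    return all(has_weak_witness(points, landmarks, pair, eps)
               for pair in itertools.combinations(simplex, 2))

# Hypothetical data: five points on a line, three of them used as landmarks.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
landmarks = [0, 2, 4]

print(in_lazy_witness(points, landmarks, (0, 2), eps=0.0))
print(in_lazy_witness(points, landmarks, (0, 2, 4), eps=0.0))
print(in_lazy_witness(points, landmarks, (0, 2, 4), eps=2.0))
```

Each pair is tested against all points of $S$ and all landmarks, so the cost per simplex is polynomial in $|S|$ and $|L|$ even though the number of candidate simplices is exponential in $|L|$.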
C Construction of the block-encoding of the boundary operator

C.1 Introduction of $U^K_q$
The boundary matrix $\partial^K_q$ is defined as in eq. (1). In this section, we construct a unitary $U^K_q$ that block-encodes this boundary matrix.

C.2 Verification that U K q is the block-encoding of the boundary operator
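We show that the unitary $U^K_q$ satisfies the following two properties. First, we show that for all $x \in S^K_q \subseteq \{0, 1\}^n$, the block of $U^K_q$ selected by the ancilla state $|0\rangle^{\otimes a}$ acts on $|x\rangle$ as $\partial^K_q / (\tilde{n}\tilde{q})$, with some $a = O(\log n)$. Second, we show that for all $x \notin S^K_q$, the unitary $U^K_q$ satisfies
$$\left( \langle 0|^{\otimes a} \otimes I_n \right) U^K_q \left( |0\rangle^{\otimes a} \otimes I_n \right) |x\rangle = 0.$$
Here, $t_i(x)$ is the index of the $i$-th nonzero bit of $x$, $\tilde{n} := 2^{\lceil \log n \rceil}$, and $\tilde{q} := 2^{\lceil \log(q+1) \rceil}$. Such a $U^K_q$ is an $(\tilde{n}\tilde{q}, a, 0)$-block-encoding of $\partial^K_q$. By projecting the fourth register of eq. (9) to $|0^5\rangle$, we obtain the projected state of eq. (12). Therefore, the unitary $U^K_q$ satisfies eq. (10) and eq. (11).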

C.3 Construction of U K q
We construct the unitary $U^K_q$ in the following 5 steps.
1. Generate the following state:
$$\frac{1}{\sqrt{\tilde{n}\tilde{q}}} \sum_{s=0}^{\tilde{q}-1} \sum_{t=0}^{\tilde{n}-1} |x\rangle |s\rangle |t\rangle |0^5\rangle. \quad (13)$$
This state can be generated by applying $H^{\otimes \lceil \log(q+1) \rceil}$ and $H^{\otimes \lceil \log n \rceil}$ to the second and the third registers of $|x\rangle \otimes |0\rangle^{\otimes \lceil \log(q+1) \rceil} \otimes |0\rangle^{\otimes \lceil \log n \rceil} \otimes |0^5\rangle$, respectively.
2. Generate the following state:
$$\frac{1}{\sqrt{\tilde{n}\tilde{q}}} \sum_{s=0}^{\tilde{q}-1} \sum_{t=0}^{\tilde{n}-1} |x\rangle |s\rangle |t\rangle |h_{q-1}(s), h_n(t), \delta_{s, g_t(x)} \oplus 1, x_t \oplus 1, f^K_q(x) \oplus 1\rangle, \quad (14)$$
where $h_a(x) = 0$ if $x < a$ and $h_a(x) = 1$ if $x \ge a$, $g_t(x) = \sum_{i=0}^{t-1} x_i$ ($x_i$ is the $i$-th bit of $x$), and $\delta_{s, g_t(x)} = 1$ if $s = g_t(x)$ and $\delta_{s, g_t(x)} = 0$ otherwise. The computed functions $|h_{q-1}(s), h_n(t), \delta_{s, g_t(x)} \oplus 1, x_t \oplus 1, f^K_q(x) \oplus 1\rangle$ in the fourth register are projected to $|0^5\rangle$ as in eq. (12). By this projection, only the terms of the sums over $s$ and $t$ that are needed for the boundary operator remain, which is the motivation for computing these functions.
The generation of the quantum state of eq. (14) can be done as follows.
• We first compute $h_{q-1}(s)$ in the first qubit of the fourth register of eq. (13). For computing $h_{q-1}(s)$, we first prepare $|q\rangle$ in an ancilla register and compare $s$ with the threshold using a quantum circuit for subtraction with $O(\log n)$ gates [22]. Finally, we initialize the ancilla register.
• Similarly, compute $h_n(t)$ in the second qubit of the fourth register.
• Compute $g_t(x)$ in ancilla qubits. This can be done using the quantum circuit for addition [10], which uses another single ancilla qubit and $\tilde{O}(n^2)$ gates.
• Write $\delta_{s, g_t(x)} \oplus 1$ to the third qubit of the fourth register, then initialize the ancilla register.
• Copy the $t$-th bit of $x$ to the fourth qubit of the fourth register and also apply an $X$ gate to the same qubit: $|x\rangle |s\rangle |t\rangle |h_{q-1}(s), h_n(t), \delta_{s, g_t(x)} \oplus 1, x_t \oplus 1, 0\rangle$.
• Apply the membership oracle $O^K_q$ to the first register and the fifth qubit of the fourth register, and also apply an $X$ gate to the fifth qubit of the fourth register: $|x\rangle |s\rangle |t\rangle |h_{q-1}(s), h_n(t), \delta_{s, g_t(x)} \oplus 1, x_t \oplus 1, f^K_q(x) \oplus 1\rangle$.
We have implemented the unitary of eq. (9) in the above 5 steps. We have used $\tilde{O}(n^2)$ gates and $O(1)$ uses of $O^K_q$ in this construction.
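The filtering role of the flag bits can be mimicked classically. The sketch below is an illustration under assumptions: it uses the standard simplicial boundary map in the bitstring encoding, keeps a pair $(s, t)$ only when $x_t = 1$, $s = g_t(x)$ and $x$ passes the membership test, and attaches the sign $(-1)^s$. The names `boundary_terms`, `basis` and `boundary_matrix` are hypothetical, and the membership predicate is assumed to describe a complex that contains all faces of its simplices.

```python
import itertools
import numpy as np

def boundary_terms(x, member):
    """Terms (sign, y) of the boundary of the simplex encoded by the bit tuple x."""
    if not member(x):
        return []                      # mirrors the membership flag f(x) XOR 1
    terms = []
    for t, bit in enumerate(x):
        if bit == 1:                   # mirrors the flag x_t XOR 1
            s = sum(x[:t])             # g_t(x): number of ones before position t
            y = x[:t] + (0,) + x[t + 1:]
            terms.append(((-1) ** s, y))
    return terms

def basis(n, k):
    """All n-bit tuples of Hamming weight k, in a fixed order."""
    return [tuple(1 if i in c else 0 for i in range(n))
            for c in itertools.combinations(range(n), k)]

def boundary_matrix(n, q, member):
    """Matrix of the boundary map from q-simplices (q + 1 ones) to (q-1)-simplices."""
    rows, cols = basis(n, q), basis(n, q + 1)
    index = {y: i for i, y in enumerate(rows)}
    B = np.zeros((len(rows), len(cols)), dtype=int)
    for j, x in enumerate(cols):
        for sign, y in boundary_terms(x, member):
            B[index[y], j] = sign
    return B

def everything(x):
    """Membership predicate of the full simplex on all vertices (toy choice)."""
    return True

B1 = boundary_matrix(4, 1, everything)
B2 = boundary_matrix(4, 2, everything)
print(np.count_nonzero(B1 @ B2))
```

The product of consecutive boundary matrices vanishes, which is the defining property that the block-encoded operator must inherit, up to the normalization $\tilde{n}\tilde{q}$.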