Quantum Motif Clustering

We present three quantum algorithms for clustering graphs based on higher-order patterns, known as motif clustering. One uses a straightforward application of Grover search, the other two make use of quantum approximate counting, and all of them obtain square-root-like speedups over the fastest classical algorithms in various settings. In order to use approximate counting in the context of clustering, we show that for general weighted graphs the performance of spectral clustering is mostly left unchanged by the presence of constant (relative) errors on the edge weights. Finally, we extend the original analysis of motif clustering in order to better understand the role of multiple 'anchor nodes' in motifs and the types of relationships that this method of clustering can and cannot capture.


Introduction
The study of complex networks has impacted many fields of science [Str01], including biology [Alb05,SOMMA02], sociology [WF94], neuroscience [BS17], and finance [AOTS15,GK10]. In particular, it is commonplace to study the connectivity patterns of networks at the edge and vertex level in order to uncover important structures in the underlying data. One method that provides insight into the connectivity structure of a network is graph clustering, which entails finding groups of highly connected vertices in order to uncover underlying community structures. There are many efficient (heuristic) algorithms for graph clustering, including the theoretically well-motivated k-means spectral clustering 1 . Here, given an integer k, the eigenvectors corresponding to the smallest k eigenvalues of the graph Laplacian are used as a feature set for a k-means clustering algorithm. It has been shown that in certain circumstances spectral clustering leads to the discovery of optimal graph partitions [PSZ15].
Recently, it has become popular to study more sophisticated connectivity patterns. This can be done in the context of, for example, hypergraphs that can express multiple-vertex relationships, or via small subgraphs, also known as motifs, which can be used to study higher-order connectivity patterns between vertices. The latter has become a useful tool for providing deeper insight into a network's function and structure, although often the detection of these motifs remains computationally challenging [MNSK12].
In [BGL16], Benson et al. propose an algorithm for clustering a graph based on its motif connectivity. Their algorithm, which makes use of spectral clustering and therefore comes with theoretical guarantees [PSZ15], can be used to uncover collections of vertices that are highly connected via particular motifs, rather than just by edges as in the ordinary case. The authors apply their technique to the well known C. elegans neuronal network and to a transportation reachability network, with a particular motif used for each case, and find that the motif clustering reveals network organisation not made apparent by clustering through edge-based connectivity alone.
As a motivating (toy) example, consider the graph shown in the middle of Figure 1, which could for example represent a financial transaction network: vertices correspond to financial entities, and unweighted edges between them denote transactions (for example, of a value beyond a particular threshold). Consider also the motif shown at the top of Figure 1, which represents the situation wherein two entities, given by the anchor nodes shown in dark red, trade indirectly through three intermediate entities according to the edge pattern of the motif. Suppose that we are interested in clustering the nodes of the graph into groups that don't trade with each other directly, but instead do so only by means of intermediate nodes in accordance with the structure given by the motif. The method of motif clustering achieves precisely such a clustering.
As shown by [BGL16], obtaining a motif clustering of the original graph can be done in two steps. First, we construct the motif graph, displayed in the bottom of Figure 1. This graph has the same vertex set as the original graph, but a different edge set: any instance of the motif in the original graph corresponds to an edge in the motif graph connecting the two anchor nodes of the motif instance in question. All edges of the motif graph have integer weights that correspond to the number of motif instances in the original graph that have the two edge endpoints as anchor nodes. In the second step we use k-means spectral clustering on the motif graph to obtain the required motif clustering of the original graph.
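The first step above can be sketched in a few lines for a deliberately small motif. The sketch below is illustrative only: it assumes a hypothetical three-vertex path motif a - c - b whose anchor nodes are the two endpoints (not the five-vertex motif of Figure 1), an undirected input graph stored as a dictionary of neighbour sets, and structural matching. The function name `two_hop_motif_graph` and the toy graph are assumptions made for the example.

```python
from itertools import combinations

def two_hop_motif_graph(adj):
    """Return {(a, b): weight} for the path motif a - c - b with anchors {a, b}.

    adj maps each vertex to the set of its neighbours (undirected, unweighted).
    The match is structural, so a and b must NOT be adjacent themselves.
    The weight of (a, b) counts the motif instances having a and b as anchors.
    """
    weights = {}
    for c, nbrs in adj.items():
        # Every non-adjacent pair of neighbours of c forms one motif instance.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights

# Hypothetical toy graph: vertices 0 and 1 are linked only via 2 and 3.
toy_adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1, 4}, 4: {3}}
toy_weights = two_hop_motif_graph(toy_adj)
```

The resulting dictionary plays the role of the integer-weighted edge set of the motif graph; the second step would then run spectral clustering on it.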
In this paper we present three quantum algorithms that can perform motif clustering faster than the classical algorithm presented in [BGL16]. The majority of the quantum speedup comes from faster finding and/or counting of motifs in the graph, a task that is often computationally demanding. Our speedups are of the Grover variety: at most quadratic, and sometimes less, depending on the choice of motif and the sparsity of the graph. Our reason for presenting several quantum algorithms is that we have a choice between using Grover search or quantum approximate counting, as well as the option of constructing the motif graph in its entirety, or only giving query access to it. Depending on the input graph, one will be favorable over the other. We should add that, as argued in [BMN+21], for quantum algorithms based on Grover-type speedups to become practically interesting, substantial improvements in qubit counts, physical gate errors and/or error correction schemes are required.
We also prove some technical lemmas related to performing spectral clustering in the presence of errors on the weights of the edges of a graph (this is the case in our application when the number of motifs connecting two vertices is estimated using quantum approximate counting), which might be of independent interest. Along the way, we give a simple 'no-go' argument to show that spectral clustering with errors on normalized Laplacians does not come with the same guarantees as in the unnormalized case, but nevertheless, numerical experiments suggest that it still performs well in practice. An interesting open question is whether we can turn this empirical observation into a theoretical one.
Finally, in the Appendix we discuss the role of anchor nodes in motifs, extending the analysis of Benson et al. More specifically, we argue that motif clustering should be used only for two-anchor node motifs, which express pairwise relationships between vertices. If, instead, we want to cluster using relationships between more than two vertices, which is what motifs with more than two anchor nodes attempt to capture, we should do so within the context of hypergraphs rather than that of motif clustering.
Organisation After introducing some notation in Section 2, we begin by explaining the concept of motif clustering and discuss previous work in the (classical) literature in Section 3. Following this, we summarise our main results in Section 4. Section 5 describes a classical algorithm for motif clustering, and introduces the notation that we use throughout the rest of the paper. In Section 6 we introduce our quantum tools and use those to construct our three quantum algorithms: one based on Grover search, the other two on quantum approximate counting. Finally, in Section 7 we consider the effect of quantum approximate counting, both analytically and numerically, on the guarantees that come with spectral clustering. In the Appendix we discuss in detail the role of anchor nodes in motifs.

Preliminaries
Before we discuss the concept of motif clustering, we first introduce some notation. In addition, when comparing how well our quantum algorithms perform relative to their classical counterparts, we want to be able to talk about their run-times. In the section below, we make precise what we mean by run-time.

Notation
For an integer k ≥ 1, we write [k] := {1, . . . , k}. For a set W , we denote its size by |W |. We write G = (V, E) for a directed graph with vertex set V and edge set E, where n = |V | denotes the number of vertices, and m = |E| the number of edges, and assume a fixed ordering of the vertices in V that allows for a natural identification of V with [n].
For $v \in V$, we define $d_v$ to be the degree of $v$ and $d := \max_{v \in V} d_v$ the maximum degree of any vertex in the graph. We use $A$ to denote the adjacency matrix of the graph, and for each vertex $v$ assume that we know $d_v$ and that we have query access to the weighted adjacency list $J_v : [d_v] \to [n] \times \mathbb{R}_{\geq 0}$, which is a function that assigns labels and weights to the neighbours of $v$. We will call such access 'adjacency list access' to $G$. For a subset $W \subset V$, we write $\overline{W}$ for the complement, i.e. $\overline{W} = V \setminus W$. A k-partition of $V$ is a collection of pairwise disjoint subsets $W_1, \dots, W_k \subset V$ such that $V = \bigcup_{i \in [k]} W_i$.
We will often consider the Laplacian $L$ of an n-vertex graph $G$, and its normalized equivalent, the normalized Laplacian $\tilde{L}$. Let $D$ be the diagonal $n \times n$ matrix of (weighted) vertex degrees in $G$, and $A$ the adjacency matrix. Then the Laplacian is defined as $L := D - A$, and the normalized Laplacian as $\tilde{L} := D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}$ (which is only defined for graphs for which every vertex has a positive degree).
Finally, given any n × n real symmetric matrix, such as the graph Laplacian, we assume that the eigenvalues λ 1 ≤ λ 2 ≤ · · · ≤ λ n are ordered by increasing value and denote by v 1 , v 2 , . . . , v n the corresponding eigenvectors, which we assume to be normalized. In particular, when we mention the 'first k eigenvectors' of a matrix, we are referring to the eigenvectors corresponding to the smallest k eigenvalues. For graph Laplacians, which are positive semidefinite, the smallest eigenvalues will be those closest to zero.
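To make the two Laplacian definitions concrete, the following is a minimal sketch that builds both matrices from a symmetric weighted adjacency matrix given as nested lists. The function name `laplacians` and the example matrix are assumptions made for illustration; the sketch uses only the standard library and does not compute eigenvectors.

```python
import math

def laplacians(A):
    """Return (L, L_norm) for a symmetric weighted adjacency matrix A.

    L = D - A, and L_norm = D^{-1/2} L D^{-1/2}, where D is the diagonal
    matrix of weighted degrees. L_norm is None if some vertex has degree 0,
    since the normalized Laplacian is then undefined.
    """
    n = len(A)
    deg = [sum(row) for row in A]  # weighted degrees (diagonal of D)
    L = [[(deg[i] if i == j else 0.0) - A[i][j] for j in range(n)]
         for i in range(n)]
    if any(d <= 0 for d in deg):
        return L, None
    L_norm = [[L[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
              for i in range(n)]
    return L, L_norm

# Hypothetical weighted path on three vertices: 0 -(1)- 1 -(2)- 2.
A_example = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
L_example, L_norm_example = laplacians(A_example)
```

Each row of the unnormalized Laplacian sums to zero, which is why the all-ones vector is always an eigenvector with eigenvalue zero.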

Query and time complexity
All our quantum algorithms assume coherent access to the input graph in the form of quantum queries to the adjacency lists. More explicitly, given the maps $\{J_v : v \in V\}$ introduced above, coherent access to the adjacency lists means that we have access to the following unitary and its inverse,
$|v\rangle |i\rangle |0\rangle \mapsto |v\rangle |i\rangle |J_v(i)\rangle$, (1)
where $v \in V$, $i \in [d]$ and $|J_v(i)\rangle$ contains two registers, one for the label of the i-th neighbor of $v$, and one for the weight of the edge. When we talk about the query complexity of an algorithm, we mean the number of times the algorithm applies the unitary in Eq. (1).
If the adjacency lists are sorted (according to some ordering of the vertices in V), then the only type of access to the input graph our algorithms require is this. If we are not provided with coherent access to the adjacency lists, or they are not sorted, then we must provide our own (perhaps sorted) coherent access, which will require classically writing to a QRAM, using at most O(nd) operations. We note that since the run-times of our quantum algorithms are larger than O(nd), our speedups persist even if we pay this cost up front.
Finally, by the run-time of a quantum algorithm we mean the total number of elementary gates, QRAM writes, and queries made by the algorithm. Our definition of run-time is the same as that of Apers and de Wolf [AdW20].

Motif clustering
The idea behind motif clustering is to partition the vertices of a graph into several clusters based on a higher-order structural pattern, called a motif. The partitions obtained through motif clustering should be such that any two vertices within a particular cluster are part of relatively many connected occurrences of the motif in the graph, whereas two vertices in different clusters should participate in relatively few connected motif occurrences. This statement will be made precise below. An example of motif clustering for a particular motif is shown in Figure 1.
In this section, which is based on the content of Benson et al. [BGL16], we set the stage by introducing the reader to the necessary concepts and definitions that we use throughout the paper.

Graph motifs
A motif $M$ of size $s$ is a connected, unweighted graph with s-sized vertex set $V_M$ and edge set $E_M$. Throughout this paper we assume that $s$ is a constant. The motif comes with a set of anchor nodes $V_A \subseteq V_M$, which will become relevant when we discuss motif cuts.
Given a particular motif $M$ and an unweighted graph $G = (V, E)$, we will be interested in occurrences of the motif $M$ in $G$, which can be functional or structural [SK04]. Formally, a motif assignment is an injective map $\iota : V_M \to V$. For functional motifs, we require that $(v, w) \in E_M \Rightarrow (\iota(v), \iota(w)) \in E$ for all $v, w \in V_M$. That is, any two vertices in $\iota(V_M)$ should have an edge in $G$ whenever the corresponding vertices in the motif have one, but there can be additional edges in $G$ not present in the motif itself. For structural motifs, we have that $(\iota(v), \iota(w)) \in E$ if and only if $(v, w) \in E_M$ for all $v, w \in V_M$, and therefore the motif is graph-isomorphic to the subgraph induced by $\iota(V_M)$ (i.e. both the edges and non-edges coincide).
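The difference between functional and structural occurrences can be checked mechanically. The sketch below assumes directed edges stored as sets of ordered pairs and an assignment stored as a dictionary from motif vertices to graph vertices; all names (`is_functional`, `is_structural`, the example motif and graphs) are hypothetical and chosen only for illustration.

```python
def is_injective(iota):
    """A motif assignment must map distinct motif vertices to distinct vertices."""
    return len(set(iota.values())) == len(iota)

def is_functional(iota, motif_edges, graph_edges):
    """Every motif edge must be present in the graph; extra graph edges are allowed."""
    return is_injective(iota) and all(
        (iota[v], iota[w]) in graph_edges for (v, w) in motif_edges)

def is_structural(iota, motif_edges, graph_edges):
    """Edges AND non-edges must coincide on the image of the assignment."""
    verts = list(iota)  # assumes iota is defined on every motif vertex
    return is_injective(iota) and all(
        ((iota[v], iota[w]) in graph_edges) == ((v, w) in motif_edges)
        for v in verts for w in verts if v != w)

# Hypothetical directed path motif a -> b -> c and two small graphs.
motif_edges = {("a", "b"), ("b", "c")}
iota = {"a": 1, "b": 2, "c": 3}
with_shortcut = {(1, 2), (2, 3), (1, 3)}   # has the extra edge 1 -> 3
without_shortcut = {(1, 2), (2, 3)}
```

In the graph with the extra edge, the assignment is a functional occurrence but not a structural one; without the extra edge it is both.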
Because we are interested in the motif occurring in G as a pattern irrespective of the actual vertex assignment given by the mapping ι, we next define an equivalence relation on the set of motif assignments. Two motif assignments ι and ι ′ are considered equivalent if ι(V M ) = ι ′ (V M ) as sets and also ι(V A ) = ι ′ (V A ) as sets. A single equivalence class is called a motif instance. We write I = I(G, M ) for the set of all motif instances of M in G. Moreover, we say that a motif instance has a vertex u ∈ V as an anchor node if, for any assignment ι in the equivalence class of the instance, u ∈ ι(V A ). Note that this definition is well-defined, as it does not depend on the choice of assignment ι. In Appendix A, we further elaborate on when two motif assignments are equivalent, and how this equivalence is related to symmetries of the motif itself.
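The equivalence relation on assignments can be illustrated by brute force: enumerate all injective maps, keep the structural ones, and identify two assignments when both their image and their anchor image agree as sets. This is a sketch for tiny undirected examples only, with exponential cost in general; the function name `motif_instances` and the example graph are assumptions.

```python
from itertools import permutations

def motif_instances(graph_vertices, graph_edges, motif_vertices, motif_edges, anchors):
    """Structural motif instances of an undirected motif in an undirected graph.

    Edges are sets of frozenset({u, v}). An instance is represented by the pair
    (image of V_M, image of V_A), both as frozensets, which is exactly the data
    the equivalence relation on assignments preserves.
    """
    instances = set()
    for image in permutations(graph_vertices, len(motif_vertices)):
        iota = dict(zip(motif_vertices, image))  # injective by construction
        matches = all(
            (frozenset((iota[v], iota[w])) in graph_edges)
            == (frozenset((v, w)) in motif_edges)
            for i, v in enumerate(motif_vertices)
            for w in motif_vertices[i + 1:])
        if matches:
            instances.add((frozenset(image), frozenset(iota[a] for a in anchors)))
    return instances

# Hypothetical example: a triangle {1, 2, 3} with a pendant vertex 4.
graph_vertices = [1, 2, 3, 4]
graph_edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
triangle_vertices = ["a", "b", "c"]
triangle_edges = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("a", "c")]}
triangle_instances = motif_instances(
    graph_vertices, graph_edges, triangle_vertices, triangle_edges, anchors=("a", "b"))
```

Here the six structural assignments onto the triangle collapse to three instances, one per choice of the two anchor vertices.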
Note that motif clustering can be applied to both directed and undirected (but unweighted) graphs. When the graph is (un)directed, the motif should also be (un)directed for motif instances to exist within the graph.
Following [BGL16], we focus on structural motifs in this work. Note that a functional motif can be thought of as a combination of several structural motifs. Since the framework introduced by Benson et al. [BGL16] can be extended to consider several motifs simultaneously (see Appendix D.1), and the same extensions work for our framework, we also capture the case of functional motifs.

Motif cuts
A common method for clustering graphs is to find minimal normalized cuts. This corresponds to finding a partition of the vertex set that minimizes the total weight of the cuts (edges connecting different partitions) whilst maximizing the volumes or sizes of the partitions. The number of partitions (often denoted k) is fixed in advance. Motif clustering works analogously, in the sense that it minimizes the number of motif cuts whilst maximizing the motif volume or partition size.
More formally, let $W \subset V$ be a subset of vertices in the graph. An ordinary graph cut with respect to $W$ and its complement $\overline{W}$ is given by the number of edges that have one endpoint in $W$ and the other in $\overline{W}$. Similarly, we could define a motif cut with respect to $W$ and $\overline{W}$ to be the number of motif instances that have one or more vertices in $W$ and one or more vertices in $\overline{W}$. However, following [BGL16], we want to incorporate the idea that each individual motif instance signifies a mutual relationship between a specific subset of vertices of the motif, called anchor nodes. As such, given a motif $M$, a motif cut in $G$ with respect to $W$ and its complement $\overline{W}$, denoted by $\mathrm{cut}_{(G,M)}(W)$, is defined to be the number of motif instances that have at least one anchor node in both $W$ and $\overline{W}$:
$\mathrm{cut}_{(G,M)}(W) := |\{\iota \in \mathcal{I} : \iota(V_A) \cap W \neq \emptyset \text{ and } \iota(V_A) \cap \overline{W} \neq \emptyset\}|$. (2)
Moreover, analogous to the ordinary volume given by the sum of all degrees of vertices in a given set, define the motif volume $\mathrm{vol}_{(G,M)}(W)$ of $W$ to be the number of anchor nodes in motif instances that appear in $W$:
$\mathrm{vol}_{(G,M)}(W) := \sum_{\iota \in \mathcal{I}} |\iota(V_A) \cap W|$.
Given the notions of motif cut and motif volume, the motif conductance $\phi_{(G,M)}(W)$ and the motif ratio cut of the set $W$ are defined as in the ordinary case, with the motif cut taking the place of the ordinary cut and, for the conductance, the motif volume taking the place of the ordinary volume. Given some integer $k \in \mathbb{N}$ chosen in advance, we can also define the motif conductance $\phi_{(G,M)}(W_1, \dots, W_k)$ and the motif ratio cut relative to a k-partition $W_1, \dots, W_k$.
The goal of motif spectral clustering, for a fixed $k \in \mathbb{N}$ chosen in advance, is to find a k-partition that minimizes the motif conductance $\phi_{(G,M)}(W_1, \dots, W_k)$ or the motif ratio cut: this will result in a partitioning of the graph into $k$ clusters of vertices that are highly connected via the target motif, whilst very few motifs connect vertices in different clusters. It turns out that, for motifs with two or three anchor nodes, one can translate the two minimization problems above to the problems of ordinary conductance or ratio cut minimization of an auxiliary, weighted graph, called the motif graph [BGL16], which we introduce in the next section.
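The verbal definitions of motif cut and motif volume translate directly into code once the motif instances are known. The sketch below assumes each instance is given as a pair (vertex set, anchor set) of frozensets; the function names and the example instances are hypothetical.

```python
def motif_cut(instances, W):
    """Number of instances with at least one anchor node in W and one outside W."""
    W = set(W)
    return sum(1 for _, anchors in instances if (anchors & W) and (anchors - W))

def motif_volume(instances, W):
    """Number of anchor nodes of motif instances that lie in W, with multiplicity."""
    W = set(W)
    return sum(len(anchors & W) for _, anchors in instances)

# Hypothetical instances of a 3-vertex motif with two anchor nodes each.
example_instances = [
    (frozenset({1, 2, 3}), frozenset({1, 2})),
    (frozenset({1, 2, 3}), frozenset({1, 3})),
    (frozenset({1, 2, 3}), frozenset({2, 3})),
    (frozenset({3, 4, 5}), frozenset({4, 5})),
]
cut_W = motif_cut(example_instances, {1, 2})
vol_W = motif_volume(example_instances, {1, 2})
vol_W_complement = motif_volume(example_instances, {3, 4, 5})
```

Only the anchor sets enter these quantities: an instance whose non-anchor vertices straddle the partition, but whose anchors lie on one side, does not count as cut.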

The motif graph
Benson et al. [BGL16] show that for motifs with two or three anchor nodes, minimizing $\phi_{(G,M)}(W_1, \dots, W_k)$ is equivalent to minimizing the ordinary conductance on a weighted graph $\mathcal{G} = (V, \mathcal{E}, \mathcal{A})$ that can be constructed from the graph $G$, which we term the motif graph of $G$ given motif $M$. The graph $\mathcal{G}$ has the same vertex set as $G$ (practical details regarding entirely disconnected vertices aside, see Section 5.2.3), but in general a different set of edges, which are now integer weighted. Also, whereas both $G$ and $M$ can be directed, $\mathcal{G}$ is always an undirected graph. For notation we will use ordinary characters $G$, $V$, $E$, etc. when referring to the original graph $G$, and calligraphic characters $\mathcal{G}$, $\mathcal{E}$, etc. to refer to the motif graph $\mathcal{G}$.
Given $G$ and $M$, the edge set $\mathcal{E}$ of $\mathcal{G}$ is given by
$\mathcal{E} := \{\{u, v\} : u \neq v \text{ and } u, v \in \iota(V_A) \text{ for some } \iota \in \mathcal{I}\}$,
i.e. two vertices $u, v \in V$ are connected by an edge in $\mathcal{G}$ if they are both anchor nodes of a motif instance $\iota \in \mathcal{I}$ of $G$. The motif weighted adjacency matrix $\mathcal{A}$ of $\mathcal{G}$ has integer coefficients given, for $u \neq v$, by
$\mathcal{A}_{uv} := |\{\iota \in \mathcal{I} : u, v \in \iota(V_A)\}|$,
i.e. $\mathcal{A}_{uv}$ is equal to the number of motif instances in $G$ that contain both $u$ and $v$ as anchor nodes. For $u \in V$, we define the motif degree $d_u$ of $u$ to be the total number of edges in $\mathcal{G}$ connected to $u$, $d_u := |\{v \in V : \mathcal{A}_{uv} > 0\}|$, and the motif strength $s_u$ to be the sum of all weights of edges connected to $u$: $s_u := \sum_{v \in V} \mathcal{A}_{uv}$. For $W \subset V$, we call $\sum_{u \in W, v \in \overline{W}} \mathcal{A}_{uv}$ the cut induced by $W$ in $\mathcal{G}$. The conductance and ratio cut of $W$ in $\mathcal{G}$, and the conductance and ratio cut of a k-partition in $\mathcal{G}$, are defined as for ordinary weighted graphs.
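As an illustration of these definitions, the motif weighted adjacency matrix, motif degree and motif strength can be computed from a list of motif instances as follows. This is a sketch under the assumption that each instance is given as a pair (vertex set, anchor set); the sparse dictionary representation and all names are hypothetical.

```python
from itertools import combinations

def motif_adjacency(instances):
    """Sparse symmetric motif adjacency: A[(u, v)] = number of instances
    that contain both u and v as anchor nodes (u != v)."""
    A = {}
    for _, anchors in instances:
        for u, v in combinations(sorted(anchors), 2):
            A[(u, v)] = A.get((u, v), 0) + 1
            A[(v, u)] = A.get((v, u), 0) + 1
    return A

def motif_degree(A, u):
    """Number of motif-graph edges incident to u."""
    return sum(1 for (x, _), w in A.items() if x == u and w > 0)

def motif_strength(A, u):
    """Total weight of motif-graph edges incident to u."""
    return sum(w for (x, _), w in A.items() if x == u)

# Hypothetical instances of a two-anchor motif.
anchor_instances = [
    (frozenset({1, 2, 3}), frozenset({1, 2})),
    (frozenset({1, 2, 4}), frozenset({1, 2})),
    (frozenset({1, 3, 4}), frozenset({1, 3})),
]
A_motif = motif_adjacency(anchor_instances)
```

In this example vertex 1 has motif degree 2 (it is linked to 2 and 3) but motif strength 3, because two distinct instances share the anchor pair {1, 2}.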
Benson et al. prove the following results (Lemma 1) relating motif conductances and volumes in $G$ to ordinary conductances and volumes in $\mathcal{G}$, where $G$ is a graph, $M$ a motif with $|V_A| \geq 2$ anchor nodes, $\mathcal{G}$ the motif graph constructed from $G$ and $M$, and $W \subset V$.
The proof for $|V_A| = 2$ is not given in [BGL16], but it follows in exactly the same way as the proof for the $|V_A| = 3$ case; see Appendix B for details. Why the second equation of the lemma only holds for motifs with two or three anchor nodes is discussed extensively in Appendix D.
For $k \in \mathbb{N}$, write $P_k(V)$ for the set of all k-partitions of $V$. The following corollary is immediate from Lemma 1.
Corollary 1. Let $G$ be a graph, $M$ a motif with $|V_A| \in \{2, 3\}$ anchor nodes and $\mathcal{G}$ the motif graph constructed from $G$ and $M$. Then, for $k \in \mathbb{N}$, the k-partitions in $P_k(V)$ that minimize the motif conductance (resp. motif ratio cut) in $G$ coincide with those that minimize the ordinary conductance (resp. ratio cut) in $\mathcal{G}$.
In particular, for motifs with two or three anchor nodes, in order to find partitions of $V$ that have small motif conductance or small motif ratio cut in $G$, we can instead solve the equivalent problem of finding partitions of $V$ that have small ordinary conductance or ratio cut in $\mathcal{G}$ respectively. In order to obtain a partition that (approximately) minimizes the conductance or the ratio cut of $\mathcal{G}$, [BGL16] uses k-means spectral clustering on the motif graph $\mathcal{G}$. We will describe how to do this in detail in Section 5. Thereafter, we will discuss how we can improve the classical algorithm using quantum algorithmic methods in Section 6. We begin by first stating our results in Section 4.

Results
Searching for a motif in a graph is essentially an unstructured search problem. As such, we can speed up the parts of the classical algorithm that construct the motif graph by applying either Grover search or quantum counting. Which approach will be faster will depend on the properties of the input graph, and also affect what type of clustering we can employ. Using Grover search we can construct the motif graph exactly, and then apply spectral clustering to its unnormalized or normalized Laplacian while having guarantees on its behaviour. Using quantum approximate counting, on the other hand, can be faster than using Grover search but only approximately constructs the motif graph. As we show in Section 7, in this case we only have guaranteed behaviour when spectral clustering is applied to the unnormalized Laplacian. Moreover, rather than pre-compute the entire motif graph G, for instance by explicitly writing down the motif adjacency matrix or motif adjacency lists, in some cases it can be more efficient to provide query access to it via some subroutine. Taking the above considerations into account, we present below three quantum algorithms for motif spectral clustering that give speedups over the best classical algorithm in various situations. The complexities of the (quantum and classical) algorithms considered in this work are dominated by the time required to compute the edges and weights of the motif graph. The first two of our quantum algorithms focus on constructing the entire motif graph before applying some classical or quantum algorithm for spectral clustering; the third provides query access to the motif graph's adjacency lists, and then uses a fast quantum spectral clustering algorithm based on quantum graph sparsification by Apers and de Wolf [AdW20].
Input graph Let G = (V, E) be the input graph that we want to cluster according to some s-vertex motif M . Since s = 2 corresponds to an edge, without loss of generality we can and will assume throughout the rest of the paper that s ≥ 3. We write n = |V | for the number of vertices of G and d for the maximum degree (which can be n). We assume that we have adjacency list access to G. In addition, we will only consider constant-sized motifs, meaning that s is independent of n.
For the first quantum algorithm presented below, the run-time depends on whether the adjacency lists of the input graph are sorted (according to some chosen ordering of all vertices in V) or not. For the other two, we can sort the adjacency lists of the input graph beforehand without affecting their (asymptotic) run-times, and then use the algorithms as described in Section 6, which assume that access to the input graph is provided via sorted adjacency lists.
Our quantum algorithms require coherent access to the input graph. If, rather than coherent access, we are given classical access to the input graph, we will first need to load the input graph into QRAM in time O(nd). In addition, sorting the adjacency lists takes time O(nd). If we have to sort, load to QRAM, or both, we say that we need to pre-process the input graph. In Appendix C we discuss how pre-processing the input graph affects the run-times of our algorithms.

Algorithms and their run-times
Below we provide the complexities for constructing the motif graph and obtaining the k eigenvectors corresponding to the smallest k eigenvalues of the motif graph Laplacian, where the latter are then used as input to k-means clustering. k-means clustering itself is a heuristic algorithm, with an exponential upper bound to its run-time, but in practice is usually significantly faster than this. For 'well-clusterable' graphs, the k-means part of the algorithm can be done in nearly-linear time [PSZ15]. For simplicity we will denote the time it takes to run k-means by T k-means , and note that generally this will not be the most expensive part of the algorithms.
Classical The classical algorithm described in Section 5 runs in time $\tilde{O}(nd^{s-1}) + T_{k\text{-means}}$ for the entire k-means motif spectral clustering algorithm. This is essentially optimal for any algorithm that makes use of the motif Laplacian, since to construct the motif graph exactly one needs to have counted all motif instances in the graph, the number of which can be as large as $nd^{s-1}$, and hence counting these classically requires $\Omega(nd^{s-1})$ queries to the input graph via standard lower bounds on the query complexity of counting.
Quantum via Grover search Our first quantum algorithm uses Grover search plus classical subroutines to find all motif instances in the graph and compute the weights of the edges in the motif graph exactly, before applying the classical spectral clustering based on the spectral estimation algorithm of [ST14].
Theorem 1 (Motif clustering via Grover search). Given a graph $G$ with maximum degree $d$ and a motif $M$ of size $s$, there exist quantum algorithms for exact motif clustering under the following conditions, with run-times that depend on $\mathcal{M}$, the total number of motif instances in the graph $G$:
1. If we do not have coherent access to the input graph, then there is a quantum algorithm for motif clustering with a bound on its expected run-time in terms of $\mathcal{M}$.
2. If we have coherent access to the input graph, and the adjacency lists are sorted, then there is an algorithm whose run-time is the bound referred to as Eq. (5).
3. If the adjacency lists are not sorted, then they can be pre-sorted to yield a quantum algorithm with the same run-time as given in Eq. (5). Otherwise, there is a quantum algorithm with a separate bound on its expected run-time.
The run-times of Algorithms 1 and 2 above are analysed in Section 6.2. Given coherent access to the input graph, which of Algorithms 2 or 3 to use depends on $\mathcal{M}$. The second is faster in case there are relatively few motif instances in total, i.e. when $\mathcal{M} = o(nd^{s-2})$.
If we lack any knowledge of a non-trivial bound on M , the sensible choice is to first sort all adjacency lists, since the algorithm so obtained is never slower than its classical counterpart.
In general, the best upper bound we can put on $\mathcal{M}$ is $nd^{s-1}$, in which case the quantum algorithm runs in time $\tilde{O}(nd^{s-1})$, which is no better than classical. Hence, this algorithm provides a speedup whenever $\mathcal{M} = o(nd^{s-1})$; we later show that for scale-free networks, which occur often in practice, this is indeed the case. The advantage of this algorithm is that by constructing the motif graph exactly, it can be used to perform spectral clustering using the eigenvectors of the ordinary as well as the normalized Laplacian; see Section 5 below for a detailed discussion on the difference between clustering with the Laplacian or its normalized counterpart.
Quantum via approximate counting and classical spectral clustering Our second quantum algorithm uses quantum approximate counting to estimate the weights of the motif graph, followed by a classical spectral clustering routine. To perform spectral clustering using the eigenvectors of the unnormalized Laplacian, it is sufficient to approximate the entries of the motif adjacency matrix up to constant multiplicative error, and then use the spectral clustering algorithm based on the spectral estimation algorithm of [ST14]. We can also perform spectral clustering using the normalized Laplacian, but in this case we lack the theoretical guarantees present in the unnormalized case. However, our numerical simulations suggest that spectral clustering with the normalized Laplacian on the approximate motif graph does actually work in practice; see Section 6 for details. We prove the following result in Section 6.3.
in expectation, where l is the maximum distance between any two anchor nodes in the motif.
Hence, we obtain a speedup over the classical algorithm whenever $l < s/2$. As a simple example, consider the case of a triangle motif, so that $s = 3$ and $l = 1$, and where the input graph is dense ($d = \Theta(n)$). The classical run-time for motif clustering in this case is $O(n^3)$, but the quantum run-time is $\tilde{O}(n^{2.5})$.
Note that, in contrast, constructing the motif graph approximately doesn't generally help us in the classical case: if we were to estimate the weights of the edges of the motif graph with constant relative error, then this would take time $\tilde{O}(nd^{l+s-1})$, which already for $l = 1$ is worse than even the exact version described above. Hence, using approximate counting only buys us something in the quantum case.
Quantum via approximate counting and quantum spectral clustering It is sometimes more efficient to provide query access to the approximate motif graph $\mathcal{G}$ rather than to construct it explicitly beforehand. We show the following in Section 6.3.3, by combining the quantum spectral clustering algorithm of Apers and de Wolf [AdW20] with query access to the approximate motif graph. The resulting run-time is independent of whether we have to pre-process the input graph or not. Hence, whenever $d^l = \omega(\sqrt{n})$ (which is the case for, for example, dense graphs), this algorithm is more efficient than the algorithm of Theorem 2 above which constructs the approximate motif graph explicitly. It should be noted that, if we choose to use this algorithm for motif clustering, we can, generally speaking, only cluster using the unnormalized Laplacian, because we lose the ability to filter out the vertices that become disconnected in the motif graph; see Section 5.2.3 for a discussion on this point.

Summary
In all of our quantum algorithms, the speedups come primarily from faster computation of the weights in the motif graph, and in one also from the application of quantum spectral clustering via [AdW20]. In the worst case the speedup over the classical algorithm is minimal, but for many natural families of graphs the speedup can be reasonably large (i.e. quadratic). In Table 1 we summarize the (expected) complexities of the classical and our three quantum algorithms for performing motif clustering on a general graph using an arbitrary motif. Furthermore, we consider the complexity for power-law graphs, the latter being a model of many naturally occurring graph families, such as social networks and internet graphs [FFF11]. In particular, we find that, if we take the motif to be a clique with two anchor nodes, the speedup becomes more significant as the size of the motif grows. The corresponding run-time complexities are given in Section 6.4.

Anchor nodes
Our algorithms for constructing the motif graph work for motifs with an arbitrary number of anchor nodes. However, as in the classical case, clustering by means of the motif graph $\mathcal{G}$ in order to obtain a clustering that approximately minimizes motif conductance or motif ratio cut in the original graph $G$ only works for motifs with two or three anchor nodes.
In fact, in Appendix D, we argue that motif clustering by means of the motif graph should only be used for motifs with two anchor nodes, or weighted combinations thereof (to be made precise in Appendix D). The reason for this is that the motif graph, being a graph itself, only captures pairwise relationships between vertices, and a motif with two anchor nodes exactly expresses a pairwise relationship between its anchor nodes.
Motifs with more than two anchor nodes attempt to capture relationships between more than two vertices, and such relationships should be described by a hypergraph instead. This statement seems incompatible with the fact that Benson et al. perform motif clustering using the motif graph for three-anchor node motifs. However, as we show in Appendix D, clustering using a three-anchor node motif is equivalent to clustering with a specific weighted combination of two-anchor node motifs. This equivalence breaks down for motifs with more than three anchor nodes.

Classical algorithms for motif spectral clustering
In this section we follow [BGL16] and describe classical algorithms for finding a partition that approximately minimizes the motif conductance or motif ratio cut. The algorithms consist of the following two steps: given the graph $G$ and a motif $M$, first construct the motif graph $\mathcal{G}$; second, perform k-means spectral clustering on $\mathcal{G}$ using the normalized (resp. unnormalized) Laplacian in order to find a partition with a low conductance (resp. ratio cut). An overview of the run-times of the motif spectral clustering algorithms discussed in this section is given in Table 2, where $n = |V|$ is the number of vertices and $d$ the maximum degree of the original graph $G$, $m = |\mathcal{E}|$ is the number of edges of the motif graph $\mathcal{G}$, $s = |V_M|$ is the size of the motif $M$, $k$ is the number of clusters, and $\epsilon$ is the relative accuracy with which the eigenvectors of the Laplacian are approximated.

[Table 2: run-times of the classical motif spectral clustering algorithms. Columns: Algorithm, Specifics, Run-time.]

Note that $m \le n d^{l} \le n d^{s-1}$, since any two anchor nodes of a given motif instance are in each other's l-hop neighborhood, and trivially also $m \le n^{2}$. If we use ϵ-approximate eigenvectors to perform spectral clustering (for which we only require constant ϵ > 0, see Section 5.2, and k is also constant), the time to perform the ϵ-approximate k-means step is $\tilde{O}(n d^{l})$, and therefore the construction of the motif graph becomes the bottleneck in the complexity of the entire computation, assuming k-means runs in nearly-linear time.

Constructing the motif graph
Let G be an n-vertex, m-edge graph, and M be a motif of size s. In order to construct the motif adjacency matrix A, we need to find all instances of M in G. The most straightforward (and in general the optimal) way to find all motif instances is to simply consider all $\binom{n}{s}$ s-sized subsets of the vertex set V and check whether they form motifs for every choice of anchor nodes in each subset. Checking if a given subset of s vertices forms a motif instance requires $O(s^{2}) = O(1)$ checks, since s is constant. Therefore, the entire process of finding all motif instances takes $O(n^{s})$ time.
If we have adjacency list access to G and know that it has maximum degree d, then we can more efficiently search for possible motif instances: since the motif is always taken to be a connected graph, every motif instance can be found by (i) picking an initial vertex u ∈ V , and (ii) growing the motif instance from u by exploring the local neighbourhood of u in order to find s − 1 more vertices that might yield a match to the motif. Since each vertex has degree at most d, there are at most d s−1 possible choices of s vertices in the local neighbourhood of each vertex, and hence this process requires O(nd s−1 ) time to check all connected s-tuples of nodes for motif instances. We describe a classical procedure for constructing these subsets in Section 6.1.1.
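The brute-force construction of the motif adjacency matrix can be illustrated with a minimal Python sketch. It is not the paper's procedure: the function name `motif_adjacency` and its input conventions are hypothetical. It assumes that a motif instance is identified by its vertex set together with its set of anchor vertices, and that matches are induced (both edges and non-edges of the motif must agree with G).

```python
from itertools import permutations


def motif_adjacency(n, edges, motif_edges, s, anchors):
    """Brute-force motif adjacency matrix (illustrative, O(n^s) time).

    n           -- number of vertices of G, labelled 0..n-1
    edges       -- iterable of undirected edges (u, v) of G
    motif_edges -- undirected edges of the motif on labels 0..s-1
    anchors     -- motif labels that are anchor nodes
    """
    E = {frozenset(e) for e in edges}
    M = {frozenset(e) for e in motif_edges}
    instances = set()
    for tup in permutations(range(n), s):
        # Induced match: edge in G iff edge in the motif, for every pair.
        ok = all(
            (frozenset((tup[i], tup[j])) in E) == (frozenset((i, j)) in M)
            for i in range(s)
            for j in range(i + 1, s)
        )
        if ok:
            # Assumption: an instance is (vertex set, anchor set), so
            # symmetric re-discoveries collapse to one entry.
            instances.add(
                (frozenset(tup), frozenset(tup[a] for a in anchors))
            )
    A = [[0] * n for _ in range(n)]
    for _, anchor_set in instances:
        for u in anchor_set:
            for v in anchor_set:
                if u != v:
                    A[u][v] += 1
    return A


# Triangle 0-1-2 with a pendant edge 2-3; motif = triangle, all anchors.
A_example = motif_adjacency(
    4, [(0, 1), (1, 2), (0, 2), (2, 3)], [(0, 1), (1, 2), (0, 2)], 3, (0, 1, 2)
)
```

Deduplicating through a set sidesteps the over-counting caused by motif symmetries, at the cost of storing every instance; the degree-bounded search described above would replace the outer loop over all s-tuples.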
The complexities presented above hold for general motifs. However, for certain motifs the time to find all instances can sometimes be faster -for example, all triangles in a graph can be found using Θ(m 1.5 ) queries [Lat08]; all induced and non-induced 'position-aware' motifs of size at most 4 can be found using Θ(m 2 ) queries [MS10]; and quadrangles can be found using Θ(m 1.5 ) queries [CN85] -see the appendix of [BGL16] for more details.

k-means spectral clustering
Given an integer k ∈ N, and (say, adjacency list) access to A, we can next proceed to search for a partition W 1 , . . . , W k that minimizes either the motif ratio cut or the motif conductance of G by minimizing the ordinary ratio cut or conductance in G as described at the end of Section 3.3. Both tasks, which are NP-hard for worst-case instances [WW93], can be tackled using k-means spectral clustering 13 (see [vL07] and references therein), which finds partitions that approximately minimize either the ratio cut or the conductance.
Whether spectral clustering minimizes ratio cut or conductance depends on whether it is performed using the ordinary or the normalized Laplacian of the motif graph. Let D be the diagonal weighted motif degree matrix of G, with coefficients $D_{uu} = s_u$, where $s_u$ is the strength of vertex u, and define the motif Laplacian L by

$$L = D - A,$$

and the normalized motif Laplacian by

$$L_{\mathrm{norm}} = D^{-1/2} L D^{-1/2}.$$

13 k-means spectral clustering solves a relaxed version of the NP-hard conductance or ratio cut minimization problem. It outputs clusters that are close to the optimal clusters for well-clustered graphs [PSZ15].

Minimizing ratio cut
In order to find a partition with small ratio cut, we can perform spectral clustering using the unnormalized Laplacian L. This works as follows.
1. Compute the k eigenvectors of L corresponding to the k smallest eigenvalues, and let U be the n × k matrix containing the first k eigenvectors as columns.
2. For i ∈ [n], let u i be the i-th row of U . Each k-dimensional row vector u i ∈ R k can be thought of as a feature vector for the i-th vertex of V .
3. Cluster the vertices of V by performing k-means clustering on the n feature vectors $u_1, \dots, u_n$.

Minimizing conductance
If, instead of ratio cut, we want to find a partition with low conductance, we can apply spectral clustering to the normalized Laplacian $L_{\mathrm{norm}}$:
1. Compute the first k eigenvectors of $L_{\mathrm{norm}}$ corresponding to the smallest k eigenvalues, and let $\tilde{U}$ be the n × k matrix containing the first k eigenvectors as columns.
2. Let U be the matrix obtained by taking $\tilde{U}$ and renormalizing all the rows to have norm 1, that is: $U_{ij} = \tilde{U}_{ij} / \big(\sum_{l=1}^{k} \tilde{U}_{il}^{2}\big)^{1/2}$.
3. For i ∈ [n], let $u_i$ be the i-th row of U. Each k-dimensional row vector $u_i \in \mathbb{R}^{k}$ can be thought of as a feature vector for the i-th vertex of V.
4. Cluster the vertices of V by performing k-means clustering on the n feature vectors $u_1, \dots, u_n$.

Note that, for d-regular graphs, $\mathrm{vol}_G(W) = d\,|W|$. As a consequence, for these graphs minimizing the conductance is equivalent to minimizing ratio cut.
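The two procedures above can be sketched in a few lines of NumPy. This is an illustrative sketch only: it uses exact diagonalization rather than the approximate eigenvector routines discussed later, and the small Lloyd-style `kmeans` helper with farthest-point initialisation is an assumption made to keep the example self-contained and deterministic. The names `kmeans` and `spectral_clustering` are hypothetical.

```python
import numpy as np


def kmeans(X, k, iters=50):
    """Plain Lloyd iterations with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def spectral_clustering(A, k, normalized=False):
    """Cluster using the k smallest eigenvectors of L or L_norm.

    Assumes A is symmetric with non-negative weights and, when
    normalized=True, that no vertex has zero degree.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    L = np.diag(deg) - A
    if normalized:
        d_inv_sqrt = 1.0 / np.sqrt(deg)
        L = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    U = vecs[:, :k]
    if normalized:
        # Renormalize every row to unit Euclidean norm.
        U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return kmeans(U, k)


# Two 4-cliques joined by the single edge (3, 4).
A_demo = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A_demo[i, j] = 1.0
A_demo[3, 4] = A_demo[4, 3] = 1.0
labels_ratio_cut = spectral_clustering(A_demo, 2)
labels_conductance = spectral_clustering(A_demo, 2, normalized=True)
```

On this small example both variants should separate the two cliques; on real motif graphs the exact `eigh` call would be replaced by an approximate eigenvector routine.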

Disconnected vertices
It is possible for certain vertices in G to become entirely disconnected in the motif graph because their motif degree is zero. If we want to use the normalized Laplacian for spectral clustering, then we first have to remove all such vertices, since the normalized Laplacian is obtained by multiplying the original Laplacian by $D^{-1/2}$, which is undefined when some degree is zero. Moreover, if we were to perform spectral k-means clustering (using the unnormalized Laplacian) on the motif graph with all of V as its vertex set, the algorithm could just output several clusters containing a single disconnected vertex each and place the remaining vertices into one or more larger clusters to minimize ratio cut. From the perspective of the motif adjacency matrix, the clusters containing a single disconnected vertex are not very interesting. Hence, after the construction of the motif graph we should remove all vertices u ∈ V that have zero (motif) degree $d_u = 0$ and put each of them in their own size-one cluster. The remaining vertex set $\{u \in V : d_u > 0\}$ will then be the vertex set of the motif graph on which we perform k-means spectral clustering. Note that this procedure yields a number of clusters that is equal to k plus the number of vertices in V that are not an anchor node of any motif instance in G.
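As a small illustration of this pre-processing step, the following sketch splits a motif adjacency matrix into its zero-degree vertices and the remaining sub-matrix. The helper name `split_isolated` is hypothetical, and a dense NumPy matrix is assumed purely for brevity.

```python
import numpy as np


def split_isolated(A):
    """Return (A restricted to non-isolated vertices, kept ids, isolated ids)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    keep = np.flatnonzero(deg > 0)       # vertices with positive motif degree
    isolated = np.flatnonzero(deg == 0)  # each becomes its own size-one cluster
    return A[np.ix_(keep, keep)], keep, isolated


# Vertex 1 takes part in no motif instance.
A_sub, kept, singles = split_isolated([[0, 0, 2], [0, 0, 0], [2, 0, 0]])
```

Clustering would then run on `A_sub` only, with the entries of `singles` reported as additional singleton clusters.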
In the remainder of this work, we will not emphasise this practical detail and simply write V for the vertex set of G. Note that removing disconnected vertices does not affect our run-time upper bounds, since none of our algorithms run in sub-linear time (a single step of k-means takes time linear in n).

Complexity
For an n-vertex, m-edge motif graph G, it is possible to compute the eigenvectors of L or $L_{\mathrm{norm}}$ in $O(n^{3})$ time via exact diagonalization. However, we can also use ϵ-approximate spectral clustering to find an ϵ-approximation to the k smallest eigenvalues $\lambda_1, \dots, \lambda_k$ of the (normalized) Laplacian together with a set of orthonormal unit vectors $v_1, \dots, v_k$ [ST14,KLP15]. This set of unit vectors approximates the subspace spanned by the k smallest eigenvectors of L, and is suitable for performing spectral clustering, even for constant ϵ > 0 [PSZ15, AdW20]. Finally, we use k-means, which takes time $T_{k\text{-means}}$, adding up to a total run-time of $\tilde{O}(m + T_{k\text{-means}})$, since one step of k-means already takes time $nk^{2}$ and ϵ is constant.
As described in the references above, the method for finding approximate eigenvectors makes use of a graph sparsification algorithm which, given the graph G, constructs a spectral sparsifier G S of G, and then uses the inverse power method on the graph Laplacian L S corresponding to G S . This method can also be used to construct approximate eigenvectors of the normalized Laplacian L norm , by applying the inverse power method to D −1/2 L S D −1/2 , where D is the degree matrix of the unsparsified graph G [KLP15].

Quantum motif clustering
In this section we present three quantum algorithms for motif spectral clustering, one using Grover search, and the other two using quantum approximate counting. All three algorithms consist of two steps: (i) construct the motif graph G, and (ii) perform spectral clustering on G. As discussed in Section 4, the bottleneck for motif clustering is in step (i), constructing the motif graph, and this is also where our contribution lies; for step (ii) we use either the spectral clustering algorithm based on the spectral estimation algorithm of [ST14] or the algorithm for quantum spectral clustering of Apers and de Wolf [AdW20]. We begin by introducing the quantum tools that we make use of, followed by a description in Section 6.1 of a classical subroutine for exploring the local neighbourhood of vertices in a graph according to a particular motif structure. In Section 6.2 we discuss how Grover search can be applied to find all motif instances in order to do motif clustering, and in Section 6.3 we use quantum approximate counting to construct an approximation to the motif graph, and discuss under what conditions this approximation can be used for motif clustering. We then compare the run-times of all approaches in Section 6.4. Subsequently, in Section 7, we provide details to justify the use of approximations in the context of spectral clustering.

Preliminaries
We will find the following quantum subroutines useful. For each, we consider a Boolean function f : [N] → {0, 1} on N items, with t = |{x : f(x) = 1}| the (unknown) number of 'marked' items. We will assume that we have oracle access to f, i.e. a unitary $O_f$ that acts

Lemma 2 (Grover search with an unknown number of marked items [BBHT98]). There exists a quantum algorithm Search($O_f$) that, with probability at least 2/3, finds and returns

Using

Finally, we will use a quantum algorithm for spectral k-means clustering from Apers and de Wolf [AdW20], which itself uses the (classical) algorithm of Spielman and Teng [ST14] to find approximations to the first k eigenvectors of a graph Laplacian obtained via quantum graph sparsification.
Lemma 5 (Quantum spectral estimation [AdW20]). Given adjacency list access to an n-vertex weighted graph G with m edges, there exists an $\tilde{O}(\sqrt{mn}/\epsilon + kn/\epsilon^{2})$-time quantum algorithm that outputs, with high probability, an ϵ-approximation of each of the k smallest eigenvalues $\lambda_1, \dots, \lambda_k$ of the graph Laplacian L, and a set of orthogonal unit vectors v 1 , .
It turns out that choosing ϵ to be constant is already enough to perform spectral clustering [AdW20], and hence Apers and de Wolf note the following.

Corollary 2 (Quantum spectral clustering [AdW20]). There exists a quantum algorithm that, given adjacency list access to an n-vertex weighted graph with m edges, performs spectral k-means clustering on the graph in time $\tilde{O}(\sqrt{mn} + T_{k\text{-means}})$.

Classically, a fast algorithm for (approximate) k-means spectral clustering can be obtained by combining the spectral estimation routine of Spielman and Teng [ST14] with constant error with a k-means clustering algorithm, yielding the following.

Lemma 6 (Spectral clustering [ST14]). There exists a classical algorithm that, given adjacency list access to an n-vertex weighted graph with m edges, performs spectral k-means clustering on the graph in time $\tilde{O}(m + T_{k\text{-means}})$.

Exploring the 'motif neighbourhood' of a vertex
Here we describe a short classical algorithm which, given a vertex u, motif M of size s, and a sequence of integers I of length s − 1, can be used to return a pairing of s vertices around u to vertices in M , such that those vertices are candidates for a match of the motif in the graph. We call this procedure a 'tree walk', for reasons that will become apparent. More precisely, given a tree T of t vertices, a graph G, and a vertex v, a tree walk explores the neighbourhood of v in G by constructing the tree T locally out of the neighbours of v. The output is a list of size t that identifies vertices in G with vertices in the tree T . We give details of the tree walk in Algorithm 1, and show an example of two outcomes of a tree walk in Figure 2. For any fixed input, the tree walk algorithm takes time linear in the number of edges in the tree T . Note that it is possible for Algorithm 1 to return a list L shorter than desired (T ). This will not be a problem for us.
Algorithm 1 Tree walk
   for each child vertex c of r do
4:     Let $i_j$ be the first element from I
5:     Let u be the $i_j$-th neighbour of v in G (if v has no $i_j$-th neighbour, set u = Ø)
6:     Let $T_c$ be the sub-tree rooted at c
7:
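Since only part of the listing of Algorithm 1 survives here, the following Python sketch shows one way a tree walk with the described behaviour could look. It is an interpretation, not the paper's algorithm: the recursion on sub-trees, the 1-based neighbour indices, and the choice to stop as soon as a required neighbour is missing (which yields a shorter list) are all assumptions, as are the names `tree_walk`, `children` and `adj`.

```python
def tree_walk(children, root, adj, v, I):
    """Pair tree vertices with graph vertices by walking G from v.

    children -- dict: tree vertex -> list of its child tree vertices
    root     -- root r of the tree T
    adj      -- dict: graph vertex -> ordered list of neighbours
    v        -- graph vertex paired with the root
    I        -- sequence of 1-based neighbour indices, one per tree edge
    Returns a list of (tree_vertex, graph_vertex) pairs; it is shorter
    than the tree if some requested neighbour does not exist.
    """
    remaining = list(I)
    pairs = [(root, v)]

    def walk(r, x):
        for c in children.get(r, []):
            if not remaining:
                return False
            i = remaining.pop(0)      # first unused element of I
            nbrs = adj.get(x, [])
            if i > len(nbrs):         # x has no i-th neighbour (u = Ø)
                return False
            u = nbrs[i - 1]
            pairs.append((c, u))
            if not walk(c, u):        # continue in the sub-tree rooted at c
                return False
        return True

    walk(root, v)
    return pairs


# Path-shaped tree a -> b -> c walked on a triangle graph.
tree = {"a": ["b"], "b": ["c"]}
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
full = tree_walk(tree, "a", graph, 0, [2, 1])
short = tree_walk(tree, "a", graph, 0, [3, 1])
```

The walk touches each tree edge at most once, matching the linear-time claim above. Note that the returned vertices are only candidates: the same graph vertex may appear twice, so a separate match check is still required.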

Motif clustering with Grover search
The most straightforward way to speed up classical motif clustering is to replace the search for motif instances by a Grover search. In doing so, we find all motif instances in G and construct G exactly, and provide a generic speedup over the classical approach. The advantage of producing the adjacency lists exactly is that we can then cluster based on both the normalized and unnormalized motif Laplacian with the usual theoretical guarantees. In Sections 6.2.1 and 6.2.2 below we prove Theorem 1.
Checking for motif matches in the graph

We will routinely need to check if a set of s vertices in the input graph G corresponds to a match of the motif M. To do this, we need to check if all the edges (resp. non-edges) of M are present (resp. not present) in G. Since we only assume adjacency list access to G, this will incur some overhead. In particular, to check for an edge $(v_0, v_1)$ we must search the adjacency list of $v_0$ for $v_1$, which takes time O(log d) if the adjacency lists are sorted. If the adjacency lists are not sorted, then we can instead use a single application of Search from Lemma 2 to detect the presence of the edge with probability ≥ 1 − ϵ in time $O(\sqrt{d}\,\log(1/\epsilon))$. If we apply this subroutine N ≥ 1 times and we want all the instances as a whole to succeed with probability at least 1 − C (for any constant C > 0 of our choosing), then we need ϵ = C/N by the union bound. The number of times the function Match is called is $N = O(\sqrt{n d^{s-1} M})$ in Algorithm 2, which is also the number of times the subroutine for checking edges is called (since s is constant). Because this is at most polynomial in n, it will only add a logarithmic overhead to the run-time of Search for edges, which therefore takes time $\tilde{O}(s^{2}\sqrt{d}) = \tilde{O}(\sqrt{d})$.
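For the sorted-adjacency-list case, the match check and the union-bound choice of the per-call error can be sketched classically as follows. The helper names (`has_edge`, `is_match`, `per_call_error`) are hypothetical, and the sketch assumes induced matching with a fixed pairing of the i-th candidate vertex to the i-th motif vertex.

```python
from bisect import bisect_left


def has_edge(adj, u, v):
    """Binary search for v in the sorted adjacency list of u: O(log d)."""
    nbrs = adj[u]
    i = bisect_left(nbrs, v)
    return i < len(nbrs) and nbrs[i] == v


def is_match(adj, vertices, motif_edges):
    """Check whether vertices[i] <-> motif vertex i is an induced match."""
    s = len(vertices)
    if len(set(vertices)) != s:   # candidates must be distinct vertices
        return False
    M = {frozenset(e) for e in motif_edges}
    return all(
        has_edge(adj, vertices[i], vertices[j]) == (frozenset((i, j)) in M)
        for i in range(s)
        for j in range(i + 1, s)
    )


def per_call_error(C, N):
    """Union bound: N calls each failing w.p. <= C/N fail jointly w.p. <= C."""
    return C / N


adj_sorted = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
```

With s constant, `is_match` performs O(s^2) edge look-ups, so its cost is dominated by the per-edge search, which is the quantity the text bounds.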

Constructing G via Grover search
Our first algorithm is a basic application of Grover search to find all matches of the motif within the graph, which with some short (classical) post-processing allows us to obtain the motif adjacency lists {J u } u∈V exactly (i.e. without errors on the weights). As we construct the motif adjacency lists, we keep them ordered according to some arbitrary but fixed ordering of the vertices in V . The algorithm makes direct use of the TreeWalk sub-routine from Algorithm 1.  If the motif M is symmetric under non-trivial motif isomorphisms, then Algorithm 2 will find each motif instance exactly S M times, where S M is the number of motif isomorphisms of M . Because s is constant, so is S M , and therefore we incur a constant overhead in the presence of motif symmetries; see Appendix A for details.

Clustering using Motif-Grover
In the previous section we established how to obtain adjacency list access to the (exact) motif graph G. To perform motif clustering, we apply k-means spectral clustering to G using the spectral clustering algorithm of Lemma 6, which results in a clustering that approximately minimizes the motif RatioCut (when applied to the ordinary Laplacian of G), or the motif conductance (when applied to the normalized Laplacian of G).

Motif clustering via quantum counting
If some reasonably weak conditions hold for the motif, then it can be faster to use quantum counting to obtain an ϵ-approximation of the motif graph (in the sense that the weights on the edges are approximated up to relative error ϵ), and then use this for motif clustering. In many cases a rough approximation to the graph is good enough for clustering, and we argue this both formally (in the case of clustering on the unnormalized Laplacian) and empirically (in the case of clustering on the normalized Laplacian) in Section 7.
In this section we present a quantum algorithm (Algorithm 5) for constructing the approximate motif graph using quantum approximate counting. As before, this algorithm can be combined with the algorithm of either Corollary 2 or Lemma 6 to obtain a quantum algorithm for (approximate) motif clustering. We begin by describing a quantum algorithm (Algorithm 4) for computing approximations to the entries of the motif adjacency matrix A, and then use this to construct the motif graph. More precisely, given vertices u and v, we provide a quantum algorithm that outputs an approximation $\tilde{A}_{uv}$ satisfying

$$|\tilde{A}_{uv} - A_{uv}| \le \epsilon A_{uv}$$

with probability at least 1 − δ for some choice of accuracy ϵ and probability of failure δ > 0.
The algorithm for approximate motif clustering described in this section assumes the motif M has two anchor nodes. However, our algorithm can easily be extended to motifs with more than 2 anchor nodes, since the motif graph can be constructed in this case by decomposing the motif into a combination of two-anchor-node motifs (see Appendix D.2 for details).

Algorithm 4
   Construct a spanning tree $T = (V_T, E_T)$ of M.
3: Split T into two trees: an $s_a$-vertex tree $T_a = (V_{T_a}, E_{T_a})$ rooted at $a \in V_A$ and an
4: Let $O_f$ be the unitary that implements Match on the trees $T_a$, $T_b$, graph G, vertices u, v, and taking as input two integer sequences

Proof. Our task is to approximately count the number of motifs present in the graph G that both u and v appear in (as anchor nodes). Algorithm 4 achieves this by using tree walks to explore locally the areas around u and v, in search of a set of vertices and edges that match the motif structure, and then uses approximate quantum counting to estimate the number of motifs containing u and v.
As described in Algorithm 4, we start by (classically) constructing a spanning tree $T = (V_T, E_T)$ of the motif M, for example using breadth-first search in time $O(s^{2}) = O(1)$. Next, we remove an edge of the tree in such a way that the two newly formed trees contain one anchor node a and b each, which yields an $s_a$-vertex tree $T_a = (V_{T_a}, E_{T_a})$ rooted at a and an $s_b$-vertex tree $T_b = (V_{T_b}, E_{T_b})$ rooted at b; see Figure 3 for an example.
We then fix two integer sequences that uniquely define a tree walk on each tree:

Since we only assume adjacency list access to the edges of G, this incurs some overhead. In particular, given vertices $v_0$, $v_1$ from G, we must query all of the neighbours of $v_0$ for the presence (resp. non-presence) of $v_1$ to check for the edge (resp. non-edge) $(v_0, v_1)$. Since the adjacency lists are sorted, this can be done in time O(log d) (recall that s is constant). Finally, in the presence of symmetries within the motif, we note that the quantum counting routine will over-count motif matches. In Appendix A we work out exactly how many duplicates will be found, and show that this quantity, S we obtain our estimate $\tilde{A}_{uv}$ (note that this does not affect the accuracy of the estimate).

Constructing G via quantum approximate counting
As we discuss in Section 7, for the purpose of clustering it turns out that approximating the edge weights up to constant relative error is sufficient, and, as we will see, provides a speedup over the classical algorithm when the motif length l satisfies 2l < s. Again we construct G by explicitly constructing the motif adjacency lists for each vertex, as described in Algorithm 5.
Algorithm 5
2: for every vertex u ∈ V do
3:     Find the l-hop neighborhood $N^{l}_{u}$ of u via breadth-first search.
4:     for every vertex $v \in N^{l}_{u}$ do
5:         Use Algorithm 4 to obtain $\tilde{A}_{uv}$, an approximation of $A_{uv}$ up to relative error ϵ and with probability ≥ $1 - \delta/n^{2}$.

The output of this algorithm is a classical description of an approximation of the motif adjacency lists. These lists store approximations to the non-zero entries of the motif adjacency matrix A, and they satisfy, for all u, v ∈ V,

$$(1 - \epsilon) A_{uv} \le \tilde{A}_{uv} \le (1 + \epsilon) A_{uv}.$$
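The control flow of this construction can be sketched classically as below. The quantum counting step is replaced by a user-supplied callback `estimate(u, v)`, which is a stand-in and not part of the paper; the sketch also assumes that the l-hop neighbourhood means all vertices at distance between 1 and l from u. The function names are hypothetical.

```python
from collections import deque


def l_hop_neighbourhood(adj, u, l):
    """Vertices at distance 1..l from u, found by breadth-first search."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == l:
            continue  # do not expand beyond l hops
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return [v for v in dist if v != u]


def approximate_motif_lists(adj, l, estimate):
    """Build approximate motif adjacency lists {u: [(v, weight), ...]}.

    estimate(u, v) stands in for the approximate counting routine and is
    assumed to return 0 exactly when u and v share no motif instance.
    """
    lists = {}
    for u in adj:
        lists[u] = []
        for v in sorted(l_hop_neighbourhood(adj, u, l)):
            w = estimate(u, v)
            if w > 0:  # keep only the non-zero entries
                lists[u].append((v, w))
    return lists


# Path graph 0-1-2-3 with a toy estimator that weights adjacent pairs.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
toy_lists = approximate_motif_lists(
    path_adj, 2, lambda u, v: 1.0 if abs(u - v) == 1 else 0.0
)
```

Each vertex triggers at most d^l calls to the estimator, which is where the n d^l factor in the run-time of the construction comes from.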

Motif clustering with quantum counting and classical clustering
Given the approximate motif graph G constructed using Algorithm 5, we next proceed to cluster the vertex set of G, with the approximate adjacency lists { J u } u∈V used to provide access. We first consider spectral clustering on the approximate unnormalized Laplacian, meaning that the clusters aim to minimize RatioCut. The input is an unweighted graph G = (V, E) on n vertices and a motif M of size s with two anchor nodes at distance l from each other. We apply Algorithm 5 to obtain the adjacency lists of the approximate motif graphG up to fixed relative error ϵ > 0, and then use the algorithm of Lemma 6 to clusterG .
Let $\tilde{D}$ be the diagonal matrix of (approximate) motif degrees obtained from $\tilde{A}$, and $\tilde{L} = \tilde{D} - \tilde{A}$ be the approximate motif Laplacian. By Lemma 8 (which we will prove later), the spectral structure of the true motif graph Laplacian L of G is preserved by our approximation, i.e.

$$(1 - \epsilon) L \preceq \tilde{L} \preceq (1 + \epsilon) L. \qquad (7)$$
This property is necessary for applying the Spectral Clustering algorithm in Lemma 6, where it suffices to choose ϵ to be some small constant. Our quantum motif clustering algorithm is given in Algorithm 6 below, which can also be used to cluster using the normalized approximate motif graph Laplacian, though here we don't have the same theoretical guarantees -see next paragraph.
Algorithm 6 Quantum motif clustering via quantum approximate counting and classical spectral clustering
Input: A graph G = (V, E), motif M with two anchor nodes and integer k ≥ 1.

hence proving Theorem 2. In case we need to pre-process the input graph in advance, the run-time complexity will remain the same (see Appendix C).
Clustering with the normalized Laplacian As we will show in Section 7, the equivalent of Eq. (7) for normalized Laplacians does not hold in general. This means that, in principle, we cannot apply spectral clustering to the normalized Laplacian for the approximate motif graph and keep the same theoretical guarantees. However, we can still make use of the approximate adjacency matrix if, somehow, we happen to know the motif degrees exactly -see Section 7.2 for details. Unfortunately, computing the motif degrees exactly takes as much time as doing a Grover search over all motif instances, and then the run-time is the same as the run-time of Algorithm 2. Nevertheless, in the absence of a firm theoretical footing, the numerical simulations presented in Section 7.3 suggest that, in practice, we can use the approximate motif graph and the corresponding approximate motif degrees to cluster successfully using the normalized Laplacian.

Motif clustering with quantum counting and quantum clustering
In the case that $d^{l} = \omega(\sqrt{n})$, it is possible to obtain a more efficient algorithm by not constructing the entire motif graph beforehand, but instead providing query access to it. To do this, we can assume that the motif graph is fully connected, but that non-edges have weight 0. Then, using Algorithm 4 of Lemma 7 with a constant ϵ, we can provide adjacency list access (which is now equivalent to adjacency matrix access) to the motif graph using $\tilde{O}(\sqrt{d^{s-2}})$ queries to the input graph G and $\tilde{O}(\sqrt{d^{s-2}})$ other operations, and then directly use the quantum spectral clustering algorithm of Apers and de Wolf from Corollary 2 to cluster, which will require $\tilde{O}(n^{3/2})$ queries to the adjacency lists of the motif graph. This will yield an algorithm for performing motif clustering that takes time $\tilde{O}(n^{3/2} d^{s/2-1})$, which proves Theorem 3. As before, pre-processing the input graph in advance does not affect the run-time (see Appendix C). The process described above is encapsulated in Algorithm 7.
Algorithm 7 Quantum motif clustering via quantum approximate counting and quantum spectral clustering
Input: A graph G = (V, E), motif M with two anchor nodes and integer k ≥ 1.

We note that, by providing query access to the motif graph rather than constructing it explicitly, we lose the ability to detect and remove isolated vertices in the motif graph. These will now be included in the graph provided as input to the clustering subroutine, which will almost certainly assign each isolated vertex to its own cluster. This may impact the quality of the solutions found by the motif clustering algorithm. However, as discussed in Section 4, we note that this version of the algorithm should only be used if $d^{l}$ and the total number of motif instances M are relatively large, in particular $d^{l} = \omega(\sqrt{n})$ and $M = \omega(n^{2}/d)$. In this case, the input graph is quite dense and there are reasonably many motif matches, so the number of isolated vertices in the motif graph is likely to be small.
Finally, we should note that for Algorithm 7, we can in general not use the normalized Laplacian for clustering, since constructing it could mean that we are dividing by zero due to some vertices possibly being disconnected.

Run-time comparisons
Next, we discuss how the run-time $O(n d^{s-1})$ of the classical algorithm compares to the run-times of the quantum algorithms introduced in this section. We will ignore the time it takes to do k-means, which is the same for all algorithms considered and which we assume is nearly-linear in n.
In order to investigate the run-time of our Grover-based Algorithm 3, let us take the more general starting point and assume that the adjacency lists are initially not sorted. In this case, we can either pre-process the input graph (which includes loading the input graph to QRAM if needed), or we can use Grover search to search through the adjacency lists (assuming we have coherent access to the input graph).
The run-time of Algorithm 3 depends on the number of motif instances M. If we pre-process the input graph, we get a speedup over the classical algorithm when $M = o(n d^{s-1})$, but this speedup is limited by the time it takes to do the sorting. If $M = \Omega(n d^{s-1})$, there is no speedup at all over the classical algorithm. In case of coherent access to unsorted adjacency lists of the input graph, we can also choose not to pre-sort the adjacency lists, but this only makes sense if running Algorithm 3 without pre-sorting is faster than sorting the adjacency lists. Comparing the upper bound of Õ( albeit that what will be best in practice will depend on how tight these upper bounds are.

The quantum algorithms that use quantum approximate counting are a bit easier to analyse because their run-times depend neither on M, nor on whether the adjacency lists of the input graph are sorted, nor on whether we have classical or coherent access to them, as we can always pre-process the input graph beforehand. The run-time of Algorithm 6 is $\tilde{O}(n d^{\,l + s/2 - 1})$, which for motifs with 2l < s provides a speedup over the classical algorithm. The run-time of Algorithm 7 is $\tilde{O}(n^{3/2} d^{\,s/2 - 1})$, providing a speedup over the classical algorithm if $\sqrt{n d^{s}} = o(d^{s})$, which will be the case for, for example, dense graphs.

Clique motifs in scale-free networks
Let us investigate the run-times of the quantum motif clustering algorithms for a class of network that occurs often in practice: so-called scale-free networks [AJB99,VPSV02,BB03,Prž07,LMvH09]. Such networks have degree distributions that can be well approximated by a power-law distribution so that the fraction of vertices of degree h scales as h −τ for some τ > 1.
As an example, consider motif clustering with the motif being an s-clique with two anchor nodes. We take s ≥ 3 to be a constant independent of n. For clique motifs the distance between the two anchor nodes is l = 1.
In [JvLS19], the authors consider the number of s-cliques present in power-law random graphs on n vertices with parameter τ ∈ (2, 3) (they consider the so-called "hidden variable" model, see [CL02,BPS03,BDML06,BJR07]) and find that its expected value is given by $O(n^{\frac{s}{2}(3-\tau)})$ as n → ∞. This means that for such graphs $E[M] = O(n^{\frac{s}{2}(3-\tau)})$, where the expectation is taken over the randomness of the input graph. The maximum degree in these power-law random graphs is given by $d = n^{1/(\tau-1)}$ [BPS03] (also known as the "natural cut-off").

Run-time upper bounds
Using the above upper bound for d together with Jensen's inequality, we obtain the following upper bound for the expected number of queries to the graph for Algorithm 3 without pre-sorting the adjacency lists: Similarly, we can obtain upper bounds to the run-times of the other quantum algorithms. These are given in Table 3.

Comparison of run-time upper bounds
The above (upper bounds for the) 14 run-times look somewhat complicated. Let us compare them to each other to see which algorithm has the fastest run-time depending on the choice of s and τ ∈ (2, 3). Intuitively, we expect that the Grover-based algorithms will perform well when there are few motif instances, i.e. when $E[M] = O(n^{\frac{s}{2}(3-\tau)})$ is small, which is the case for τ close to 3. As τ decreases, the graph becomes denser (since $d = n^{1/(\tau-1)} \to n$ as τ → 2), and we expect the quantum approximate counting based algorithms to do better. This intuition turns out to be correct.
First, observe that, since τ ∈ (2, 3), we have $1 \ge \frac{3}{2} - \frac{1}{\tau-1}$, and therefore quantum-approximate + quantum cluster is faster than quantum-approximate + classical cluster, and so we should always use the former given a choice between the two. Second, comparing the two versions of quantum-Grover, we observe that we should only not pre-sort if 1 2 + s 2 which is true only when s = 3, and $\tau \in [\tau_1, 3)$, where $\tau_1 = \frac{5+\sqrt{10}}{3} \approx 2.72$. In particular, for s ≥ 4, we should always pre-sort the adjacency lists given the choice between both quantum-Grover algorithms.
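The first comparison can be checked numerically with a short sketch that expresses each run-time as a power of n. It assumes the run-times stated in the run-time comparison (classical $n d^{s-1}$, counting with classical clustering $n d^{l+s/2-1}$, counting with quantum clustering $n^{3/2} d^{s/2-1}$), the natural cut-off $d = n^{1/(\tau-1)}$, and l = 1 for clique motifs; logarithmic factors are ignored, and the function name `runtime_exponents` is hypothetical.

```python
def runtime_exponents(s, tau, l=1):
    """Exponent of n in each run-time bound when d = n^(1/(tau-1))."""
    a = 1.0 / (tau - 1.0)  # d = n^a
    return {
        "classical": 1.0 + (s - 1) * a,
        "approx_classical": 1.0 + (l + s / 2 - 1) * a,
        "approx_quantum": 1.5 + (s / 2 - 1) * a,
    }


# Triangle motif (s = 3) at tau = 2.5.
example = runtime_exponents(3, 2.5)
```

The difference between the two counting-based exponents is $\frac{1}{\tau-1} - \frac{1}{2}$, which is non-negative exactly when τ ≤ 3, in line with the inequality above.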
Next, we compare the run-time upper bounds of the competing algorithms for different values of s. For s ≥ 4, the only two competing algorithms are quantum-approximate + quantum cluster and quantum-Grover (pre-process). Note that, in this regime for s, and therefore quantum-approximate + quantum cluster is always slower than the time it takes to pre-sort the adjacency lists. Hence, all that remains is to compare the second term in the run-time of quantum-Grover (pre-process) to the run-time of quantum-approximate + quantum cluster. After simplifying a bit, the intersection of the two run-times can be found by solving which lies in the interval (2, 3) for s ≥ 3.
As an example, in Table 4 we list the explicit upper bounds to the run-times for the fastest algorithms as given above for the cases of s = 3, s = 4, and s = 5, in the limits of τ → 2 and τ → 3, and compare them to the run-time of the classical algorithm (using the upper bound for the maximum degree given by the natural cut-off $d = n^{1/(\tau-1)}$).

Clustering on an approximate graph
In this section we provide analytical and numerical evidence to support the claim that performing spectral clustering using the Laplacian or the normalized Laplacian, respectively, on an approximation of a graph yields similar clusters to those that would be obtained by clustering on the actual graph. This is of particular relevance to us since our use of quantum approximate counting in Algorithms 6 and 7 produces only an approximation of the motif graph, with each edge weight approximated up to some small constant relative error. We will consider the reasonably general case in which we wish to perform spectral clustering on some graph G = (V, E) with real-valued adjacency matrix A with non-negative coefficients, but where we only have access to an ϵ-approximation $\tilde{A}$ of A, in the sense that $(1 - \epsilon)A_{uv} \le \tilde{A}_{uv} \le (1 + \epsilon)A_{uv}$ for all u, v ∈ V. Note that $A_{uv} = 0$ if and only if $\tilde{A}_{uv} = 0$, so A and $\tilde{A}$ have the same edge set E (as long as 0 < ϵ < 1). Our central question is whether performing spectral clustering on the approximate graph yields clusters that are similar to the clusters obtained by performing spectral clustering on G itself.

Approximating the unnormalized Laplacian
If we choose to cluster using the (unnormalized) Laplacian, then it turns out that we can answer the question above positively. That is: if we perturb the weights on the edges of a weighted graph by adding a multiplicative error, the spectrum of the Laplacian is preserved up to a similar multiplicative error.
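To make this precise, we first introduce some extra notation. Let G = (V, E) be an undirected graph with symmetric real-valued adjacency matrix A with non-negative coefficients. For a vertex u ∈ V, we define the indicator vector $\mathbf{1}_u$ to be the vector with $(\mathbf{1}_u)_i = 1$ if i = u and $(\mathbf{1}_u)_i = 0$ otherwise.

Proof. We need to show that both $\tilde{L} - L + \epsilon L$ and $L - \tilde{L} + \epsilon L$ are positive semi-definite. The inequalities $(1 - \epsilon)A_{uv} \le \tilde{A}_{uv} \le (1 + \epsilon)A_{uv}$ for every u, v ∈ V imply that there exists a matrix γ such that $\tilde{A}_{uv} - A_{uv} = \epsilon A_{uv}\gamma_{uv}$ with $\gamma_{uv} \in [-1, 1]$ for all u, v ∈ V. Hence,

$$\tilde{L} - L + \epsilon L = \sum_{(u,v) \in E} \epsilon A_{uv}(\gamma_{uv} + 1)\,(\mathbf{1}_u - \mathbf{1}_v)(\mathbf{1}_u - \mathbf{1}_v)^{T},$$

which is the Laplacian of a graph G = (V, E) where each edge (u, v) ∈ E has weight $\epsilon A_{uv}(\gamma_{uv} + 1) \ge 0$, and hence is itself a positive semi-definite matrix. Likewise,

$$L - \tilde{L} + \epsilon L = \sum_{(u,v) \in E} \epsilon A_{uv}(1 - \gamma_{uv})\,(\mathbf{1}_u - \mathbf{1}_v)(\mathbf{1}_u - \mathbf{1}_v)^{T}$$

is also the Laplacian of a graph G = (V, E) where each edge (u, v) ∈ E has weight $\epsilon A_{uv}(1 - \gamma_{uv}) \ge 0$, and therefore is also positive semi-definite.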
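The two positive semi-definiteness claims in the proof can be sanity-checked numerically. The sketch below draws a random weighted graph, perturbs every edge weight by a relative error of at most ϵ, and inspects the smallest eigenvalues of the two difference matrices. The random instance, the tolerance, and all function names are illustrative assumptions.

```python
import numpy as np


def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A


def perturb(A, eps, rng):
    """Symmetric perturbation with (1-eps)A <= A_tilde <= (1+eps)A entrywise."""
    gamma = np.triu(rng.uniform(-1.0, 1.0, size=A.shape), 1)
    gamma = gamma + gamma.T
    return A * (1.0 + eps * gamma)


def is_psd(X, tol=1e-9):
    """Numerical PSD test for a symmetric matrix."""
    return np.linalg.eigvalsh(X).min() >= -tol


rng = np.random.default_rng(0)
W = np.triu(rng.uniform(0.0, 1.0, size=(6, 6)), 1)
A = W + W.T                      # random symmetric weights, zero diagonal
eps = 0.3
A_tilde = perturb(A, eps, rng)
L, L_tilde = laplacian(A), laplacian(A_tilde)

lower_ok = is_psd(L_tilde - (1.0 - eps) * L)   # L_tilde - L + eps*L
upper_ok = is_psd((1.0 + eps) * L - L_tilde)   # L - L_tilde + eps*L
```

Both checks pass for any admissible perturbation, since each difference is itself the Laplacian of a graph with non-negative weights.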

Approximating the normalized Laplacian
A natural question to ask is whether a result similar to Lemma 8 also holds for the normalized Laplacian. That is, does the following statement hold: for every δ > 0 there exists an ϵ > 0 such that, if A and Ã satisfy (1 − ϵ)A uv ≤ Ã uv ≤ (1 + ϵ)A uv for every u, v ∈ V , with corresponding Laplacian L = D − A, approximate Laplacian L̃ = D̃ − Ã, and corresponding normalized Laplacians given by
$$L_{\mathrm{norm}} = D^{-1/2} L D^{-1/2}, \qquad \tilde{L}_{\mathrm{norm}} = \tilde{D}^{-1/2} \tilde{L} \tilde{D}^{-1/2}, \tag{9}$$
then
$$(1 - \delta)\, L_{\mathrm{norm}} \preceq \tilde{L}_{\mathrm{norm}} \preceq (1 + \delta)\, L_{\mathrm{norm}} \, ? \tag{10}$$
(Note that for Lemma 8 we have ϵ = δ.) First, we observe that, if we choose to normalize the approximate Laplacian with the true weighted degree matrix D^{−1/2} rather than the approximate weighted degree matrix D̃^{−1/2}, then the above statement does hold (for ϵ = δ).
Lemma 9. Let A and Ã be adjacency matrices of a graph G = (V, E) with real-valued non-negative weights, and let ϵ > 0. If (1 − ϵ)A uv ≤ Ã uv ≤ (1 + ϵ)A uv holds for all u, v ∈ V , then the normalized Laplacian L norm = D^{−1/2} L D^{−1/2} and the approximate Laplacian normalized with the true degree matrix, L′ norm = D^{−1/2} L̃ D^{−1/2}, satisfy
$$(1 - \epsilon)\, L_{\mathrm{norm}} \preceq L'_{\mathrm{norm}} \preceq (1 + \epsilon)\, L_{\mathrm{norm}} . \tag{11}$$
Proof. Eq. (11) follows directly from Eq. (8), and the fact that, if X ⪰ 0, then X = B†B for some matrix B, and therefore
$$D^{-1/2} X D^{-1/2} = (B D^{-1/2})^{\dagger} (B D^{-1/2}) \succeq 0 .$$
As a consequence, for graphs for which we know the motif degrees exactly, we can use quantum counting to compute the matrix L′ norm , which in turn can be used for k-means spectral clustering with ϵ-approximate eigenvectors. Unfortunately, if we do not know the motif degrees exactly, then quantum counting does not offer any additional benefit over Algorithm 2, since using quantum counting to count all the degrees exactly has the same complexity as finding all motif instances.
Next, we will show that Eq. (10) does not hold in general. Specifically, Eq. (10) does not hold unless, coincidentally, all perturbations are such that D̃ = cD for some real-valued constant c. Consequently, we do not have the same guarantees for the approximate normalized Laplacian as we do for the approximate (unnormalized) Laplacian.
To this end, let 1 > δ > 0. We will show that, no matter how small we pick ϵ > 0, if L norm and L̃ norm are as described in and above Eq. (9), we do not have that (1 − δ)L norm ⪯ L̃ norm ⪯ (1 + δ)L norm . In particular, we will show that L̃ norm ⪰ (1 − δ)L norm does not hold regardless of how small we pick ϵ > 0.
We know that v = √D 1 is a 0-eigenvector of L norm , and likewise ṽ = √D̃ 1 is a 0-eigenvector of L̃ norm 15 . Now, let us first assume that the underlying graph for A is connected. In that case, v is (up to scaling) the only 0-eigenvector of L norm , and all other eigenvectors have an eigenvalue that is strictly positive. Consequently, as long as v and ṽ are not linearly dependent, which happens when the degree matrices D and D̃ are not constant multiples of each other, we have
$$\tilde{v}^{\dagger} \tilde{L}_{\mathrm{norm}} \tilde{v} = 0 < (1 - \delta)\, \tilde{v}^{\dagger} L_{\mathrm{norm}} \tilde{v} .$$
This implies that the first inequality above does not hold for general ϵ > 0, regardless of the choice of δ ∈ (0, 1), unless all perturbations are such that D̃ = cD for some constant c > 0.
If the graph is not connected, we can just restrict ourselves to a single connected component of A, and repeat the argument for that connected component.
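The counterexample can be made concrete on a very small graph. The sketch below is an illustration under assumptions of our own choosing (a three-vertex path with unit weights and a single perturbed edge; the helper names are hypothetical). It evaluates the two quadratic forms at ṽ = √D̃ 1, using the identity x^T D^{−1/2} L D^{−1/2} x = Σ A uv (x u /√d u − x v /√d v )^2.

```python
import math


def degrees(weights, n):
    """Weighted degree of every vertex of an undirected graph given as {(u, v): w}."""
    d = [0.0] * n
    for (u, v), w in weights.items():
        d[u] += w
        d[v] += w
    return d


def normalized_quadratic_form(weights, deg, x):
    """x^T D^{-1/2} L D^{-1/2} x, computed edge by edge."""
    return sum(w * (x[u] / math.sqrt(deg[u]) - x[v] / math.sqrt(deg[v])) ** 2
               for (u, v), w in weights.items())


# Path 0 - 1 - 2 with unit weights; only edge (0, 1) is perturbed, by a tiny eps.
eps = 0.01
weights = {(0, 1): 1.0, (1, 2): 1.0}
approx = {(0, 1): 1.0 + eps, (1, 2): 1.0}

deg = degrees(weights, 3)
deg_approx = degrees(approx, 3)  # not a constant multiple of deg

# v~ = sqrt(D~) 1 is a 0-eigenvector of the approximate normalized Laplacian.
v_tilde = [math.sqrt(d) for d in deg_approx]

q_approx = normalized_quadratic_form(approx, deg_approx, v_tilde)
q_true = normalized_quadratic_form(weights, deg, v_tilde)

print(q_approx, q_true)
```

Here `q_approx` vanishes while `q_true` is strictly positive, so (1 − δ) `q_true` > `q_approx` for every δ < 1, which is exactly the violated inequality.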
The above observation essentially rules out obtaining strong guarantees in the case of clustering on approximate normalized Laplacians. However, in the next section we provide numerical evidence to suggest that, in fact, we can actually use the approximate normalized Laplacian for clustering in practice.

Numerical simulations for the approximate normalized Laplacian
Even though we cannot obtain theoretical guarantees in the case of clustering via the normalized Laplacian when the weights of the graph are only known approximately, we give evidence to suggest that, in practice, the situation is similar to the unnormalized case. Recall that in the latter, the spectrum of the graph Laplacian is preserved up to small constant multiplicative error when the weights on the edges of the graph are perturbed by a similarly small constant multiplicative error (see Lemma 8), and thus spectral clustering will perform similarly on the original and the perturbed graph.
In what follows, we study empirically what happens to the quality of the clusters produced by spectral clustering on the normalized Laplacian when we perturb the edges of weighted, randomly generated graphs. In particular, we consider weighted undirected graphs G = (V, E) with edge weights {w e |e ∈ E}, and their 'perturbed' versions with edge weights {w ′ e |e ∈ E}, where each w ′ e is drawn uniformly at random from [(1 − ϵ)w e , (1 + ϵ)w e ] for some relative error ϵ ∈ [0, 1].
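The perturbation model described above can be sketched in a few lines. This is an illustrative sketch only: the function name `perturb_weights`, the dictionary representation of the edge weights, and the `seed` parameter are assumptions, not taken from the experiments reported here.

```python
import random


def perturb_weights(weights, eps, seed=None):
    """Draw each perturbed weight uniformly from [(1 - eps) * w, (1 + eps) * w].

    `weights` maps an edge (u, v) to its non-negative weight w_e.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("relative error eps must lie in [0, 1]")
    rng = random.Random(seed)
    return {edge: rng.uniform((1 - eps) * w, (1 + eps) * w)
            for edge, w in weights.items()}


# Usage on a hypothetical three-edge graph.
original = {(0, 1): 2.0, (1, 2): 0.5, (0, 2): 1.0}
perturbed = perturb_weights(original, eps=0.1, seed=1)
```

Since the weights stay non-negative and no edge is created, the perturbed graph has the same edge set as the original whenever ϵ < 1.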
We find that, in general, only large values of ϵ yield significant differences in the quality of clusters obtained by spectral clustering applied to the normalized Laplacian of the graph.
As an illustration, consider the graph shown in Figure 4, a commonly used test-case for demonstrations of clustering algorithms, which consists of two concentric circles of points embedded in R 2 . Here we added edges between nearby points, 16 with a weight that scales inversely proportional to their Euclidean distance, and then applied spectral clustering to the normalized Laplacian of the resulting graph. Only after introducing a relative error of ϵ = 0.6 did we find that the resulting clusters differed at all from those found on the original graph. We note that this graph was even handpicked to demonstrate that perturbations can qualitatively change the clusters obtained from spectral clustering; in fact, most graphs we generated were much more resilient to perturbations! This is almost certainly due to the fact that these graphs are 'well clusterable' in the sense that the two clusters are easily (albeit non-linearly) separable in a low-dimensional space.
To more precisely quantify the effect of relative errors on graph clustering algorithms, we consider the difference in conductance (see Section 3) achieved by spectral clustering on the original and perturbed graphs. More precisely, let {W i } k i=1 be the partition (i.e. set of clusters) output by applying spectral clustering to the original graph, and {W̃ i } k i=1 the partition found by applying it to the perturbed graph. Then we use the quantity
$$\phi_{\mathrm{diff}} = \phi_G(\tilde{W}_1, \ldots, \tilde{W}_k) - \phi_G(W_1, \ldots, W_k)$$
as a measure of the difference in quality between the two partitions. Note that the conductance for both partitions is computed relative to the original graph G (i.e. without perturbed weights). Since spectral clustering aims to minimize the conductance, this is the natural quantity to capture the difference in quality of two different partitions of the same graph. If the partitions output by spectral clustering on the perturbed graph are worse, then ϕ diff will be positive; if they happen to be better, it will be negative.
15 1 is a 0-eigenvector of the unnormalized Laplacian L, and then we have that L norm √D 1 = D^{−1/2} L D^{−1/2} D^{1/2} 1 = D^{−1/2} L 1 = 0.
16 More precisely, we generated data points in R 2 representing two concentric circles using the Scikit-learn Python library, before scaling them to remove the mean (i.e. set it to zero) and obtain unit variance. Edges were then added between points x and y if their Euclidean distance satisfied d(x, y) ≤ 0.6, and given a weight that scales inversely with d(x, y).
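To make the comparison of two partitions concrete, the sketch below evaluates both on the original (unperturbed) weights and returns the difference. It rests on an assumption that should be checked against Section 3: the conductance of a k-partition is taken here to be the sum over clusters of cut(W i , complement) / vol(W i ). The function names and the toy graph are likewise hypothetical.

```python
def conductance(weights, clusters):
    """Assumed k-way conductance: sum over clusters of cut(W_i) / vol(W_i).

    `weights` maps undirected edges (u, v) to weights of the ORIGINAL graph;
    `clusters` is a list of disjoint vertex sets covering all vertices.
    """
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    cut = [0.0] * len(clusters)
    vol = [0.0] * len(clusters)
    for (u, v), w in weights.items():
        vol[label[u]] += w
        vol[label[v]] += w
        if label[u] != label[v]:  # edge crosses between two clusters
            cut[label[u]] += w
            cut[label[v]] += w
    return sum(c / s for c, s in zip(cut, vol) if s > 0)


def phi_diff(weights, original_clusters, perturbed_clusters):
    """Positive if the partition found on the perturbed graph is worse."""
    return conductance(weights, perturbed_clusters) - conductance(weights, original_clusters)


# Two unit-weight triangles joined by the single edge (2, 3).
graph = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
         (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0, (2, 3): 1.0}
good = [{0, 1, 2}, {3, 4, 5}]
worse = [{0, 1}, {2, 3, 4, 5}]
difference = phi_diff(graph, good, worse)
```

In this toy case the natural partition cuts only the bridge edge, while the alternative cuts two triangle edges, so `difference` is positive.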
As test cases, we consider two types of random graphs: 'cluster' graphs, which are created by generating random points in R 2 centred around some number of fixed centres, and then adding edges between nearby points with weights that scale inversely proportional to the Euclidean distance between them; and so-called LFR-graphs [LFR08], a commonly used family of random graphs used to test clustering and community-detection algorithms. For all tests we set k = 5 (and for the cluster graphs, generated data centred around k = 5 fixed centres). Table 5 shows the average value of ϕ diff for a range of sizes of randomly generated graphs of both types for a fixed relative error of ϵ = 0.1. We find that, regardless of graph size, the perturbation has essentially no effect on the quality of clusters found. Next we consider the effect of increasing relative error on the quality of clusters found by spectral clustering. Figure 5 shows, for randomly generated LFR graphs, the effect of increasing the graph size n (left figure) and of increasing the relative error ϵ (right figure) on the value of ϕ diff . It is clear that the clusters do not become worse as n increases (for fixed ϵ = 0.1), but do become worse as ϵ increases (here for fixed n = 2000). This suggests that, as in the case of clustering using the unnormalized Laplacian, it suffices to choose a small, but constant relative error ϵ to obtain good quality clusters via the approximate normalized Laplacian.

Conclusion
We have presented three quantum algorithms that provide a speedup over classical methods for performing motif clustering. Our speedup relies on quantum routines for finding or approximately counting the number of motif instances in the to-be-clustered graph. In the case of approximate quantum counting, we show that approximations up to only a constant relative error are sufficient for motif clustering using the unnormalized Laplacian of the motif graph, which allows us to obtain a quantum speedup in many cases. This observation in fact holds more generally: if we perturb the weights of a graph with some constant relative error, then the graph can still be used to perform spectral clustering via the unnormalized Laplacian, which produces clusters whose RatioCut is close to the RatioCut that would have been obtained by performing spectral clustering on the unperturbed graph. It is interesting that the effect of the perturbation on the quality of the obtained clustering is independent of the size of the graph.
Our argument for the above claim fails, however, for normalized Laplacians: in this case, the spectrum of the Laplacian corresponding to the perturbed graph does not preserve the spectral structure of the unperturbed graph. However, when applied to randomly generated benchmark graphs, we find numerically that clustering using the normalized Laplacian of the perturbed graph does in fact generate clusters for which the conductance is close to the conductance of the clusters obtained by performing spectral clustering via the normalized Laplacian of the unperturbed graph, again independent of the size of the graph. An interesting open question would be to find out why, and under what conditions, clustering with the normalized Laplacian of the perturbed graph can be used to obtain a clustering with low conductance in the original graph.
In Appendix D, we discuss motif clustering of a graph G using a motif with more than two anchor nodes. In particular, we show that clustering with a motif with three anchor nodes is equivalent to clustering with a weighted combination of two-anchor-node motifs. We go on to argue that, for motifs M with four or more anchor nodes, a weighted combination of two-anchor-node motifs M 1 , . . . , M q , corresponding to the motifs obtained by taking all possible pairs of anchor nodes of M , should be used in place of M itself. The reason is that the motif graph G M constructed from G and M is itself a graph, and is therefore only capable of expressing pairwise relationships between vertices that occur as anchor nodes (which is exactly what the motifs M 1 , . . . , M q represent), and not higher-order relationships between sets of more than two vertices. If one wants the latter, then instead of constructing the motif graph G M , one should construct a hypergraph where the hyperedges represent the multi-vertex relationships expressed by M , and cluster on the hypergraph instead.
There are existing classical algorithms for clustering on hypergraphs: see for example [TMIY20] and references therein for a clustering method based on the so-called 'personalized pagerank', which itself is based on the stationary distribution of a random walk on the hypergraph. An interesting open question would be to investigate if quantum walks can provide a speedup for the hypergraph clustering algorithm of [TMIY20].
Acknowledgements
We would like to thank Ian Marshall for his active participation in the early stages of the project, and for the many insights he had during our regular meetings. We would also like to extend our gratitude to Simon Apers for his useful feedback and for pointing out that there are cases where not constructing the motif graph explicitly is the most efficient approach, as well as to Ronald de Wolf for reading an earlier version of this manuscript and providing feedback. In addition, we would like to thank Johan van Leeuwarden for clarifying a result in [JvLS19], and Ton Poppe and Edo van Uitert at ABN AMRO for fruitful discussions on potential applications of quantum computing in transaction network analysis. Finally, we would like to thank an anonymous reviewer for pointing out a flaw in our proof of Lemma 11 and suggesting a fix.

A Motif isomorphisms
When using Algorithms 3, 6 or 7, we count the number of tree walks with the property that the graph G restricted to the vertex set of the constructed tree has the same edge structure as the motif M . Since each tree walk is in one-to-one correspondence with a map ι : V M → V , this means we are actually counting motif assignments. However, when constructing the motif graph, given two vertices u, v ∈ V , we instead want to count the number of motif instances that have u and v as anchor nodes, and therefore we must know how to obtain the latter from the former 17 . As we will show in Lemma 10 below, every motif instance corresponds to a fixed number S M of motif assignments, where S M is the number of motif isomorphisms of M .
Proof. We need to show that (i) any two motif assignments that correspond to the same instance are related by a motif isomorphism f : V M → V M , and (ii) that for any motif assignment ι, ι ∘ f is equivalent to ι for any motif isomorphism f : V M → V M .
For (i), let ι and ι ′ be two motif assignments that correspond to the same motif instance. Then f = ι^{−1} ∘ ι ′ is a composition of graph isomorphisms, hence a graph isomorphism, and trivially we have ι ′ = ι ∘ f .
Similarly, if we are using Algorithms 6 or 7, which employ tree walks starting from two fixed anchor nodes (say a and b) to count motif assignments, rather than dividing by S M , we have to divide by the number S (a,b) M of motif isomorphisms that leave the two anchor nodes a and b fixed. Since Algorithms 6 and 7 assume the motif has two anchor nodes (which will be a and b), S (a,b) M only depends on the motif itself and can be computed in advance. Note that the group of motif isomorphisms as well as the subgroup of motif isomorphisms that leave two given anchor nodes fixed are both subgroups of the permutation group on s vertices. In particular, because we take s to be constant, so are S M and S (a,b) M , and therefore the overhead coming from the fact that we are counting motif assignments rather than motif instances is a constant multiplicative overhead that does not affect the complexity of Algorithms 3, 6 or 7.

B Motif graph cuts for two-anchor-node motifs
Indeed, following the proof technique from [BGL16], we get the following equalities:
where A, D and L = D − A are the adjacency matrix, degree matrix and Laplacian of G respectively. Note that the factor 4 that appears in the second-to-last line, to compensate for the factor 1/4, arises because the indicator function x takes values in {−1, 1} rather than {0, 1}.
C Pre-processing the input graph
All our algorithms require coherent access to the adjacency lists of the input graph. Moreover, the algorithms based on quantum approximate counting in Section 6.3 assume that the adjacency lists of the input graph are sorted. If any of these conditions are not met, then we have to pay additional costs up front to pre-process the input graph, meaning that we have to either load the graph to QRAM, sort all adjacency lists, or do both. In this section we discuss what effect pre-processing the input graph has on the run-times of the algorithms presented in this work. Let G be the n-vertex input graph to which we have adjacency list access, with maximum degree d. Let M be the s-vertex motif that we want to use to cluster G. If we have classical access to the input graph, then we first have to load it to QRAM in time O(nd). If we have coherent access to the input graph, but the adjacency lists are not sorted, then we have two options: sort all adjacency lists beforehand in Õ(nd) time, or keep the lists unsorted, in which case finding an element in an adjacency list takes O(d) classically, or Õ(√d) using Grover search.
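The classical side of this trade-off can be illustrated with a short sketch. It only covers the classical options (one-off sorting followed by binary search, versus a linear scan of an unsorted list); the Grover-based Õ(√d) lookup is not modelled. The function names and the dictionary-of-lists representation are assumptions made for illustration.

```python
from bisect import bisect_left


def sort_adjacency_lists(adj):
    """One-off pre-processing: sort every adjacency list (O(d log d) per vertex)."""
    return {u: sorted(neighbours) for u, neighbours in adj.items()}


def has_edge_sorted(sorted_adj, u, v):
    """Binary search for v in the sorted list of u: O(log d) comparisons."""
    neighbours = sorted_adj[u]
    i = bisect_left(neighbours, v)
    return i < len(neighbours) and neighbours[i] == v


def has_edge_unsorted(adj, u, v):
    """Linear scan of the unsorted list of u: O(d) comparisons."""
    return v in adj[u]


# Hypothetical input graph given by unsorted adjacency lists.
adj = {0: [3, 1, 2], 1: [0, 2], 2: [1, 0], 3: [0]}
sorted_adj = sort_adjacency_lists(adj)
```

Sorting all n lists once costs O(nd log d) in total, which matches the Õ(nd) pre-processing cost quoted above up to the logarithmic factor hidden in the Õ notation.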
Recall that s ≥ 3, as s = 2 implies the motif is an edge, in which case there is no point in doing motif clustering. Also, the distance l (number of edges) between any two anchor nodes satisfies l ≥ 1.
Classical. The classical algorithm of Benson et al. [BGL16] can afford to pay Õ(nd) up front to sort the adjacency lists, as this always takes less time than finding all motif instances. Hence, even when pre-sorting the adjacency lists is included, the total run-time of the entire classical k-means motif spectral clustering algorithm remains unchanged.
Quantum via Grover search. Given classical access, we have to pre-process the input graph beforehand in time Õ(nd), resulting in an expected run-time of
If we are given coherent access, and the adjacency lists are sorted, then we have an expected run-time of Õ
If we have coherent access but the adjacency lists are unsorted, and we choose not to sort them in advance but use Grover search instead to check if and where a given node in a given motif instance occurs in the adjacency list of another node in the motif instance (at the cost of an extra factor of Õ(√d)), then the total expected run-time becomes
Quantum via approximate counting and classical spectral clustering. Since s ≥ 3 and l ≥ 1, we have that l + s/2 − 1 > 1, and therefore pre-processing the input graph in time Õ(nd) does not affect the complexity of this version of quantum motif clustering. The expected run-time of the algorithm including pre-processing remains O(n d^{l + s/2 − 1} + T k-means ).
Quantum via approximate counting and quantum spectral clustering. As before, we can pre-process the input graph in time Õ(nd) without changing the run-time of the algorithm: indeed, since s ≥ 3 and d ≤ n, we have nd ≤ √(n^3 d) ≤ √(n^3 d^{s−2}). Therefore, the expected run-time of this algorithm is unchanged when pre-processing is included.

D Higher-order motifs
Next, we consider more closely the role of anchor nodes. Recall from Section 3.2 that anchor nodes in the motif are the nodes that determine which graph cuts count as a motif cut and which do not. For motifs with two anchor nodes, the motif itself expresses a pairwise relation between both of its anchor nodes, and motif clustering can be thought of as clustering the original graph after applying some kind of filter to it: a filter that removes all connections except those that fit the motif pattern, see Fig 1. We can also perform motif clustering for motifs with more than two anchor nodes, and as [BGL16] show, doing so makes sense also for motifs with three anchor nodes. At first, it seems that a motif with three anchor nodes expresses relationships between three vertices. However, by construction the motif graph is still a graph, which captures only pairwise relationships. This begs the question: what is the interpretation of clustering using a motif with three anchor nodes? In the sections that follow, we address this question in detail.

D.1 Multiple motifs
Recall from the construction of the motif graph that we add +1 to the weight of every edge (u, v) of the motif graph G M for every motif instance in G with u and v as anchor nodes.
This suggests that the motif graph obtained from a motif with multiple anchor nodes can be seen as a sum of two-anchor-node motif graphs: one for each pair of anchor nodes in the original motif. In the next subsection, we will argue that clustering using a motif with three anchor nodes is equivalent to clustering based on a combination of motifs with two anchor nodes. Before we can make this statement precise, we first follow the supplementary material of Benson et al. [BGL16] in order to explain what it means to use motif clustering based on a collection of motifs 18 .
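Given motifs M 1 , . . . , M q , coefficients α 1 , . . . , α q ∈ R such that α j > 0 for all j ∈ [q], and a vertex subset W ⊂ V , we can consider weighted motif cuts
$$\mathrm{cut}_{(G,\{M_j\},\{\alpha_j\})}(W) = \sum_{j=1}^{q} \alpha_j \, \mathrm{cut}_{(G,M_j)}(W)$$
and the weighted motif volume
$$\mathrm{vol}_{(G,\{M_j\},\{\alpha_j\})}(W) = \sum_{j=1}^{q} \alpha_j \, \mathrm{vol}_{(G,M_j)}(W) .$$
We can also define the corresponding weighted motif conductance and weighted motif ratio cut by replacing the motif cut and motif volume in the single-motif definitions with their weighted counterparts. Both naturally extend to partitions W 1 , . . . , W k of V in the same way as for a single motif.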
Now, if we are interested in finding partitions of the vertex set V of G that approximately minimize the weighted motif conductance or weighted motif ratio cut in G with respect to weights {α j } q j=1 and motifs {M j } q j=1 then, as before, we can instead minimize ordinary conductance or ordinary ratio cut of the weighted motif graph defined below, as long as the motifs M j either all have two anchor nodes -the case we will use -or all have three anchor nodes.
In order to construct the weighted motif graph, we construct the motif graph G j with motif adjacency matrix A j for every j ∈ [q], and then take a weighted linear combination
$$A_{\Sigma} = \sum_{j=1}^{q} \alpha_j A_j$$
to construct a single weighted motif graph G Σ that combines all motifs according to the weights α j (which determine the relative importance of each motif M j ). Writing N A for the number of anchor nodes that all motifs have (recall that they should all have the same number of anchor nodes, either two or three), weighted motif cuts in the original graph can be directly related to ordinary cuts in the weighted motif graph. Specifically, given a subset W ⊂ V of the vertex set V of G, we have from Lemma 1 that
$$\mathrm{cut}_{(G,\{M_j\},\{\alpha_j\})}(W) = c \cdot \mathrm{cut}_{G_{\Sigma}}(W),$$
where c = 1 if N A = 2, and c = 1/2 if N A = 3, and
$$\mathrm{vol}_{(G,\{M_j\},\{\alpha_j\})}(W) = \frac{1}{N_A - 1} \, \mathrm{vol}_{G_{\Sigma}}(W) .$$
Consequently, we can apply spectral clustering to the normalized or unnormalized Laplacian of G Σ , respectively, to obtain a partition of V that approximately minimizes either the weighted motif conductance or ratio cut.
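The weighted linear combination of motif adjacency matrices, and the fact that cuts in the combined graph are the same weighted combination of the individual cuts, can be sketched as follows. The sparse dictionary representation, the function names and the two small example matrices are assumptions made for illustration; the motif adjacency matrices themselves would come from the motif-counting step.

```python
def weighted_motif_adjacency(adjacencies, alphas):
    """Combine motif adjacency matrices A_j (as {(u, v): weight}) into sum_j alpha_j * A_j."""
    combined = {}
    for A_j, alpha in zip(adjacencies, alphas):
        if alpha <= 0:
            raise ValueError("all coefficients alpha_j must be positive")
        for edge, w in A_j.items():
            combined[edge] = combined.get(edge, 0.0) + alpha * w
    return combined


def cut_weight(adjacency, subset):
    """Total weight of edges with exactly one endpoint in `subset`."""
    return sum(w for (u, v), w in adjacency.items() if (u in subset) != (v in subset))


# Two hypothetical motif adjacency matrices on vertices 0..3 (edges stored with u < v).
A_1 = {(0, 1): 2.0, (1, 2): 1.0}
A_2 = {(1, 2): 3.0, (2, 3): 1.0}
alphas = [1.0, 0.5]

A_sigma = weighted_motif_adjacency([A_1, A_2], alphas)
W = {0, 1}
lhs = cut_weight(A_sigma, W)
rhs = sum(a * cut_weight(A, W) for a, A in zip(alphas, [A_1, A_2]))
```

Because the cut weight is linear in the edge weights, `lhs` and `rhs` agree, which is why a single spectral clustering run on the combined graph suffices.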

D.2 Motifs with more than two anchor nodes
Having introduced weighted linear combinations of motifs, we next turn to motifs with more than two anchor nodes. Let G be a graph, M = (V M , E M , V A ) be a motif with |V A | > 2 anchor nodes, and let G M be the corresponding motif graph 19 with adjacency matrix A M . In this section we will show that the motif graph G M constructed from G and M is equal to the motif graph obtained by taking a weighted sum of all two-anchor-node motifs contained in M , with the weights chosen as described below. Subsequently, we will use this result to show that motif clustering using a three-anchor-node motif is equivalent to clustering using a combination of two-anchor-node motifs.
D.2.1 The motif graph for motifs with more than two anchor nodes
Let A 2 (M ) denote the set of two-anchor-node motifs (V M , E M , {u, v}) with u, v ∈ V A , containing one representative of each motif-isomorphism class. Next, for every two-anchor-node motif K ∈ A 2 (M ), we define the weight ω K as follows. Let u, v ∈ V M be the two anchor nodes of K. Now, ω K is given by the number of ways the remaining |V A | − 2 anchor nodes (i.e. all anchor nodes except {u, v}) can be assigned to the graph (V M , E M ) such that the motif structure of M is respected. In other words: ω K is the number of motif instances ι of M in the graph (V M , E M ) for which {u, v} ⊆ ι(V A ). An example of a motif and the corresponding two-anchor-node weights is given in Fig. (8). Now, define the weighted motif graph G A 2 (M ) obtained by taking the weighted sum of all two-anchor-node motifs K ∈ A 2 (M ) weighted by ω K . Its adjacency matrix is given by
$$A_{A_2(M)} = \sum_{K \in A_2(M)} \omega_K A_K ,$$
where for each K ∈ A 2 (M ), A K is the adjacency matrix of the motif graph obtained from G and the two-anchor-node motif K.
Figure 8 (caption): For the middle two-anchor-node motif, the remaining anchor node (bottom left in M ) can be mapped to either of the two bottom nodes of (V M , E M ), as both mappings respect the motif structure, so the weight is 2. For the other two-anchor-node motifs, there is only one vertex that the remaining anchor node can be mapped to such that the motif structure is preserved.
Note that A A 2 (M ) is independent of the choice of representative from each motif isomorphism class 20 . In the example of Figure 7, the adjacency matrix A M 2 obtained by finding all instances of M 2 in G is the exact same matrix as A M 3 obtained by finding all instances of M 3 in G, and therefore (in this case, where all weights are equal to one),
To show that A M = A A 2 (M ) , consider a set V ′ = {v 1 , . . . , v s } of s distinct vertices of G, and let G ′ be the graph G restricted to V ′ . In order to compute what G ′ contributes to A M , we need to (1) find all motif instances of M in G ′ , and (2) for each motif instance, add +1 to the weight of each edge connecting two anchor nodes of the motif instance in question. We then do the same for all motifs that make up A A 2 (M ) , and check that the two contributions are equal.
There are two options, either there is a vertex assignment ι : V M → V ′ such that ι is a graph isomorphism from (V M , E M ) to G ′ , or no such assignment exists. In the latter case, G ′ contributes nothing, hence equally, to both A M and A A 2 (M ) , since all motifs in A 2 (M ) have the same edge pattern as M does. Thus, we need to only focus on the former case, for which G ′ is graph-isomorphic to (V M , E M ), and we can identify V ′ ≃ V M .
Assuming G ′ ≃ (V M , E M ), we now need to show that for every vertex pair u, v ∈ V ′ , (i) the contribution of all motif instances of M in G ′ to (A M ) uv is equal to (ii) the sum of contributions of motif instances in G ′ of all motifs K ∈ A 2 (M ) to (A K ) uv weighted by their weights ω K . Hence, pick a pair u, v ∈ V ′ ≃ V M . If no motif instance of M exists for which u and v are anchor nodes, then neither will a motif instance of any K ∈ A 2 (M ) exist that has u and v as anchor nodes, and the contributions (i) and (ii) will both be zero, hence equal.
20 By symmetry, different representatives have the same weight.
If, on the other hand, a motif instance of M exists for which u and v are anchor nodes, then (i) is given by the number of motif instances ι in G ′ ≃ (V M , E M ) for which both u and v are anchor nodes (i.e. u, v ∈ ι(V A )). By definition, this number is exactly equal to ω K̃ , where K̃ is the two-anchor-node motif K̃ = (V M , E M , {u, v}). Because A 2 (M ) contains one motif of each motif-isomorphism class of two-anchor-node motifs of M , there is exactly one motif in A 2 (M ) (either K̃ itself, or one of the isomorphic two-anchor-node motifs that we pick as representative) that has a motif instance in (V M , E M ) of which u and v are anchor nodes, and this motif has weight ω K̃ . Therefore G ′ contributes the same, namely ω K̃ , to both (A M ) uv and (A A 2 (M ) ) uv = Σ K∈A 2 (M ) ω K (A K ) uv .
Since this analysis holds for every vertex pair u, v ∈ V ′ , we conclude that G ′ contributes equally to A M and A A 2 (M ) . Because the above analysis holds for every s-sized set of distinct vertices of G, we conclude that the adjacency matrices A M and A A 2 (M ) are equal, and therefore G M = G A 2 (M ) .
As a consequence, if we want to construct the motif graph G M for any given motif M , we can instead construct the motif graphs A K for every K ∈ A 2 (M ), and then sum the resulting motif adjacency matrices weighted with ω K to obtain A M .
We can also do the above approximately. That is, for some fixed ϵ > 0, if we have for every K ∈ A 2 (M ) an approximation Ã K of A K such that for every u, v ∈ V
$$(1 - \epsilon)(A_K)_{uv} \le (\tilde{A}_K)_{uv} \le (1 + \epsilon)(A_K)_{uv} ,$$
then by summing over K ∈ A 2 (M ) weighted with ω K , we will obtain an approximation Ã M of A M that satisfies
$$(1 - \epsilon)(A_M)_{uv} \le (\tilde{A}_M)_{uv} \le (1 + \epsilon)(A_M)_{uv} .$$
In particular, if each Ã K were constructed via quantum approximate counting with relative error ϵ, then the sum Ã M = Σ K∈A 2 (M ) ω K Ã K will approximate (coefficient-wise) the motif adjacency matrix A M also up to relative error ϵ. 21
21 If we use this in the context of Section 6, then to ensure that the entire procedure succeeds with probability at least 1 − δ, for given K and given u, v ∈ V we need to run Algorithm 4 with success probability 1 − δ/(|A 2 (M )|n^2) (rather than 1 − δ/n^2) in line 7 of Algorithm 4, which yields an extra factor of log(|A 2 (M )|) to both the query count and the number of other operations in Lemma 7.
D.2.2 Three-anchor-node motifs as a combination of two-anchor-node motifs
Using the results from the previous subsection, we can reduce the case of clustering with a three-anchor-node motif to that of clustering using three separate two-anchor-node motifs.
Let G be a graph, M a three-anchor-node motif with corresponding motif graph G M , and let A 2 (M ) and G A 2 (M ) be as in the subsection above. Now, for any subset W ⊂ V , we have
$$\begin{aligned}
\mathrm{cut}_{(G,M)}(W) &= \tfrac{1}{2}\, \mathrm{cut}_{G_M}(W) \\
&= \tfrac{1}{2}\, \mathrm{cut}_{G_{A_2(M)}}(W) \\
&= \tfrac{1}{2} \sum_{K \in A_2(M)} \omega_K \, \mathrm{cut}_{G_K}(W) \\
&= \tfrac{1}{2} \sum_{K \in A_2(M)} \omega_K \, \mathrm{cut}_{(G,K)}(W) ,
\end{aligned}$$
where in the first line we used Lemma 1 and the fact that M has three anchor nodes (hence the factor 1/2), in the second we use the fact that G M = G A 2 (M ) , in the third we use that G A 2 (M ) is a weighted sum of all two-anchor-node motifs K ∈ A 2 (M ), and in the last line we again use Lemma 1 for each two-anchor-node motif K. Similar to Eq (16), and using the exact same derivation but now with 'cut' replaced by 'vol', we find that the motif volumes are related by:
$$\mathrm{vol}_{(G,M)}(W) = \tfrac{1}{2}\, \mathrm{vol}_{(G,A_2(M))}(W) .$$
Consequently, we have ϕ (G,M ) (W ) = ϕ (G,A 2 (M )) (W ) and RatioCut (G,M ) (W ) = 1/2 RatioCut (G,A 2 (M )) (W ) .
In conclusion, performing motif clustering on any three-anchor-node motif M is equivalent to performing motif clustering on the ω K -weighted combination of all motifs in A 2 (M ) obtained by considering all possible pairs of anchor nodes of M , modulo motif isomorphisms.

D.3 Motifs with more than three anchor nodes, and hypergraphs
For motifs M with more than three anchor nodes, the motif graph G M is still equal to the motif graph G A 2 (M ) corresponding to the ω K -weighted sum of all two-anchor-node motifs in A 2 (M ). As a consequence, we can apply spectral clustering to the motif graph G A 2 (M ) to obtain a k-partition {W 1 , . . . , W k } of the vertex set V that approximately minimizes RatioCut (G,A 2 (M )) (W 1 , . . . , W k ) or ϕ (G,A 2 (M )) (W 1 , . . . , W k ) of the ω K -weighted sum of all motifs in A 2 (M ).
However, what no longer holds is that the above RatioCut and conductance are proportional to the RatioCut and conductance corresponding to the motif M . In other words: if M = (V M , E M , V A ) is a motif with |V A | > 3, then there does not exist in general a constant c > 0 such that for all subsets W ⊂ V we have that ϕ (G,M ) (W ) = c ϕ (G,A 2 (M )) (W ), nor does there exist a constant c ′ such that RatioCut (G,M ) (W ) = c ′ RatioCut (G,A 2 (M )) (W ). The reason is that, for a given motif instance, the factor of proportionality depends on how exactly the motif instance is cut.
For example, if the motif has four anchor nodes, then we see in Fig 10 that a cut in G that separates one anchor node from the rest in a given motif instance adds +3 to the corresponding cut in G M , but a cut through the middle that separates two anchor nodes from the rest adds +4 to the corresponding cut in G M . In contrast, both cuts contribute +1 to cut (G,M ) (W ) in the original graph G.
The above discrepancy is the reason why, in Theorem 9 of their supplementary material, Benson et al. [BGL16] subtract the sum of all motif instances for which the anchor node set is cut exactly in half by the cut in the graph. Note that, for a motif with three anchor nodes, any cut separates one anchor node from two other ones, and therefore the issue that arises with motifs with four anchor nodes is not present there.
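The counting behind this discrepancy is easy to reproduce. The sketch below is an illustration under a simplifying assumption: in a single motif instance, every pair of anchor nodes contributes one unit of weight to the motif graph, so the contribution of a cut to cut G M is the number of anchor pairs it separates. The function names are hypothetical.

```python
from itertools import combinations


def separated_anchor_pairs(num_anchors, side):
    """Number of anchor pairs {a, b} with exactly one of a, b in `side`."""
    return sum((a in side) != (b in side)
               for a, b in combinations(range(num_anchors), 2))


def possible_contributions(num_anchors):
    """All values a single cut motif instance can add to the cut in the motif graph."""
    values = set()
    for k in range(1, num_anchors):  # proper, non-empty splits of the anchor set
        for side in combinations(range(num_anchors), k):
            values.add(separated_anchor_pairs(num_anchors, set(side)))
    return values


two = possible_contributions(2)
three = possible_contributions(3)
four = possible_contributions(4)
```

For two and three anchor nodes the contribution is a single constant (1 and 2 respectively), which is why a fixed proportionality factor exists in those cases; for four anchor nodes both 3 and 4 occur, so no single constant can relate the two cuts.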
To conclude, there are two ways of viewing a motif: one is that the motif expresses a single multiple-vertex relationship between all of its anchor nodes; the other that the motif represents the combination of all pairwise relationships between its anchor nodes. Because the motif graph can only represent pairwise relationships, clustering using the motif graph only works if we take the latter viewpoint.
More explicitly, for motifs with two anchor nodes, and also for arbitrary weighted combinations of motifs with two anchor nodes, we can do motif clustering by means of the motif graph. However, for any motif M with more than two anchor nodes, clustering using the motif graph