Distributing Multipartite Entanglement over Noisy Quantum Networks

A quantum internet aims at harnessing networked quantum technologies, namely by distributing bipartite entanglement between distant nodes. However, multipartite entanglement between the nodes may empower the quantum internet for additional or better applications for communications, sensing, and computation. In this work, we present an algorithm for generating multipartite entanglement between different nodes of a quantum network with noisy quantum repeaters and imperfect quantum memories, where the links are entangled pairs. Our algorithm is optimal for GHZ states with 3 qubits, maximising simultaneously the final state fidelity and the rate of entanglement distribution. Furthermore, we determine the conditions yielding this simultaneous optimality for GHZ states with a higher number of qubits, and for other types of multipartite entanglement. Our algorithm is general also in the sense that it can optimise simultaneously arbitrary parameters. This work opens the way to optimally generate multipartite quantum correlations over noisy quantum networks, an important resource for distributed quantum technologies.


Introduction
Quantum technologies hold the promise of faster computing, securer private communications [1][2][3], and more precise sensing and metrology [4,5]. Quantum networks open the possibility to explore these applications in distributed scenarios, allowing for increased performance and/or tasks involving multiple parties. In fact, quantum networks where the links correspond to quantum entanglement between the nodes, can be thought over long distances, e.g. inter-city, such as in the case of a quantum Internet [6,7], or locally, e.g. inside a laboratory or a quantum local-area network (QLAN) [8].
The immediate question that arises on a quantum network is how to optimally generate bipartite entanglement between two user-nodes, as a function of the relevant metrics (fidelity, rate,...). This encompasses selecting the appropriate protocols for entanglement generation and entanglement swapping [9], and algorithms that find the optimal way to execute them over a quantum network [10][11][12][13][14][15][16].
In this work we aim at finding the optimal way to distribute multipartite entanglement in noisy quantum networks, under a given distribution scheme. This has particular relevance for applications where noise and the distribution of the state [34] impacts the application itself. To that end, we introduce a new methodology that allows to maximize two different objectives -the rate of distribution and the fidelity of the distributed state -even though our approach is easily generalizable to include more. We develop an algorithm with tools from classical routing theory [35,36] that finds the optimal way of distributing a 3-qubit GHZ state, providing that the metrics that describe its distribution follow a set of properties, which we determine. We also find the conditions that yield optimality of our algorithm when considering a higher number of qubits, under one of the possible schemes. Moreover, our methodology is adaptable to different underlying physical implementations of the constituents of a quantum internet and its different stages of development [6].

Quantum Network Description
Let us first describe our model of a quantum network. Structurally, it is characterised by a graph G(V, E) where each node is denoted by a letter j ∈ V and the link connecting nodes i and j is denoted by i : j ∈ E. Each path is a sequence of links and we identify them by their initial and final node m : n. Note that even though we only use the first and last node to represent a path, it is just a matter of notation, as paths are concatenations of contiguous links, and there usually exist different paths between any two nodes. The set of nodes T ⊂ V we want to distribute the state to is called the terminal set, where |T| = T is the number of terminal nodes, or equivalently, the number of qubits of the distributed state. Each link is a noisy quantum channel, accompanied by a classical channel for signaling success and sending corrections, connecting neighboring nodes which individually hold qubits in imperfect quantum memories . Each quantum channel is capable of establishing entanglement between its two nodes and each quantum node must be able to realize one and two-qubits unitary gates, as well as performing measurements on its qubits [6,31,33,37] . The protocols for entanglement generation and entanglement swapping are considered to be probabilistic, following a geometric distribution in the number of trials before the first success [37]. Moreover, each entangled pair of the network is modelled by a Werner state [38] ρ W = γ |φ + φ + | + (1 − γ)/4 1, where |φ + ∝ |00 + |11 is a maximally entangled state, and with γ = (4F −1)/3, where F is the fidelity of the state. In the bipartite case, there is an equivalence between a Werner state and a representation with a depolarising channel having an equal amount of bit-flip, phase-flip and phase-bit-flip errors: ρ W = D F 1 (|φ + φ + |) = D F 2 (|φ + φ + |). This quantum channel is described in two alternative forms, one of them using the partial transposition operator Λ i on qubit i: (1) where the parameter p controls the amount of error in the state andX i ,Ŷ i ,Ẑ i are the usual Pauli gates acting on qubit i. This equivalence is significant to find the final form of the distributed state. In addition, this trace-preserving map has important properties, among them linearity on the main argument ρ, commutativity for every index i, and invariance under unitary operations applied to single qubits [39].

Distribution of Entanglement
Distributing entanglement is the building block for an entanglement-based quantum network. There are various schemes for doing such, whether be it in a bipartite scenario, between two parties, usually in the form of a Bell pair, or in the multipartite case, whereas there are more than two parties and multiple possible states, not all equivalent nor with identical applications. We start by first describing the case for bipartite entanglement, as it is often the first step of possible schemes for the multipartite case, for example the star scheme, which we detail next for the case of a 3-Qubit GHZ state. From [31], we make use of the protocol for distributing this state, that minimizes the number of consumed entangled pairs, and adapt it to our multi-objective optimization. In Appendices D and E we extend these results for a different scheme of distribution [31] and W states, respectively.
In both cases we identify metrics that attribute meaningful weights to a certain property of the distribution across the network. We do this in a way that guarantees each metric follows a set of properties (detailed in a Section 4), so to en-sure that our algorithms converge to the optimal solution.

Distribution of Bipartite Entanglement
To describe the metrics involved in distributing Bell pairs across a chain of quantum links, i.e. a path, we must concern the protocols to do so. These protocols are those of entanglement generation between neighboring nodes and entanglement swapping to extend the range of the entanglement. They will impact four chosen properties of the network: (i) the probability of success and (ii) communications times, which are self-explanatory, the (iii) fidelity of the state, a measure of how much noise exists in a quantum state, and finally the (iv) memory coherence time, a characteristic of a quantum memory that measures how much time can a qubit be in the memory before the state is useless due to noise. Note that the individual protocols for the entanglement generation in each of the links can be considered as a separate problem. The values for the probability of success, for the communication times and for the fidelity for each link are results from the entanglement generation protocols used. This makes each link agnostic from the physical implementation of the protocol. Nonetheless, we require a characterization of the protocol in these three quantities which are the input for the algorithms.
Starting with the probability of success, denote by p i:j the probability of successful entanglement generation between neighboring nodes i and j at the first attempt and by k j the probability of successful entanglement swapping on node j also at the first attempt. Considering every stochastic process to be independent, the probability of success of generating end-to-end entanglement across a chain of quantum nodes will also be geometrically distributed with a probability of success at the first attempt given by: (2) The classical communications time will also play an important role in the rate and each branch fidelity. We denote by t i:j = L i:j /c the time it takes to send a classical message between neighboring nodes i and j, distanced by L i:j . In our simplified model, there are two rounds of classical communication across the chain: i) one for communicating successful entanglement generation across every link of the chain, and ii) another to communicate successful entanglement swapping in every node of the chain, with correspondent corrections. This takes two times the sum of each individual time across the chain to complete each successful round: As for the fidelity across a chain after the entanglement swapping was completed, taking advantage of the γ change of variables, for Werner states, described in Section 2 where γ = (4F − 1)/3 [38] , results in a simple multiplication of each γ value along the chain. Note that this fidelity can also be affected by noisy operations required for performing entanglement swapping. In this case, one can include the errors from these operations directly in the fidelity of the entangled pairs via a mathematical trick [40], reducing this problem to simply a new value for the fidelity of each link as before. When using γ we have a threshold value at 1/3 (correspondent to F = 1/2), meaning that below this threshold, entanglement does no longer exist and the path cannot be used: In order to include the quantum memory coherence time, similarly to what has been done in [37,41], we adapted the contributions to our simpler scheme. Consider that, for Bell states, the effect of memory decoherence in the fidelity of the state verifies γ → γe −t wait /σ , where σ is the quantum memory coherence time. In the first round, if the generation is successful, entanglement is stored between every neighboring node of the chain. Therefore, the memory decoherence factor of each node, apart from the first and the last, contributes twice to the final state distribution. On the second round, if the swapping is successful, the entanglement is held only within the first and the last node. This results in every node contributing the same to the decoherence factor, given that the communications time, t wait , is identical: These four parameters can then be contracted to only three: [p m:n , t m:n , γ m:n , σ m:n ] → [p m:n , t m:n , F m:n ], (6) subject to (4F m:n − 1)/3 = γ m:n e −tm:n/σm:n . The final parameters in Eq. 6 are then the input of the multipartite metrics that characterize the fidelity and rate of the final multipartite state. In Fig.1 we present a scheme of each of the detailed parameters, and how they affect the final metrics. Nonetheless, the four parameters that are the input of Eq. 6 must be routed separately for the optimal paths (and consequently the optimal star) to be found, under appropriate algorithms.

Distribution of 3-Qubit GHZ states
Moving on to the multipartite case, let us consider the distribution of a GHZ state with three qubits. Similarly to what has been done in [31], to distribute GHZ states one would have to first find the Steiner tree connecting the set of terminal nodes and then perform some measurements over the Steiner nodes. The so called Steiner tree is the shortest-tree connecting the terminal nodes, where a tree is a graph with no cycles or, equivalently, only one path connecting any two nodes. We start with the case of 3 qubits, and later generalise for an higher number of qubits. The reason is that the Steiner tree connecting 3 terminal nodes is always a star-graph making the scheme equivalent to first distributing bipartite entanglement between every terminal and a center node and then performing a set of operations on the center node. Thus, the central idea is finding the center node for which this results in the optimal solution. To define optimality, we again concern the metrics for the distribution of this entangled state, namely the rate of distribution and the final state fidelity.
Consider that the center node, c, is capable of coordinating every process. There is an initial phase to signal the beginning of bipartite entanglement distribution, followed by the phase of creating bipartite entanglement in every branch connecting to the center node and finally performing the set of operations that result in the desired state distribution. Following [10], we can provide a measure for the average rate of distribution, ξ tree , considering the different steps in this scheme. This average rate is given by the inverse of the average time, t tree , it takes to complete the distribution. Considering a scheme of attempts until the first success, if any branch fails to create bipartite entanglement, every branch starts over, the final distribution of the probability of the first success will follow a geometric distribution. In such case, the average time it takes to completely distribute the multipartite entangled state is the communications time multiplied by the average number of attempts before the first success: where p c:τ and t c:τ correspond, respectively, to the probability of success and communications time of the path c : τ . If the operations performed at each Steiner node are not deterministic, this time will increase by multiplying the expected value of the stochastic process that characterizes the global set of operations. Moreover, one could also pursue an approach similar to the one in [42] to derive an average rate for the case where the center node waits for each path to conclude generating end-to-end entanglement. The fidelity can be calculated by applying a depolarising channel to each terminal node substituting the p value in Eq. 1 by the fidelity F c:τ ≡ F τ , of each branch connecting the center node c to the terminal τ . Using the second description of the depolarising channel, the result is: Under Eqs. 7-8, we can map the |T| paths' parameters to the tree's rate of distribution, ξ tree , and fidelity of the final state, f tree : These metrics output the values for the fidelity of the distributed state and rate of the distribution. They depend necessarily on the network, and on the parameters of the individual links and nodes, which are a result of the physical implementation. Therefore, the fidelity and the rate at which one can distribute GHZ states across the network depends on the underlying physical implementation, the chosen distribution protocols and parameters of noise of the setup. Nonetheless, as will be seen ahead, the routing algorithm only takes the parameters, the metrics, and outputs the optimal solutions for the distribution.

Algorithms for Optimal Multipartite Entanglement Distribution
Now that we can characterize both the rate of distribution and fidelity of the final state, we need to find the optimal star. Here, the definition of optimal is crucial. In the case we only deal with one parameter, e.g. the fidelity, there only exists one optimal solution, unless several solutions have the same fidelity. However, if we want to consider more parameters to optimize, a multiobjective approach is necessary. In [43], multiobjective shortest-paths (MOSP) algorithms are introduced by defining the dominance relation and using the Pareto optimality definition, as we will explain further ahead. This results in a set of optimal solutions that is usually larger than onethe Pareto front. The same relation is also valid and central for finding the optimal way to distribute a 3-qubit GHZ state. In [35,44], a more fundamental definition for these metrics for routing problems is detailed using algebras for routing, which we use inadvertently to refer to metrics as well. Definition 1. Algebra for Routing is an ordered septet (W, , L, Σ, φ, ⊕, f ) comprised as follows: W a set of weights, a total order, L a set of labels, Σ a set of signatures, φ a special signature, ⊕ a binary operation that maps pairs of labels and signatures into a signature, and a function f that maps signatures into weights.
Intuitively, these algebras are a mathematical object that transports each important aspect of routing, onto a mathematical object related with each concept. Namely: the weight paths can have belong to W , to each link corresponds a label in L, to each path corresponds a signature in Σ, binary operations ⊕ are used to define how paths are extended and total-orders to define the ordering of the parameters, i.e., which one is the better. There is also a special signature φ correspondent to the impossible path, important for disregarding some paths. Using this definition, some properties can be defined [35], guaranteeing optimality when finding the shortest-path with an appropriate algorithm (for intuition refer to

Definition 3. (Isotonicity) an algebra for routing is called isotone if
Given the multi-objective scenario analyzed, the dominance relation becomes essential. This comes from the fact that, while one path might be better for some parameters, another path might be better for other parameters. Given a set of algebras for routing each one characterizes one of the objectives and associated set of parameters, where i ∈ {1, ..., k}, being k the total number of objectives, or the total number of algebras. The dominance relation (b) Isotonicity: if path α is better than path β, then any extension with any link l, maintains the order; (c) Label-isotonicity: if path σ 1 is better than path σ 2 , then any tree t extended by path σ 1 is better than extending with path σ 2 .
is then given by the following definition, as detailed in [43]:

., k} and the strict order holds at least once.
This means that one path only dominates the other if it is better in every single aspect. This causes some paths not to dominate another, nor be dominated by other, e.g., path a has a better rate than path b, but path b has a better fidelity than path a. Unlike, for example, greater or equal than, this is not a total order , where the total means that for every two weights f (a) and From the example above, we can verify that sometimes path a does not dominate path b, and path b does not dominate path a either. This is the fundamental reason why we could not create an algebra for routing with the dominance relation defining the order of the algebra.
In the same manner, an algebra for trees can also be created. In this case, the labels define paths, rather than of edges (L → Σ), and signatures define trees, rather than of paths (Σ → Ξ). Using these algebras for trees, another property arises, enabling that the shortest-tree necessarily contains the possible shortest-paths if the schemes for creating paths and trees are different. This property is what we defined as labelisotonicity: From the scheme for creating a 3-qubit GHZ state, the main problem is finding the correct center node, such that the corresponding star is optimal. Thus, one first needs to find the shortestpaths between each of the terminal nodes and all the other nodes. This requires running the MOSP routine T times, followed by adding nondominated trees to the set of solutions while attempting every combinations of paths for each center-node. Moreover, this center node must be reachable from each of the terminals, allowing the search to be optimized: A := Set of solutions initialized empty; 3: for node ∈ terminal do 4: procedure Shortest-path(node);

5:
N odes reach ← Nodes reachable from every terminal; 6: for node ∈ N odes reach do 7: for Tree T = ∪ i,j P ath j (node, terminal i ) do 8: if T is non-dominated by every tree in A then 9: Add T to A; Proposition 1. For the shortest-star with 3 terminals, the paths connecting the center node to the terminals must be the shortest-paths, if the underlying algebras for trees are label-isotone.
Proposition 1 provides that Algorithm 1 converges to the set of optimal solutions for the 3qubit GHZ state problem. Considering that the solutions for the MOSP algorithms are optimal (see Appendix A), it can also be verified that the rate of distribution and fidelity of the final state are both monotone and label-isotone for any number of qubits in the star scheme. This ensures the optimality of Algorithm 1. We present the proof the algorithm converges to the optimal solution under the algebra properties in Appendix B and that all the multipartite metrics for GHZ and W states verify these properties in Appendices F, G and H.

Distribution of N-Qubit GHZ states
When considering a higher number of qubits, there are two possible schemes to distribute this state. The first is the star scheme, which is an extension of the one introduced resembling a stargraph, hence its name. Identically, it consists of generating bipartite entanglement between every terminal node and a center node, which can ultimately be any of the terminal nodes, and then projecting every qubit of the center node in the desired state. This scheme is capable of distributing any multipartite entangled state, besides GHZ states, for example W states, which we also detail and prove optimality in Appendices E and H. Using Algorithm 1 also leads to optimality under the same constraints on the algebras, if the same link can be used more than once. If the latter is untrue and this condition is only imposed a posteriori, i.e., from the set of solutions from the algorithm, taking only the ones where each link is used only once, then we can only extract a part of the set of optimal solutions and flag if there might exist other optimal solutions or not. Note that for three qubits, this condition did not matter since, if there was an overlap of links used, then there is always another better star, with the center node located elsewhere.
When we do not restrict ourselves to finding the best star, but instead try to find the best tree, then we should use the tree scheme [31], detailed in Appendix D, alongside with a multiobjective Steiner tree algorithm. Although this is always either equal or better at distributing GHZ states, the algorithm is much more complex. It is more efficient when the number of nodes grows larger, since it almost surely reduces the number of entangled pairs consumed. However, it cannot be used to distribute some states, such as a W state.

Simulations of the Algorithm
In Figure 3 we present the scaling of the complexity of this algorithm in two different types of networks, namely Erdös-Rényi networks [45,46], which capture some properties of a quantum internet, namely the small-world property, and random geometric networks [48], which convey the limitation in the length of each individual link in a quantum network [49].
Erdös-Rényi networks [45,46] are networks where the degree of a node, i.e., the number of neighbors a node has, follows a Poisson distribution characterized by an average degree λ . To build this network, one creates all the nodes and decides if a link between two nodes is attributed based on a probability p. One can recover the limit of Poisson distribution when the number of nodes is large enough n → ∞ and np → const = λ : Pr(degree(v) = k) = λ k e − λ /k!. On the other hand, random geometric networks establish links if two nodes are close enough, a property a quantum internet verifies locally. To create a random geometric network one gener-ates a grid of equal sides and generates n nodes with randomly distributed xy positions inside the grid. Then, depending on a graph property of the graph, the radius r, if two of these nodes are closer to each other than the radius, a link is established between such nodes. This radius r is intimately connected with the average degree of the graph [48].
Every link has a value of fidelity uniformly distributed in [f min , 1), and a probability of successful entanglement generation and probability of successful entanglement swapping distributed uniformly in [p min , 1).  [50] that guarantee each path contains entanglement and the final distributed state fidelity f GHZ trunc > 1/2. In the previous expression, we report d max to the diameter of a graph, which is the maximum distance of the shortest-path between any two nodes of the graph. This scaling is a way to partially guarantee the functional connectivity of the network, i.e., that one can find a path with a fidelity above 1/2 between any two points of the network.
Usually, the minimum value for the fidelity of a path f trunc is 1/2, for which entanglement is known to be present. However, in our case, to guarantee that most of the simulations would result in finding at least one optimal stars with f GHZ trunc > 1/2, we set the value for each path f trunc = 0.9. Using these values and all the metrics described throughout the paper, one can verify the properties necessary to prove the algorithm will find the optimal solutions (see Appendix A for the bipartite case and Appendices F, G and H for the multipartite extension).

Comparison with existing Multipartite Routing Algorithms
To compare with previous results presented in the literature, we made an analysis for two different routing protocols to distribute GHZ states with three qubits. The first method was based on Ref. [31], where performing routing to optimize the distribution of a GHZ state is identical to finding the shortest-star minimizing the communications time and the amount of entangled pairs consumed (or links). We call this a shortest-path routing method. This derives from the fact that the pairs are considered to be pure states, the probability of success is one, and all links have the same length. This case is the most intuitive one, as the communications time is bounded by the maximum length of each path connecting to the a terminal, and the lesser the number of consumed pairs, the higher are the values for the rate and fidelity that could potentially be achieved. The second method is a bounds-based method, following the work of [15,[27][28][29], where we perform the routing based on the maximum bound on what a protocol could achieve on distributing GHZ states. To each edge, one attributes a capacity that translates the loss in the channel, and the probabilistic nature of entanglement generation, namely C i:j = p i:j (1 − H 2 (3γ i:j /4)). Then, the capacity of distribution is bounded by the minimum capacity along the links that make a given path. Since the goal is to maximize the capacity of distribution, this is a max-min problem. To shorten some redundancy when two paths have the same minimum, we maximize the minimum capacity, and if two paths have the same capacity, we choose the one with the least amount of links. This guarantees a fair comparison with the other metrics. In Fig. 4 we plot the comparison between our optimal algorithm and routing under the above mentioned schemes. To do this, we perform the routing (optimization over the network) under the newly described metrics, and then calculate, using our derived metrics, the fidelity and rate for the solutions found. Note that the results never show a solution better than our optimal algorithm, and tend to be worse, as it was expected from the assumptions taken using those models of networks. Namely, for shortestpath routing, it is assumed homogeneity of the network and pure-states and, for bounds-based routing, it is assumed an information theoretical bound where the protocols that run at the ultimate limit that can be achieved for repeaterassisted quantum communications, which is less perceptive to how many links are used, as it functions as a bottleneck measure. When dealing with protocols where the fidelity decreases as the number of consumed pairs increases, this immediately creates a strong dependence on the size of the paths. For that reason, bounds routing achieves a comparatively worse result than shortest-path. In these plots, we used Erdös-Rényi networks with 1000 nodes. In Appendix I we present simulations made on models of classical internet-like networks, and provide the conditions and simulations that retrieve the results of routing under the model of Ref. [31].

Conclusions
With this work, we provided a methodology and an algorithm capable of optimizing multipartite entanglement distribution over noisy quantum networks. To achieve this, we calculated the necessary metrics: (i) the fidelity of the final state, depending on the noise and decoherence of the quantum memories, and (ii) the rate of distribution, which depends on the communication time required and the stochastic behavior of the pro-cesses involved. Our methodology can be used to optimize additional parameters, as long as the metrics verify the properties of monotonicity, isotonicity and label-isotonicity. When extending to higher numbers of qubits, under a starscheme, our algorithm is still optimal if we allow the links to be used more than once. Furthermore, by separating our approach into two equivalent layers, described by distinct algebras for routing and for trees, we allow bipartite entanglement distribution to be different from multipartite schemes. This opens the possibility for different protocols or algorithms within each layer and a condition that guarantees they fit together: label-isotonicity.
An additional important detail, which is exemplified by the chosen metrics for bipartite entanglement distribution, is that, while a set of parameters might not be isotonic, a decomposition into another set of isotonic parameters might still be possible. If this decomposition results in a larger number of parameters, as it did in our case, by increasing the number of objectives of optimization, the algorithms runtime also increases. This is partly derived from the fact that the set of optimal solutions grows. Nonetheless, doing so is compatible with our methodology, still providing a guarantee of optimality.
This same methodology could then be applied to different models of networks, for example having some nodes capable of distributing multipartite entanglement [51], or different distribution schemes [52], introducing entanglement purification rounds in the bipartite distribution scheme [33], or even different quantum repeater protocols [53], providing a starting point to optimally generate multipartite entanglement over noisy quantum networks.
Given the development of networked quantum technologies, both over a future quantum internet, and over QLANs, optimizing the distribution of multipartite entanglement will naturally impact the deployment of the communications, computation and sensing applications that use this important resource [54,55]. Future directions encompass using this algorithm with other metrics adapted to their physical realization of quantum networks or using this description of the metrics to find and verify the optimality of new algorithms for the multi-objective Steiner tree problem. While optimality might not always be the practical goal, these tools can be used to find a measure of closeness to optimality and possibly new routing protocols that are close to optimal. The broadness of this methodology should make for a robust tool to deal with routing on quantum networks.

A Bipartite Entanglement Distribution -Metrics Properties
As explained on the main text, the four parameters [p m:n , t m:n , γ m:n , σ m:n ] for bipartite entanglement distribution should be routed for independently. This way, we can guarantee that the contraction from these four parameters into only three parameters detailed in the main text provides the optimal solutions. This is only true since every of these metrics are both monotone and isotone, which can be verified in the following set of inequalities for the: In all the previous inequalities, the first corresponds to monotonicity and the second to isotonicity. When the metric is separable, as in all these cases, both properties are usually verified.

B Proposition 1 Proof
We will first present the proof for the case of one objective of routing, or only one algebra, using only its total order and then generalize for the case of an arbitrary number of objectives using the dominance relation. Moreover, we will do this for a star with an arbitrary number of terminals.

Proposition 1. For the shortest-star with 3 terminals, the paths connecting the center node and the terminals must be the shortest-paths, if the underlying algebras for trees are label-isotone.
Proof. Consider the more general case in which the center node is connected by n paths (this does not happen if the center node is one of the terminals, but for that case consider the star composed of n − 1 paths), indexed by a number between 1 and n: path 1 , path 2 , ...path n ∈ Σ. Each path is connected to one of n terminals. Fix all paths but path 1 . Let t ∈ Ξ be correspondent to the tree formed by path 2 ∪ path 3 ∪ ... ∪ path n . Now consider there ∃ path 1 : path 1 path 1 , due to label-isotonicity of the algebra for trees, then if path 1 path 1 ⇒ f (t ⊕ path 1 ) f (t ⊕ path 1 ) and the shortest tree would be path 1 ∪ path 2 ∪ path 3 ∪ ... ∪ path n . Doing this for every other path, we get that the shortest-star is the one with every branch being the shortest-path between the center node and the terminals.
Remark. The same applies for the multi-objective shortest-star problem, with the paths being the set of Pareto-optimal paths and requiring that every algebra for trees is label-isotone.
Proof. Consider that path 1 / ∈ X 1 where X 1 is the set of Pareto-optimal paths between the node 1 and the center node. Then, ∃ path 1 ∈ X 1 such that path 1 D path 1 and from here the star containing path 1 is better than the one containing path 1 . This comes from the way the dominance relation is defined. The rest of the proof is identical to the previous one.
Moreover, further refinements in the search process can be obtained, taking advantage of properties of the algebras. Namely, if the algebra for trees is monotone, then a necessary, but not sufficient, condition for existence of a shortest-tree connecting the set of terminal can be obtained: if there is no possible path connecting some pair of terminals, then there is no solution of the problem. From the structure of the algorithm, this condition can be implemented while finding the shortest-paths from each terminal node.
One important consideration is that the algebras for trees and the algebras for routing (parameters for trees and parameters for paths) do not need to be equal in number nor in form. As long as each algebra for routing is individually monotone and isotone, then the set of paths are optimal under an appropriate algorithm, which we will exemplify in Appendix C. And if each algebra for trees is label-isotone with respect to each individual algebra for routing it depends on, then this optimality is guaranteed.
Notice that in this proof we do not concern with different paths having intersections, as for the case with only three terminals, T = 3, this is never the case. For T > 3 our approach is only optimal if we allow intersections, which is equivalent to letting the same link be used more than once. If that is not the case, we can only extract at least part of the solutions and flag if there might exist more or not. If after running the algorithm and finding all possible solutions, with none of them having intersections, then all the optimal solutions were found. However, if some have intersections, then we can discard them (this is what we mean when we say a posteriori) but we cannot guarantee all optimal solutions were found, since we ignored some solutions with non-optimal paths that did not have intersections. This can be translated in the following way: consider there exists a star s = s 1 ⊕ p 1 such that s 1 ∩ p 1 = ∅, this is, they share a common link, and that s ∈ SA ≡ algorithm solutions. Now, consider there ∃p 1 : p 1 Dp 1 ands = s 1 ⊕p 1 and s 1 ∩p 1 = ∅, there is a chance ofs being optimal since it has no intersections, but never ends up in the algorithm solutions because it contains a dominated pathp 1 . If however, every solution in SA has no intersections and the algorithm is optimal, it would mean that any star with a non-optimal path, intersecting or not, would be dominated by at least one of the solutions and therefore would not be optimal.

C MOSP Algorithm
The multi-objective shortest-path (MOSP) algorithm [43] is a powerful tool to find the optimal path in the Pareto sense, which is then the input for our algorithm of distributing multipartite entanglement. We implemented it in the following way, finding every optimal path from the source to every other node of the network: Finds the shortest path to every node from the source 2: N odes := Set of nodes of the network, each with underlying list of paths P aths u initialized as empty;

3:
A := Set of visited nodes of the network initialized as empty;

4:
B := Set of nodes to visit ordered as a priority queue data structure, with priority defined by the dominance relation;

5:
Initialize source ← {e Σ i }; 6: Add source to B; 7: while B = empty do 8: node ← Top(B) 9: Remove node from B and add to A; 10: for v ∈ neighbors(node) do 11: P aths add ← possible paths from {P aths (i) node ⊕ Edge(node, v)}; 12: Update P aths v considering P aths add 13: if P aths v changes then 14: Add/Update v to/in B 15: if v ∈ A then 16: Remove v from A where {e Σ i } are the neutral elements of Σ i w.r.t ⊕ i . The main reason to separate the fidelity metric into three different metrics is that, while these parameters are individually monotonic and isotonic, together they are not. Then using this type of algorithm would fail at converging to the optimal solution that maximizes F m:n . However, by separating them and treating them independently using the dominance relation, one can clearly see that, if one path dominates the other, then: If not all partial relations on the left hand side are true, then the paths do not dominate each other and we cannot guarantee that, at every extension, the relation of the final fidelities would maintain. This requires keeping track of all the non-dominated paths -which constitute the set of Pareto optimal paths.

D Tree-Scheme
The tree scheme [31] consists of a generalisation of the star scheme, when we allow the topology of distribution to be any tree, not just a star-graph. Starting from a tree-graph connecting all terminal nodes (see Figure 5b), choosing a leaf node and iteratively applying the star expansion protocol [31] for every node will result in GHZ state distributed across the terminal nodes. Under this distribution scheme, finding the optimal way to distribute entanglement across the network is equivalent to finding the shortest-tree connecting the terminal nodes, under a given set of parameters. This is translatable to solving the multi-objective Steiner tree algorithm with the corresponding metrics. The regular Steiner tree problem is known to be N P -Hard [56,57] and adding the multi-objective setup quickly escalates the problem complexity. On the other hand, finding the shortest-star connecting a set of terminal nodes is a more tractable problem, and a reason why for 3-qubits, this problem is feasible. To find the metrics expression for the n-qubits GHZ state distributed in a tree-scheme, we considered one center of coordination in order to minimize the communications time. This center of coordination, in the case of the star scheme was trivially the center node. Now, since the coordinating node is not necessarily one branch away from every terminal node, we have to perform a minimisation over the possible locations for this node: where S is the set of the Steiner nodes, i.e., the set of nodes that belong to the tree, but are not terminal nodes. In our case, they are always intersections of branches. Moreover, notice that the numerator of t tree is the function that, given a tree, outputs the radius of the tree, under the communications time metric.
Moving to the fidelity and density matrix of the distributed state: we must start from one terminal node, τ 0 , making the result dependent from which terminal we apply the procedure. This does not invalidate the scheme described when calculating the rate, since this only affects which operations are made in each node. Starting from an initial node, after performing the star-expansion protocol across all of the intermediary nodes of the tree, the final form of the state is equivalent, up to qubit swap operations which leave the GHZ part invariant and the fidelity unaffected, to applying a depolarising channel to each terminal node except the initial: and applying, for each Steiner node, a depolarising channel to the initial node: Making the final result: In all the above, • stands for composition and a∈A represents the enumerated composition for every a in the set A. The fidelities of each depolarising channel correspond to the fidelities of each branch connecting either the terminals or Steiner nodes, respectively. The final fidelity of this state can be calculated, with the second description of the depolarising channel, given in the main text, revealing again to be particularly useful. Separating the calculations by the different entries of the density matrix of an GHZ state, let us start on the diagonal terms, dismissing the non-GHZ entries: with |0 0| and |1 1| denoting the density matrix entries |00...0 00..0| and |11...1 11...1|, respectively. The off-diagonal terms are actually eigenvectors, with eigenvalue (4F i −1)/3, of the depolarising channel, making the calculations very straightforward: The next step in the calculation is adding the Steiner nodes' depolarising channels. For this, let us first introduce two complementary functions, E(S) and O(S), that translate even and odd applica- A simple example of how to calculate this can be made, using Eq. 17 with only two Steiner nodes: We can completely define the functions recursively by: and find some of its properties, namely: where s ≡ {F s }. Adding to Eqs. 19,20,21,22, the missing depolarising channels of Eq. 18, they become: Hence, projecting the final state over a GHZ state, the final result for the fidelity becomes: Moreover, since every star is a tree, these metrics also apply for the star-scheme under the correct assumptions. It is easy to verify that in a star scheme, there is only one Steiner node (the center node) which translates in the functions E(S) and O(S) taking the values (1 + 2F τ 0 )/3 and 2(1 − F τ 0 )/3 respectively.

E W-States Distribution
W States are a well known class of multipartite entangled states. As pointed in [58], for 3-qubits, W states, together with GHZ states, compose the only two inequivalent classes of multipartite entangled states. For this reason we also include them here. Since the tree scheme introduced in Appendix D only works for GHZ states, the only scheme we analyse here for the distribution of W states is the usual star scheme introduced in the main text, where after the distribution of bipartite entanglement between all the terminal nodes and the center node, one would have to perform a projection of a W state in the qubits located in the center node. This would distribute the W state across the terminal nodes' qubits.
For this reason, the only metric that changes in the star scheme for the distribution of the W state is the fidelity metric which explicitly depends on the state, and, more exactly, on how the depolarising channel acts on the quantum state. To calculate the exact final fidelity metric let us first introduce the following notation we use henceforth to write the W state with n qubits: where |i := j =i |0 j ⊗ |1 i = |00...01 i 0...0 . Using this notation, let us describe the action of the depolarising channel onto each of the W states matrix entries |i j|, as we did for the GHZ state.
Using Eq. 1, we separate this channel into its two distinct Kraus operators: the first operator acts trivially on any of the entries of the matrix, leaving us only with the following operator to analyse: apart from the constant 2(1 − F τ )/3 and where we denote the states |0 = |00...0 and |i, j = |00...1 i ...1 j ...0 . Notice that both the state |0 0| and the state |i, τ j, τ | do never account for the fidelity. Moreover, after applying a depolarising channel on another qubit η, the entry |0 0| can contribute for the fidelity, whereas the same can never be said for |i, τ j, τ |, unless η = τ , which is never the case. Using these results, let us separate the terms that come from the diagonal entries and the terms that come from the off-diagonal entries. Consider that: where we made use of the previously introduced notation F i := (1 + 2F i )/3 and F i :

F Monotonicity for GHZ States
In this section we will prove that every algebra for trees used is monotone, for both the star-scheme and the tree-scheme detailed previously. Starting with the fidelity for the more general GHZ state, under any of the possible schemes which are detailed in the main text, we have to prove that the addition of any path to the tree results always in a worse fidelity.
By looking at the three different products in Eq. 30, and separating them into a vector of length three, corresponding to the signature, one can then describe the correspondent algebra for trees: f GHZ : [ 1 /2; 1) ∪ {0}, ≥, ( 1 /2; 1), (0; 1) 3 , 0, ⊕ GHZ , h with ⊕ GHZ : (0; 1) 3 × ( 1 /2; 1) −→ (0; 1) 3 given by: where {E(S)·a, O(S)·b, c} ∈ (0; 1) 3 is a general signature of a tree. Looking at Eq. 30 and considering that fidelities below 1/2 can be discarded, h(·) is given by: Now that we can fully characterize the algebra (or metric), let us prove the monotonicity property of this algebra. Considering one general tree, given by the following signature: After performing an extension with another path, as stated in Eq. 37, there are two possible options. Either the path connects to a terminal node τ or to a Steiner node s. In the case it connects to a terminal node, the metric is trivially monotonic since every element in the signature would be multiplied by a value smaller than one: In the case the path connects to a Steiner node then we need to use the know properties for E(·) and O(·) from Eq. 25 and prove that: The first expression is always true, considering the properties of E(·) and O(·) and that for any tree made up of n possible paths (paths with a fidelity F i > 1/2), 0 ≤ b ≤ ( 1 /3) n < ( 2 /3) n ≤ a ≤ 1 and F s − F s ≤ 1. The second expression is also always true as F s − F s ≤ 1. This proves the monotonicity of this metric.
Moving to the rate metric, consider again the more general case for the GHZ state under any of the possible schemes. The first thing to notice is that while performing a minimization over all nodes for placing the coordination center, one can never actually reduce the maximum of the communications time while adding another path. This comes from the fact that in a tree, if after adding another path, this coordination center position changes, it must change to a place that still has a larger communications time, or else the previous position was not the one for the minimum to be attainable. The corresponding algebra for trees then becomes: ξ GHZ : R + 0 , ≥, (0, 1)×R + , (0, 1)×R + , 0, ⊕ ξ , g with ⊕ ξ : (0, 1)×R + × (0, 1)×R + −→ (0, 1)×R + given by ({p tree , t tree }, {p m:n , t m:n }) −→ {p tree · p m:n , Radius(t tree ⊕ t m:n )}, (40) where {p tree , t tree } ∈ (0, 1) × R + is a general signature and Radius(t tree ) is a function that retrieves the radius of a tree under the metric t, which in this case is the communications time metric. Then, using Eq. 15, the algebra weight function, g(·), is given by where {p star , t star } ∈ (0, 1) × R + are again a general signature. Then, using similar arguments as in Eq. 41, g in this case is given by: where n m:n is the number of links of the path. As means of illustrating this, we performed more simulations, varying the parameter p min when constructing internet-like networks of 1000 nodes, taking the limit where p i:j could only take value 1, while setting γ min = γ max = 1, σ min = σ max = ∞, t min = t max = 1. This way, the only difference between paths is their time, which comes in multiples of the same value, depending on their total number of links.  Figure 7: Convergence from optimal routing for a 3-qubit GHZ state to shortest-path routing for the same state over internet-like networks with 1000 nodes, by varying the value of p min . The parameters utilized are: γ min = 1, t min = t max = 1, σ min = σ max = ∞. Each point is the result of 100 simulations over different samples of networks.