Classical simulation of linear optics subject to nonuniform losses

We present a comprehensive study of the impact of non-uniform, i.e.\ path-dependent, photonic losses on the computational complexity of linear-optical processes. Our main result states that, if each beam splitter in a network induces some loss probability, non-uniform network designs cannot circumvent the efficient classical simulations based on losses. To achieve our result we obtain new intermediate results that can be of independent interest. First we show that, for any network of lossy beam-splitters, it is possible to extract a layer of non-uniform losses that depends on the network geometry. We prove that, for every input mode of the network it is possible to commute $s_i$ layers of losses to the input, where $s_i$ is the length of the shortest path connecting the $i$th input to any output. We then extend a recent classical simulation algorithm due to Clifford and Clifford to allow for arbitrary $n$-photon input Fock states (i.e.\ to include collision states). Consequently, we identify two types of input states where boson sampling becomes classically simulable: (A) when $n$ input photons occupy a constant number of input modes; (B) when all but $O(\log n)$ photons are concentrated on a single input mode, while an additional $O(\log n)$ modes contain one photon each.


I. INTRODUCTION AND RELATION TO PREVIOUS WORKS
The recent paradigm of quantum computational advantage (or supremacy), is regarded as a promising route towards demonstrating that quantum computers are more powerful than their classical counterparts [1]. Although the systems used in this approach are not expected to be universal for quantum computation, and are often not known to perform explicitly useful tasks, they allow us to test quantum mechanics in the limit of high computational complexity. The pursuit of a near-term demonstration of quantum advantage is also aligned with the overarching goals of the field of quantum computing, as it pushes the development of technologies that will undoubtedly be necessary for scalable universal quantum computers, and allows us to better understand the effects of real-world imperfections in intermediate-size quantum systems.
One candidate for demonstrating quantum advantage is nonadaptive linear optics, or boson sampling [2]. In this model, an n-photon m-mode Fock state evolves according to a passive m-mode linear-optical transformation U and is subsequently measured by particle-number resolving detectors [see Fig. 1(a)]. This physical process leads to a probability distribution p U ( T ). The seminal contribution of [2] was to show that, subject to certain complexitytheoretic conjectures, for generic transformations U it is hard for a classical computer to generate samples from a distribution suitably close to p U ( T ). Besides being an elegant and physically-motivated computational model and, arguably, the first proposal of quantum supremacy as we understand it today, boson sampling also benefits from the technology and expertise developed by the quantum optics community in the last decades [3].
Many technological and theoretical challenges remain that can undermine the scalability of boson sampling. Several sources of imperfection affect linear-optical experiments, and it is essential to understand which can be mitigated and which degrade the computational power of the model. This boundary between classical simulability and quantum advantage is an area of intense investigation, with losses in particular receiving most attention. In [21] it was shown that boson sampling retains its computational power if a constant number of photons is lost. In the other extreme, recent papers showed that lossy boson sampling can be efficiently simulated when less than √ n photons are left [22,23] for arbitrary interferometers, or even when a suitably high constant fraction of the photons is lost for typical Haar-random interferometers [24]. Boson sampling has also been investigated under the effect of fabrication noise in the linear-optical components [25,26], losses combined with dark counts [27], partial photon distinguishability [28], and Gaussian noise in the experimental data [29].
In this work we build on our previous findings of [23] and solve some of the open problems posed there. The main result of [23] stated that, when less than √ n out of n photons are left, it is possible to approximate a lossy boson sampling state by a state of distinguishable photons, which is known to be classically simulable. This approximation is done at the level of the input state, and holds for arbitrary linear-optical experiments. For this result to work, it was necessary to assume losses happen at the input-a reasonable assumption when the interferometer is balanced, but which might fail for unbalanced network geometries [30]. To justify this, in [23] we also showed it is possible to commute s layers of mode-independent losses to the input of the network, where s is the smallest number of beam splitters in any path connecting any input to any output. However, it is possible to construct a universal linear-optical transformation where s = 1 [30], suggesting a possible route to circumvent the algorithm of [23].
Here, we extend the results of [23] in different directions, with the goal of adapting them for very unbalanced network geometries [30]. In particular, our main result is that a unbalanced network geometries cannot be used to circumvent classical simulation due to losses. Our intermediate results regarding the accumulation of losses in linearoptical networks and extensions of algorithm for simulation of boson sampling [17] can be also be of independent interest to the quantum optics and quantum complexity theory communities.
The paper is organized as follows. First, in Section II we give an informal overview of our results. Sections III-VI contain formal statements and proofs of our main results. Finally, Appendix A contains the technical details needed in the proof of Theorem 2.
Notation and main concepts. Throughout the paper we use the following notation. Given two positive-valued functions f and g, we . Finally, f ≈ g when lim x→∞ f (x)/g(x) = 1. By x ( x ) we denote the largest (smallest) integer smaller (larger) or equal to x. We denote the total number of particles by n, the total number of modes by m, and by [m] the m-element set {1, 2, . . . , m}. We alternatively refer to input modes as "bins", whenever there are many photons initialized in that mode (accordingly, we refer to this situation as bunching or binning). FIG. 2. Extraction of losses from an unbalanced photonic network. Blue rectangles correspond to beam splitters and associated phase shifters, while orange rectangles are loss elements. We assume all loss elements are equal, and use the fact that two equal loss elements in the arms of a beam splitter commute with it. We apply Result 1 to the triangular decomposition of Reck et al [30] in (a) as an example. It is possible to commute to the input of the network different rounds of losses according to the shortest path between that input and any output, leading to the equivalent network represented in (b).
The Hilbert space of photonic states in m modes is spanned by Fock states, characterized by a definite photon number s i in each optical mode i. In what follows we denote Fock states by | S , where S = (s 1 , . . . , s m ) is a vector of occupation numbers. In the formalism of second quantization we write , where a † i is the standard creation operator associated to bosonic mode i and |0 is the vacuum state. Passive linear optical transformations are associated with m × m unitary matrices U , which induce the transformation of modes a † i → a † i = m j=1 U ji a † i . When this transformation is applied to input state | S the output probabilities are given by where Per(A) denotes the permanent of A and U ST is an n × n matrix constructed according to the following prescription. We first construct an m × n matrix U S by taking s i copies of the ith column of U , then construct an n × n matrix U S,T by taking t j copies of the jth row of U T . Boson sampling can also be framed in the language of first quantization, which we briefly use (see e.g. the introductory section of [23] for the translation between the two descriptions).

II. OVERVIEW OF MAIN RESULTS
In this Section we give an informal summary of our main findings and the logic that connects the intermediate steps leading to our main result, namely, Theorem 4. We also discuss the relation to previous work.
Result 1: Extracting nonuniform losses from the network. We improve the procedure of [23] to commute losses to the input of a network. For simplicity, we make the following assumptions regarding losses. We assume that all losses happen within the network and are located at beam splitters, and that every beam splitter carries the same amount of loss. Although losses due to imperfect sources and detectors are experimentally important [14], they are in principle constant, whereas losses due to linear-optical elements dictate how overall transmissivity of the network scales as the experiments become more complex. Furthermore, the geometry of a network is the main source of loss non-uniformity which we wish to study. Experimentally, beam splitter losses are also relevant e.g. in integrated photonic devices, since waveguide bends required to build directional couplers can cause photons to leave in unguided modes [5]. We represent a lossy linear-optical as in Fig. 2(a).
The main figure of merit for the result of [23] is the length of input-output paths inside a network, measured in the number of beam splitters a photon has to traverse along that path. Previously it was known that, if the shortest input-output path has length s, we can rewrite the network in such a way that it has s layers of mode-independent losses at the input. However this is not particularly useful for unbalanced networks such as that of Reck et al [30], depicted in Fig. 2(a), as in that case we can only pull out one layer of uniform losses.
Our new result improves on this by showing how to extract nonuniform loss layers. More specifically, let s i be the shortest path between input i and any output within some network. We show how to write an equivalent network that is preceded by a round of losses where input i suffers the effect of s i consecutive loss elements. The precise formulation of this result can be found in Theorem 1. Our proof is efficient and constructive, and works for networks of arbitrary geometry. The final result is illustrated in Fig. 2(b) for the case of the network of Reck [30]. Overall, our result gives a more efficient description of the network where asymmetry is taken into account, and which we use to prove result 4 described later.
Result 2: Classical simulation of boson sampling with collision input states. It is well known that boson sampling is efficiently classically simulable if all n photons are initialized in a single mode, i.e., input state |n, 0, 0, . . . , 0 . This follows from the fact that the permanent of a matrix of n repeated columns can be computed trivially in linear time [2], or alternatively from the fact that if all photons are initialized in the same mode they behave as distinguishable particles [23]. Here we prove a stronger version of this result, by identifying two new types of input state which allow for efficient classical simulation of boson sampling [see Fig. 1(b) and Fig. 1(c)]. The first (type A) corresponds to n photons concentrated in a constant number of input modes. The second (type B) is when all but O(log n) photons are concentrated on a single input mode, while an additional O(log n) modes contain one photon each.
Concretely, we provide a strong simulation of boson sampling (i.e. computing outcome probabilities) by showing how an expression for the permanent function from [31] can be computed efficiently for inputs of types A and B (though a similar scaling follows from an Appendix of [32]). We then adapt a result of [17] for weak classical simulation (i.e. producing samples from the correct distribution) to allow for bunched inputs, showing that it is efficient for inputs of types A and B. Our adaptation of the result of [17] is also based on a physical description of the state (via first quantization) rather than combinatorial considerations, and we believe it could make the original result more transparent for researchers with a stronger physics background. The technical formulation of this result can be found in Theorem 2 and in Corollary 1.
Besides giving us a new understanding of the regimes where efficient classical simulability of boson sampling is possible, and ruling out attempts at demonstrating a quantum advantage with highly concentrated bosonic input states, this result also serves as a nontrivial stepping stone towards result 3 described below.
Result 3: Classical simulation of lossy boson sampling a few with lossless modes. We improve the classical simulation algorithm of [23] by allowing the lossy boson sampling instance to be coupled to some lossless modes. Concretely, suppose that n photons pass through a lossy channel in such a way that less than √ n survive, and are then combined with up to c log n photons input in modes that have perfect transmission. We show that the classical simulation based on the ideas similar those of [23] works for this setting as well. The technical formulation of this result can be found in Theorem 3.
The classical simulation of [23] was based on approximating a lossy bosonic state ρ by another state σ where photons effectively behave as distinguishable particles, then leveraging known results for classical simulation in that case [2]. Our new result is based on appending up to c log n single-photon states to σ, and using this state to simulate the state obtained by also appending c log n photons to ρ. The state σ is a convex combination of states of type B so we can leverage result 2 above. The distance between the two states (and hence the error bound on the simulation) is the same as in [23], but the simulation is more costly due to the use of Theorem 2. This result is also essential to prove result 4 that follows.
Result 4: Nonuniform losses do not avoid classical simulation. By combining results 1-3 we obtain our main result. Informally it states that the classical simulation algorithms of [22,23] cannot be circumvented by the design of unbalanced networks such as that of Reck et al [30] (see Theorem 4 for the formal statement). This was left as an open question in [23]. The idea was that, in an unbalanced network, there might be some input modes where losses are very mild, and the high chance of survival of these photons might break the assumptions behind the simulation algorithm in [22,23]. Here we show that classical simulation remains possible as long as only O log(n) inputs have small optical path to the output. Our result can be immediately applied to lossy interferometers with the geometry proposed in [30]. This closes one avenue for scaling up boson sampling experiments, but is also relevant for current experiments since this geometry is common in integrated-photonic implementations [5,10].
In what follows we outline our argument for the simulability of the triangular construction of [30] [cf. Fig. 2(b)]. This construction has the property that the shortest path from mode i to any output has length i (if we label modes starting from the lowest one). Suppose the input state we wish to simulate occupies the bottom n modes in an attempt to avoid losses as much as possible. To simplify the argument, suppose now that the bottom c log n modes are lossless (or include their loss channel as part of the network rather than the input state). By result 1 we can extract more than c log n loss elements for all modes above mode c log n + 1. However, as discussed in [23], given a per-beamsplitter loss parameter η, there is always some c such that, if an n-photon state suffers c log n rounds of losses, we expect less than √ n photons to survive on average. Therefore, the input to the network is amenable to approximation by a convex combination of stats of type B, as described in result 3, and so efficient classical simulation is possible.

III. EXTRACTING NONUNIFORM LOSSES FROM A LOSSY NETWORK
Linear-optical networks are the natural implementation of general multimode interferometers. They are typically built by composition of simple few-mode transformations, often easier to realize in practice, which are then arranged in some network layout (see [9,30] for the commonly used geometries, [33] for a mathematical perspective, and [34] for the generalization to non-unitary transformations). At the same time, each small transformation also introduces some loss. The purpose of this Section is to provide an effective description of lossy linear-optical networks that takes into account their geometric properties.
Our focus is on losses induced by linear-optical elements within the circuit, such as beam-splitters, which determine how losses scale with the size of experiments. We disregard losses in detectors and sources, as well as uniform losses associated with free-space or waveguide propagation. These do not depend on the network layout and can be easily incorporated into our description as additional losses at the input of the network. A loss channel that acts uniformly on m modes is parameterized by a transmission probability η, and its action on an n-photon state can be written as: where n l η l (1 − η) n−l is the probability that l out of n particles remain. This is equivalent to adding an extra beamsplitter that, with amplitude √ 1 − η, diverts each photon into some inaccessible mode [see Fig. 3(a)]. It can be easily shown that Λ η2 • Λ η1 = Λ η1η2 .

FIG. 3. (a)
A loss element with transmission probability η, which we represent by an orange rectangle, can be modeled by a beam splitter that, with amplitude √ 1 − η, reroutes each photon into an inaccessible mode. (b) We assume each beam splitter in the network is accompanied by two identical loss elements at its inputs. Uniform losses commute with linear optics, a property we apply repeatedly at the level of beam splitters to deal with nonuniform losses in general unbalanced networks.
It is well-known that a layer of identical loss elements commutes with any linear-optical network acting only on the same modes. In Fig. 3(b) this is represented for m = 2. This is the rationale for a standard assumption that losses happen at the input of a network, as it is usually assumed that they are approximately mode-independent. In practice, however, losses can also occur in nontrivial optical elements composing a network. In what follows, we assume for simplicity that each beamsplitter carries two identical loss elements at its inputs. In [23] we showed the following. Given some linear-optical network built out of the building block of Fig. 3(b), let s be the smallest number of beam splitters that a photon has to traverse within the network starting from any input to reach any output. It is possible to rewrite this network as a different (possibly still containing some losses) linear-optical network preceded by s uniform layers of loss elements. This formalized the assumption that losses happen at the input by basing it on a geometrical property of the network, namely the smallest number of loss elements that will affect any photon as it propagates inside it.
The above result works very well for symmetric networks, such as that of Clements et al [35], depicted in Fig. 4(a), but not for unbalanced network such as the triangular decomposition of Reck et al [30], depicted in Fig. 4(b) (often used in integrated photonics [5,10]). For the construction of Reck et al specifically, the previous result only allows commuting a single layer of losses to the input. This led us to conjecture it might be possible to circumvent the efficient classical simulation of [23] by using purposely unbalanced networks. Motivated by this we now prove a stronger result that allows us to treat each mode individually. Namely we show that for, each input mode i, it is possible to pull s i loss elements to the beginning of the network, where s i is the smallest number of beam splitters in any path between mode i and any output mode. Theorem 1. Let N be an m-mode linear-optical network composed of beamsplitters, each associated with a transmission probability η. Let the length of a path connecting some input and output modes be the number of beam splitters that a photon encounters along that path. Let s i be the shortest path between input mode i and any output mode. The quantum channel associated with the entire network can be decomposed as follows: Here Λ η is a nonuniform loss channel described by η = (η 1 , η 2 , . . . , η m ), where η i = η si is the transmission probability for mode i, and ΛÑ is another (possibly still lossy) linear-optical network that can be described explicitly and efficiently.
Proof. We assume without loss of generality that the network is arranged in D layers, where each layer contains at most m/2 beam splitters acting on disjoint pairs of modes. Under our assumption that each beam splitter is preceded by one loss element in each of its inputs, the length of some input-output path corresponds to the number of loss elements the photon encounters in that path. We prove the Theorem by iteratively applying the identity of Fig. 3(b). We begin with the last layer in the network, and build it by progressively adding each previous layer one beam splitter at a time. At each step we commute losses through the new beam splitter to make sure the new network has a specific form.
Consider initially the more general situation of Fig. 5(a). We have some linear-optical networkM. It is composed of a network M, that can contain any number of beam splitters and loss elements distributed in an arbitrary manner, preceded by a layer of loss elements with transmissivity µ i in mode i. We add to the beginning of M a new beam splitter, B, acting on modes i and j (and two corresponding loss elements with transmissivity η). We cannot necessarily commute µ i and µ j through B, as they may not be equal, but we can commute losses corresponding to the smallest of them. Suppose µ i ≤ µ j , and write µ j = (µ j /µ i )µ i . We can commute two loss elements µ i through B, leaving behind a loss element µ j /µ i in mode j. We are now left with the situation of Fig. 5(b).
Let us now return to our network N and apply the above procedure. Start with the last layer of beam splitters. Clearly it can be written in the form ofM of Fig. 5(a). We now add each previous layer, one beam splitter at a time, always applying the argument of the previous paragraph and Fig. 5. Suppose that, at step k of this iteration, we have some intermediate networkM k as in Fig. 5(a). Suppose also that µ i = η ti where t i is the length of the shortest path between input mode i of M k and any output. In other words, we assume that at step k the intermediate network satisfies the statement of our theorem.
Consider now the next beam splitter to be added to the network, B, acting between modes i and j. By the reasoning of Fig. 5 we can now commute min (t i , t j ) loss elements through both modes of B. The new network consisting ofM k and B has three properties: (i) the shortest path starting from either input i or j has length min (t i , t j ) + 1, (ii) there are min (t i , t j ) + 1 loss elements at both inputs modes i and j, and (iii) the action of the network on modes other than i and j is unchanged.
Clearly the new network,M k+1 , also satisfies the statement of our theorem. Therefore, by induction, this holds all the way until the network N is complete. Furthermore, this procedure is constructive by also giving explicitly the form ofÑ , and so the theorem is proved. Remark 1. As stated above, our result can be easily extended to include losses in the preparation and measurement stages of the experiment. Indeed, let Λ ηin and Λ ηout be channels describing (possibly non-uniform) losses occurring during the state-preparation and readout phases respectively. The quantum channel describing the whole physical process is then given by Λ tot = Λ ηout • Λ N • Λ ηin . Combining this with (3) we get a decomposition where ΛÑ = Λ ηout ΛÑ and η = (η in,1 η 1 , η in,2 η 2 , . . . , η in,m η m ). Moreover, inspection of the proof of Theorem 1 shows that the decomposition (3) holds also in a network in which different optical elements are associated to different loss probabilities. Therefore the qualitative characterization given in (3) can be also applied in that more general scenario.

IV. SIMULATION OF BOSON SAMPLING WITH BINNED INPUTS
In this Section we describe two new types of input states for which boson sampling is easy to simulate classically. One of them (type B) will be used later in a classical simulation for optical networks with nonuniform losses. We start by providing a generalization of the algorithm from [17] which works for bunched input states, and similarly avoids an exponential cost in the number of modes m.
Theorem 2 (Generalization of the Clifford and Clifford algorithm to collision input states). Consider an n-photon, mmode boson sampling instance described by linear optical transformation U and an input state | S = |s 1 , s 2 , . . . , s m . We do not assume s i ≤ 1. There exist a classical algorithm that generates a sample from the output probability distribution p U S → T in time upper bounded by where α S is the number of nonzero s i 's. Moreover, the algorithm also outputs the probability corresponding to the sampled outcome.
Proof. Let S = (s 1 , s 2 , . . . s m ) encode the input state with s i photons in mode i such that | S| = s i = n. Define similar quantities for the output state T = (t 1 , t 2 . . . t m ). In what follows the initial configuration of photons S is fixed while T denote the possible outcomes of boson sampling experiment.
We split the proof into two parts. First, we recall the results of [31] to give bounds on the computation cost on individual probability amplitudes p U S → T . Then we show, by generalizing the ideas from [17], how to sample from the output probability distribution by sampling the evolution of the individual particles one at the time, sidestepping the need to compute exponentially many individual probabilities. We start by reviewing results of [31]. According to Eq. (1), the probability of some outcome T is proportional to |Per(U ST )| 2 . This permanent can be expanded as follows: where Let us compute the cost of calculating the permanent in this manner. First note there are (s i + 1) terms for each of the v i sums. For each term in that multiple sum we need to compute The product requires computing α T terms (since those with t k = 0 do not contribute), and the sum has α S terms for similar reasons (i.e., if s i = 0 then v i = 0 as well). Since the absolute value of the permanent is invariant under exchanging S and T , computing Per(U ST ) according to the expansion in Eq. (6) has a runtime of For the second part of the proof, we move from the usual notation of second quantization (or mode occupation lists) where a state is represented as a list of occupation numbers [i.e. T = (t 1 , t 2 , . . . t m )], to a notation native to first quantization (or mode assignment lists). In this notation states are represented by multisets of n elements in [m], which we represent as tuples written in nondecreasing order, i.e. z = (z 1 . . . z n ). Each z i represents in which mode photons i is. Mapping back to the previous notation is easy: just identify t i as the multiplicity of mode i in z. We also define µ( z) = m j t j !. More details on the mapping between these descriptions can be found in Appendix A. The first step in constructing the sampling algorithm is to enlarge the sample space. Instead of sampling output states as multisets z, we can sample a tuple r ∈ [m] n and rewrite it in nondecreasing order to obtain the output z. It is easy to show [see Eq.(A4) in Appendix A] that sampling from the distribution and then reordering r is equivalent to sampling from the correct boson sampling distribution. Notice that, in the above equation, we have expanded the sample space for the outputs but not for the inputs, since those are fixed (and of type A or B). We have also defined U S, r in a similar spirit to U S,T just above Eq. (6), but now for each r i in r we take row r i of U S . The claim that sampling from Eq. (8) is equivalent to sampling from the correct distribution follows from the fact that the permanent is invariant under permutations of rows and columns, and so it is equal for all r's that, when reordered, lead to the same z. The second step is to find the marginal probability mass function (pmf) of leading subsequences of (r 1 , r 2 , . . . r n ). The combinatorial factors that appear in the expression for this pmf make it cumbersome to write explicitly. In Appendix A we show how to compute these marginal pmfs, and show that they have two properties crucial for the simulation algorithm (which they share with the corresponding pmfs for no-collision inputs, i.e. Lemma 1 in [17]): (i) the probability of outcome (r 1 , r 2 , . . . r l ), for l ≤ n, is the sum over probabilities corresponding to all inputs consistent with choosing l photons from the actual n-photon input state S, and (ii) each term in that sum corresponds to an input with the same binning structure as S.
Therefore, if for some input S the permanents Per(U ST ) can be efficiently computed, then the same is true for individual marginal probabilities of the distribution p U ( S → r). We now describe the sampling algorithm. This is done by sampling the marginal distributions one at a time. That is, we use the chain rule for conditional probabilities to write p( r) = p(r 1 )p(r 2 |r 1 )p(r 3 |r 1 , r 2 ) . . . p(r n |r 1 , . . . , r n−1 ).
It is easy to see that, for fixed r 1 , sampling r 2 according to p(r 2 |r 1 ) is equivalent to sampling it according to (unnormalized) weights p(r 1 , r 2 ), and similarly for the subsequent r i 's. Therefore, the above expression allows us to sample the elements of (r 1 , r 2 , . . . r n ) in order, one at a time, by computing the corresponding marginal probabilities. All that remains is to determine the running time necessary to generate a sample with the above procedure. There are n steps. In step l, there are m probabilities p(r 1 , . . . , r l ) we need to compute according to the possible values of r l (recall that r 1 to r l−1 are fixed by previous steps). Each probability requires computing some number of permanents, as per property (ii) above [cf. the discussion after Eq. (A6) in Appendix A]. Each of these permanents can be computed in time equal to at most This follows from Eq. (7), and details can be found in Appendix B. Therefore, the runtime of the algorithm can be upper bounded by where N l ( S) is the number of l-photon configurations compatible with the initial state S. Finally, note that n l=1 N l ( S) = m i=1 (s i + 1) − 1, which concludes the proof.
Using the above theorem we now identify two new families of input states for which boson sampling can be efficiently simulated classically. Then, from (5) it follows that there exists an efficient classical algorithm to simulate the corresponding instances of boson sampling. The runtimes of this algorithm for inputs of type A and B are, respectively, If c and k are constants, and in the standard boson sampling assumption that m =O(n 2 ), we have Note that T B is not a consequence only of having O(log n) bins. To see this, suppose we have log n bins with n/ log n photons each. In this case the running time for the computation of a single permanent according to Eq. (10) is which is super-polynomial in n. Therefore, the restriction that inputs of type B have most photons starting in a single mode is important.

V. EFFICIENT SIMULATION OF LOSSY BOSON SAMPLING WITH SOME LOSSLESS INPUTS
We now combine the ideas underlying the classical simulation of lossy boson sampling from [23] with the results from the preceding section to expand the known scenarios for which boson sampling becomes classically simulable. We begin by recalling the notion of total variation distance-the figure of merit typically used to assess the accuracy of boson sampling [2]. The total variation distance between probability distributions {p x } and {q x } is given by Importantly, the total variation distance between probabilities {p ρ x } and {p σ x } generated by measuring two quantum states ρ and σ, respectively, with the same measurement M = {M x } is always upper bounded by the trace distance between ρ and σ, We are now ready to formally state our result.
Theorem 3 (Classical simulation of lossy states when photons in some modes are not lost). Consider an m-mode n-photon input state and a linear-optical transformation Λ = Λ • Λ k,η , where Λ is some (perhaps lossy) linear optical transformation, and Λ k,η is a pure loss channel such that the first k modes have zero loss, while the remaining m − k modes have transmission probability η, i.e. we have Let {p Λ x } be the probability distribution obtained by measuring the output state Λ (|Ψ 0 Ψ 0 |) with particle-number measurements. Then, provided k = O(log n), for any Λ there exist a (weakly) classically simulable distribution {q Λ x } which satisfies Consequently, if η = ω( √ n − k), the error incurred in approximating {p Λ x } with the (classically easy) distribution {q Λ x } goes to 0 as the number of particles n tends to infinity.
Proof sketch. The proof generalizes the main idea from [23] where the classical simulation of lossy linear optics was based on approximating the lossy bosonic state onñ particles by particle-separable states, i.e. states of the form Importantly, state |φ α φ α | ⊗nα can be generated from |n α 0 · · · 0 by application of linear-optical transformation Λ Uα . The lossy state considered in [23] was of the formρ = Λ η (|Φ 0 Φ 0 |), where Λ η is the channel that describes uniform losses [see Eq.(2)] and |Φ 0 Φ 0 | is a standard boson sampling input state onñ particles. It was shown that there exist particle-separable state σ with a probability distribution {P α } that can be sampled efficiently, and such that the trace distance to the target stateρ is upper bounded by From the above and the fact that boson sampling with input states |ψ α ψ α | ⊗nα can be simulated classically it follows that for any linear-optical processΛ applied toρ and terminating with particle-number measurements can also be (weakly) simulated classically within total variation distance∆.
Here we generalize the above argument to the situation at hand. The key difference is that, as the (partially) lossy input state, we take the state ρ = Λ k,η (|Ψ 0 Ψ 0 |) for which no particles are lost in the first k modes. We can approximate ρ by where, in contrast to Eq. (17), states entering in the above convex decomposition are of the type Λ Uα (|Ψ k,α Ψ k,α |), where |Ψ k,α = | k 1 · · · 1, n α , 0 · · · 0 and Λ Uα is a suitable linear optical unitary transformation that acts as the identity on the first k optical modes. If k = O(log n), states of type |Ψ k,α lead to classically easy boson sampling instances (they belong to type B appearing in Theorem 2). Repeating the considerations that led to Eq.(18) we conclude it is possible to choose a distribution {P α } which can be sampled efficiently and for which we have By the assumptions stated above, sampling from Λ (σ) can be done efficiently, which concludes the proof.

VI. CLASSICAL SIMULATION OF LINEAR OPTICS IN UNBALANCED LOSSY NETWORKS
In this Section we combine results from Theorems 1 and 3 to show that, for arbitrary lossy linear networks, boson sampling becomes classically simulable provided there are at most O(log n) input modes for which the input-output paths are short.
Theorem 4 (Classical simulation in unbalanced lossy networks). Consider an m-mode lossy optical linear network N acting on the standard input n-photon state of the form Assume every beamsplitter introduces losses, characterized by transmissivity η, on the modes on which it acts. Moreover, assume that at most O(log n) input are such that their shortest input-output path has length smaller than c log n for some suitable constant c > 0. Then there exist a classically algorithm to sample from a {q N x } that approximates the target output distribution {p N x }, obtained by evolving the input state through network N , to accuracy where η eff = η c log(n) = n −c log(1/η) . In particular, if c > Proof. The result follows directly from Theorems 1 and 3. First, note that at most k input modes have their shortest input-output path of length s i ≤ c log n. Then, by Theorem 1, we can extract from the network a layer of nonuniform losses that, up to relabeling of modes, is of the form Next, note that assumptions of Theorem 3 are met since k = O(log n) and the vector of losses from Eq. (23) directly corresponds to Eq. (15). By directly invoking of Theorem 3 we obtain Eq. (22), which concludes the proof.
Example 1 (Triangular decomposition of linear-optical networks [30]). Consider the standard construction of linearoptical networks given in Fig.2. Assume that every beamsplitter in the network is lossy. The number of beam-splitters photons need to cross, as they propagate from input i (counting from the bottom of the network) scales linearly i. Therefore, a lossy network in this geometry satisfies the assumptions of Theorem 4 and the corresponding boson sampling instance can be simulated as described in the statement of the Theorem.

VII. CONCLUSIONS AND OUTLOOK
We presented a comprehensive study of the effect of nonuniform losses on classical simulability of boson sampling. Our main conclusion is that linear-optical networks for which losses exhibit high degree of non-uniformity cannot evade a classical simulation strategies based on ideas from [22,23]. Our main contributions are twofold. First, in Theorem 1 we established the quantitative relation between the geometric properties of lossy networks and the amount of losses that can be pulled-out to the input of the network. Second, in Theorem 4 we provided sufficient conditions for effective classical simulability of of boson sampling subject to nonuniform losses (that can result, for example, from the complicated network structure). To establish the second result we extended the classical simulation algorithm from [17] to instances of boson sampling with binned inputs. This allowed us to identify two classes of input states for which the simulation of boson sampling becomes efficient: (A) when n input photons occupy a constant number of input modes; (B) when all but O(log n) photons are concentrated on a single input mode, while an additional O(log n) modes contain one photon each.
We believe that our results can help in understanding which linear-optical designs can offer scalable platforms for demonstration of quantum advantage based on boson sampling. Moreover, the classical simulation techniques presented here can be leveraged to benchmark the performance of photonic networks with complicated input states in the near future. We would like to conclude by stating a number of interesting open questions.
First, in a recent work [24] Renema et al. proved that, for typical (i.e. Haar-random) linear-optical interferometers U , boson sampling can be classically simulated if only a constant fraction of input photons are lost and every optical mode suffers from identical losses. The technique used in [24] was based on approximating the lossy output probability distributions by distributions originating from interference of partially distinguishable photons (with increasing degree of indistinguishability). It is an interesting question whether input states of type A or type B can be used to obtain similar results for generic multi-mode interferometers.
Another interesting question is what are the consequences of our result for Gaussian boson sampling [36], a variant of boson sampling that uses squeezed Gaussian states as input. A recent work [37] studied under which conditions lossy Gaussian boson sampling becomes easy to simulate. We conjecture that our results from Section IV can be adapted to the case of Gaussian boson sampling to prove, for example, that it is also classically simulable if the squeezed states are inputs only in k =O(1). We also conjecture that our results of Section VI can be adapted to show that the conclusions of [37] also hold for unbalanced networks (there, the authors made the standard assumption of uniform losses).
Lastly, nonuniform losses clearly seem to undermine the quantum features of nonclassical states. It is natural to explore the consequences of our results regarding extraction of nonuniform losses from a lossy network (Theorem 1) for the nonclassical features of quantum states generated by such networks [38]. elements of r, for l ≤ n, and let r l,C = (r l , r l+1 . . . r n ) denote the remaining elements of the list over which we wish to marginalize. We can write the marginal pmf for that subsequence as P U ( S → r l ) = tr   ϕ U | S S|ϕ † U   | r l r l | ⊗ r l,C | r l,C r l,C |     . (A5) The um inside the trace equals simply to the identity on the subspace of particles l + 1 to n. We can thus write the above as P U ( S → r l ) = tr tr n−k (ϕ U | S S|ϕ † U )| r l r l | = tr U ⊗l (tr n−l | S S|)(U † ) ⊗l | r l r l | , where by tr n−l we mean tracing over the last n−l particles, and we have used the property that tr n−l (U ⊗n ρ(U † ) ⊗n ) = U ⊗l tr n−l (ρ)(U † ) ⊗l , as discussed in [23]. The result we need already follows from Eq. (A6). To see this note that (tr n−l | S S|) is a convex combination over all l-photon Dicke states compatible with choosing l out of n photons from input state | S i.e. states of the form where m i=1 K i = n − l and K i ≤ S i . The probability that a particular state | S − K appears in the decomposition of (tr n−l | S S|) is proportional to the multinomial coefficient n−l K1,K2,...Km . Therefore, Eq. (A6) satisfies properties (i) and (ii) laid out above Eq. (9). The fact that each probability in Eq. (A6) can be computed efficiently for input state of types A and B follows from the discussion in the proof of Theorem 2.