Quantum-inspired permanent identities

The permanent is pivotal to both complexity theory and combinatorics. In quantum computing, the permanent appears in the expression of output amplitudes of linear optical computations, such as in the Boson Sampling model. Taking advantage of this connection, we give quantum-inspired proofs of many existing as well as new remarkable permanent identities. Most notably, we give a quantum-inspired proof of the MacMahon master theorem as well as proofs for new generalizations of this theorem. Previous proofs of this theorem used completely different ideas. Beyond their purely combinatorial applications, our results demonstrate the classical hardness of exact and approximate sampling of linear optical quantum computations with input cat states.


Introduction
The permanent of an m × m matrix A = (a ij ) 1≤i,j≤m is a combinatorial function defined as [1,2]: where S m is the symmetric group over m symbols. While the closely related determinant can be computed efficiently using many methods such as Gaussian elimination, Valiant famously proved the #P-hardness of computing the permanent [3]. Interestingly, the permanent appears in the expression of the output amplitudes of linear optical quantum computations with noninteracting bosons [4,5], as in the Boson Sampling model of quantum computation [6]. This connection has lead to several linear optical proofs of existing and new classical complexity results: computation of the permanent is #P-hard [7], (inverse polynomial) multiplicative estimation of the permanent of positive semidefinite matrices is in BPP NP [8], multiplicative estimation of the permanent of orthogonal matrices is #P-hard [9], and computation of a class of multidimensional integrals is #P-hard [10]. It has also lead to the introduction of a quantum-inspired classical algorithm for additive estimation of the permanent of positive semidefinite matrices [11]. Moreover, an approach inspired by quantum tomography recently showed that (subexponential) multiplicative estimation of the permanent of positive semidefinite matrices is NP-hard [12].
Beyond its importance for complexity theory, the permanent has numerous applications for solving combinatorial problems [1,2] and identities for the permanent have been instrumental in these applications. For example, the MacMahon master theorem [13], which relates the permanent to a coefficient of the Taylor series of a determinant, is an invaluable tool for proving combinatorial identities [14][15][16]. Similarly, Ryser's formula [17] and Glynn's formula [18][19][20] are routinely used to compute the permanent more efficiently than the naive brute-force approach.
While linear optics has been used as a tool to explore the classical complexity of the permanent, previous work suggests that it can also be a useful way to obtain simple proofs of theorems about the permanent: for instance, [5] shows that the permanent of a unitary matrix U lies in the (closed) complex unit disk by expressing |Per(U )| 2 as an output probability of a linear optical sampling computation, while [6] derives simple permanent identities using various representations of the same linear optical sampling computation. This begs the following questions: 1. Can we use linear optics to prove existing remarkable permanent identities, such as the MacMahon master theorem?
2. Can we use linear optics to derive new remarkable permanent identities?
In this work, we show that the answer to both questions is yes. We give quantum-inspired proofs of several important permanent identities in section 5. In particular, we show that the MacMahon master theorem can be understood as two different ways of computing an inner product between two Gaussian quantum states.
We also derive new quantum-inspired identities for the permanent. Our main results are summarized in section 3 and proven in section 6. These include generalizations of the MacMahon master theorem (Theorems 1 and 2), new generating functions for the permanent (Theorem 3) and a formula for the sum of two permanents (Theorem 4).
We use bold math for multi-index expressions (see Table 1 above). We denote by T = {z ∈ C, |z| = 1} the complex unit circle. For all m ∈ N * , all z = (z 1 , . . . , z m ) ∈ C m and all p = (p 1 , . . . , p m ) ∈ N m , we use the notation [z p ] to denote the coefficient of z p = z p 1 1 . . . z pm m in an analytic expression. For all m ∈ N * , all p = (p 1 , . . . , p m ) ∈ N m and all q = (q 1 , . . . , q m ) ∈ N m , we denote by A p,q the matrix obtained from an m ×m matrix A by first repeating its i th row p i times (deleting the row if p i = 0) for all i ∈ {1, . . . , m} and then repeating its j th column q j times (deleting the column if q j = 0) for all j ∈ {1, . . . , m}. By convention, we set the permanent of non-square matrices to 0 and Per(A 0,0 ) = 1.
Let m ∈ N * denote the number of modes. We denote (unnormalized) quantum states using the Dirac ket notation. These are vectors in an infinite-dimensional Hilbert space spanned by the orthonormal Fock basis Hereafter, we label Fock states using p, q, p, q, and k, coherent states using α, β, α, and β, and two-mode squeezed states using λ, µ, λ, and µ. In particular, coherent states are defined as for all α ∈ C, and (unnormalized) two-mode squeezed states as for all λ ∈ C with |λ| < 1. Moreover, cat states are defined as for all α ∈ C.
We make use of the following inner products involving these states: for all p ∈ N, α, β ∈ C, and λ ∈ C with |λ| < 1, which can be readily checked in the Fock basis. Moreover, for all p, q ∈ N m [5,6], whereÛ is a passive linear operation over m modes whose action on the creation operators of the modes is described by the unitary matrix U = (u ij ) 1≤i,j≤m aŝ for all j ∈ {1, . . . , m}. As their name indicates, passive linear operations do not change the total number of photons [22]: for all n ∈ N, where Π n := |p|=n |p p| is the m-mode projector onto states with total photon number equal to n. Passive linear operations map tensor products of coherent states to tensor products of coherent states [22]: for all α ∈ C m , Recall that coherent states form an overcomplete basis: whereÎ is the identity operator over m modes. We will also make use of the following Gaussian inner product: for all λ, µ ∈ C m with |λ k | < 1 and |µ k | < 1 for all k ∈ {1, . . . , m}, and for all passive linear operationÛ over 2m modes, where we have used the overcompleteness of coherent states (14) over 2m modes in the first line, the action of passive linear operations on coherent states (13) in the second line, the overlap between two-mode squeezed states and coherent states (9) in the third line, and where we have introduced in the fourth line the (4m) × (4m) symmetric matrix where for all w ∈ C n , Note that we associate mode 1 with mode m + 1, mode 2 with mode m + 2, and so on, i.e. |λ = m k=1 |λ k , where |λ k is an unormalized two-mode squeezed state over modes k and m + k.
Finally, we note that any m×m matrix B with B ≤ 1, where · is the spectral norm, may be embedded as the top-left submatrix of a (2m) × (2m) unitary matrix U (see [6,Lemma 29] for an explicit construction). We will use this fact multiple times throughout the paper to extend identities proven for unitary matrices to the case of generic matrices.

Boson Sampling
Boson Sampling is a sub-universal model of quantum computation introduced by Aaronson and Arkhipov (AA) in [6], which takes as input a Fock state |1 ⊕ 0 = |1 ⊗n ⊗ |0 ⊗m−n , evolves it according to a passive linear operationÛ over m modes with unitary matrix U , and measures the photon number of each output mode.
With Eq. (10), the probability of detecting p = (p 1 , . . . , p m ) ∈ N m output photons is given by: This model of quantum computation, while not believed to be universal, is already capable of outperforming its classical counterparts: AA showed that the output probability distribution P BS is hard to sample exactly classically for m ≥ 2n, or the polynomial hierarchy of complexity classes collapses to its third level [6]. Moreover, under additional plausible conjectures, AA showed that this collapse holds for m = Θ(n 5 log 2 n) even if only an efficient classical algorithm for approximate sampling exists (i.e. a classical algorithm which samples efficiently from a probability distribution that has a small total variation distance with the ideal Boson Sampling output probability distribution P BS ). They further conjectured that this approximate hardness should hold for m = Θ(n 2 ).
The approximate hardness proof for Boson Sampling is based on a matrix-hiding argument which requires a specific regime of parameters m and n: given δ > 0, the relation between m and n should be such that the distribution H n,m of n × n submatrices of m × m Haar-random unitary matrices multiplied by √ m is O(δ)-close in total variation distance to the distribution G n×n of n × n matrices of i.i.d. Gaussians (see Theorem 35 in [6]). In particular, AA showed that, for any δ > 0, H n,m − G n×n TV = O(δ) when m ≥ n 5 δ log 2 n δ . This result was later refined in [23] with the bound H n,m − G n×n TV ≤ n 3 2(m−n) , which gives a total variation distance O(δ) for m ≥ n 3 δ .
Finally, in the case of orthogonal matrices rather than unitary matrices, it was shown in [24] that (see the equation following Eq. (2.31) in [24], with n → m and p = q → n): where D KL is the Kullback-Leibler distance and where we obtain the second line under the assumption n = o(m). In particular, for δ > 0, choosing m ≥ n 2 δ 2 implies D KL (H n,m , G n×n ) ≤ O(δ 2 ), and thus H n,m − G n×n TV ≤ O(δ) by Pinsker's inequality (see Eq. (1.3) in [24]). It was also argued in [24] that a similar result should hold for unitary matrices.
Summarising these results, approximate hardness of Boson Sampling is proven for pas-

The MacMahon master theorem
The MacMahon master theorem is an important result in combinatorics which relates the permanent to the determinant: Theorem (MacMahon master theorem [13]). Let z = (z 1 , . . . , z m ) be formal variables.
This theorem is particularly useful to derive short proofs of combinatorial identities: it expresses the permanent of an m × m matrix A with rows and columns repeated in the same way as the coefficient where Z = Diag(z) and p ∈ N m , while the same permanent may also be expressed as the coefficient Picking a specific matrix A and a pattern p and computing the above expressions yields combinatorial identities, a famous example being the short proof of Dixon's identity [14,25], n,n,n , by taking and p = (2n, 2n, 2n) for n ∈ N * .
Various generalizations of the MacMahon master theorem have been introduced over the years [26][27][28][29][30]. In physics, this theorem plays an important role in the quantum theory of angular momentum [31] and is also oftentimes interpreted as an instance of the bosonfermion correspondence [27].

Main results
In this section, we summarize our main findings, which we prove in section 6. We obtain the following generalization of the MacMahon master theorem: where X = Diag(x) and Y = Diag(y).
We further show that this generalization extends to N matrices: where As a corollary of Theorem 1, we obtain: For any m × m matrix A and all p, q ∈ N m with |p| = |q| = n ∈ N, As a consequence of Corollary 1, we obtain the following family of generating functions for the permanent: Equivalently, for all n ∈ N and all p, q ∈ N m such that |p| = |q| = n. As a result, when f n = 0, where the average is over random vectors with complex coefficients of modulus 1.
From this theorem we derive various permanent identities, the most notable one being a formula for the sum of two permanents: In particular, when p = q = 1, Section 5 is primarily devoted to quantum-inspired proofs of existing permanent identities, including a new proof of the MacMahon master theorem. Along the way, we obtain the following inner product formula: Lemma 1 has the following direct implications for the hardness of Boson Sampling with input cat states [21], which we discuss in section 7: Theorem 5. Let m = poly n ≥ 2n and α = 0. Boson Sampling with input |cat α ⊗n ⊗ |0 ⊗m−n with α ∈ C is hard to sample exactly classically unless the polynomial hierarchy of complexity classes collapses to its third level. Moreover, assuming |α| = O(n −1/4 log 1/4 m) Boson Sampling with input cat states is as hard to sample approximately as Boson Sampling with input single-photons, i.e. Boson Sampling with input cat states is hard to sample approximately classically unless the polynomial hierarchy of complexity classes collapses to its third level, in the same regime as Boson Sampling, modulo the complexity conjectures introduced by AA in [6].
This result extends the arguments of [21]-which gave formal proof of classical hardness in the exact sampling case-to the approximate sampling case.

Discussion
In the next sections, we introduce quantum-inspired proofs of permanent identities. This approach allows us to give quantum-operational interpretations of seminal results, such as the MacMahon master theorem [13]. In particular, we show that this theorem can be seen as two facets of the same bosonic Gaussian amplitude.
This quantum-inspired approach also yields a breadth of new permanent identities. We give some examples of combinatorial applications of these identities in section 6 and we anticipate that they have many more. Beyond these purely combinatorial applications, it would be interesting to investigate whether our new identities may be used to obtain more efficient classical algorithms for computing or estimating the permanent. In particular, our Theorem 3 provides new estimators for the permanent which could be of interest, by minimizing the variance of these estimators over the choice of the analytic function f .
We use the formalism of linear optics with noninteracting bosons, but our approach can be applied more generally to linear optics with other types of particles. For instance, we expect our proof techniques to lead to determinant identities in the fermionic case [32], immanant identities in the case of partially distinguishable particles [33,34], and additional permanent identities in the case of generalized bosons [35]. Moreover, graphical languages are currently being developed for linear optical quantum computations [36,37], which could lead to graphical proofs of remarkable identities in combination with our approach.
Finally, our Theorem 5 gives solid complexity-theoretic foundations for the hardness of Boson Sampling with input cat states. Generation of such states has progressed tremendously in the recent years, thanks to circuit QED in particular [38][39][40]. We hope that these foundations will motivate an experimental demonstration of quantum speedup based on Schrödinger's cat states.

Quantum-inspired proofs of permanent identities
In this section, we derive quantum-inspired proofs of existing permanent identities. Along the way, we obtain a generalization of Glynn's formula [20] for the permanent of matrices with repeated rows and columns in Eq. (43), as well as a generalization of the Glynn-Kan formula [41] for the permanent of matrices with repeated rows and columns in Eq. (54).

Glynn's formula
Glynn's formula for the permanent of an m × m matrix A = (a ij ) 1≤i,j≤m is [20]: By symmetry, it is equivalent to the identity Proof of Glynn's formula. To prove this identity using quantum-mechanical tools, let us introduce the unnormalized cat state c at α := 1 2α (|α − |−α ), for α ∈ C. Using the Fock basis expansion of coherent states (6), we have lim α→0 c at α = |1 in trace distance. Hence, using the Fock basis expansion ofÛ (10), whereÛ is a passive linear operation over m modes with unitary matrix U = (u ij ) 1≤i,j≤m . We compute the right hand side of this equation: where we used the action ofÛ on coherent states (13) in the second line, the Fock basis expansion of coherent states (6) in the third line, and the fact that U is unitary in the last line. Taking the limit when α → 0 yields Glynn's formula for unitary matrices.
With the same calculations, we may obtain a more general version of Eq. (36): for all α ∈ C and p ∈ N m , Note that when |p| < m all products in the sum have at least one x i missing. Hence, by symmetry, p 1 . . . p m |Û |c at α . . .c at α = 0 when |p| < m. With the Fock basis expansion ofÛ (10), taking the limit when α → 0 yields a version of Glynn's formula for unitary matrices with repeated rows: where δ is the Kronecker symbol. So far, all identities are derived for unitary matrices U . In order to retrieve the same identity for a generic matrix A = (a ij ) 1≤i,j≤n of size n, we can embed 1 A A (or A directly if A = 0) as a submatrix of a unitary matrix U of size m = 2n [6, Lemma 29] and compute: To do so, we compute a slightly more general inner product: for α ∈ C, p ∈ N m and n ≤ m, In particular, setting m = 2n and p = q ⊕ 0 for q ∈ N n , Choosing u ij = 1 A a ij for 1 ≤ i, j ≤ n and letting α go to 0 proves the claim: the left hand side converges to 1 A similar generalization of Glynn's formula for matrices with repeated rows (or columns) based on roots of unity has previously appeared in [42]. Moreover, for any m×m matrix A = (a i,j ) 1≤i,j≤m and all p, q ∈ N m with |p| = |q| = n, we have the formula which is a version of Glynn's formula for matrices with repeated rows and columns. This formula, which has previously appeared in [43][44][45], can be easily proven by expanding the product (Ax) p = m i=1 m j=1 a ij x j p i and using properties of roots of unity. We give a quantum-inspired proof in what follows. For all n ≥ 1, all q ≤ n, and all α ∈ C, we define the following unnormalised superposition of coherent states: Using the definition of coherent states (3), we obtain where we used n−1 k=0 e 2ik(p−q)π n = n if p = q mod n and 0 otherwise in the second line. Hence, lim α→0 |σ q,n α = |q in trace distance. Using the Fock basis expansion ofÛ (10), we thus have, for all p, q ∈ N m with |p| = |q| = n ≥ 1, whereÛ is a passive linear operation over m modes with unitary matrix U = (u ij ) 1≤i,j≤m . We compute the right hand side of this equation: where we used the definition of |σ q,n α (44) in the first line and where the derivation for the last line is identical to that of Eqs. (36)(37). Letting α go to 0 proves Eq. (43) for unitary matrices. Finally, the case of a generic nonzero m × m matrix A is obtained as in Eq. (40) by embedding 1 A A as a submatrix of a unitary matrix U of size 2m and computing

The Glynn-Kan formula
A symmetrized version of Glynn's formula for the permanent of an m × m matrix A has been recently derived [41] under the name Glynn-Kan formula: While the Glynn-Kan formula gives a slower classical algorithm than the Glynn formula for computing the permanent, it is motivated by a connection with quantum algorithms for estimating the permanent [41].
Proof of the Glynn-Kan formula. To prove this identity using quantum-mechanical tools, we again use the unnormalized cat state c at α := 1 2α (|α −|−α ), as in the previous section. Using the Fock basis expansion of coherent states (6), we have lim α→0 c at α = |1 in trace distance. Hence, using the Fock basis expansion ofÛ (10), x,y∈{−1,1} m where we used the action ofÛ on coherent states (13) in the third line, the overlap between coherent states (7) in the fourth line, and the fact that U is unitary in the fifth line. For k < m, all products in the expansion of (x T U y) k have at least one x i missing. Hence, by symmetry, the terms for k < m vanish and we obtain c at α . . .c at α |Û |c at α . . .c at α = e −m|α| 2 4 m x,y∈{−1,1} m Finally, taking the limit when α goes to 0 yields This proves the Glynn-Kan formula for a unitary matrix U . Once again, the proof extends straightforwardly to any matrix A of size n by embedding 1 A A as a submatrix of a unitary matrix U of size 2n and computing lim α→0 ( c at α ⊗n ⊗ 0| ⊗n )Û ( c at α ⊗n ⊗ |0 ⊗n ).
Similar to Eq. (43), we obtain a version of the Glynn-Kan formula for matrices with repeated rows and columns using roots of unity as for any m × m matrix A = (a ij ) 1≤i,j≤m and all p, q ∈ N m with |p| = |q| = n. This formula, which appears to be new, can be easily proven using our Corollary 1 by expanding the product (x T Ay) n and using properties of roots of unity. We give a quantum-inspired proof in what follows. For all n ≥ 1, all q ≤ n, and all α ∈ C, we again use the unnormalised superposition of coherent states as in Eq.
whereÛ is a passive linear operation over m modes with unitary matrix U = (u ij ) 1≤i,j≤m . We compute the right hand side of this equation:

The Cauchy-Binet theorem
The Cauchy-Binet theorem for the permanent expresses the permanent of the product of two m × m matrices A and B as a sum of products involving permanents of submatrices of these two matrices [1,2]: for all p, q ∈ N m , Proof of the Cauchy-Binet theorem. Using the Fock basis expansion of passive linear operations (10) gives a quick quantum-inspired proof of this identity: for U and V two m×m unitary matrices and for all p, q ∈ N m , where we used the fact that Fock states form a basis in the third line and where we used Eq. (10) once in the first line and twice in the last line. In order to retrieve the formula for generic matrices A and B, we can embed 1 A A as a submatrix of a unitary matrix U and 1 B B as a submatrix of a unitary matrix V and compute Per((U V ) p⊕0,q⊕0 ).

Generating functions
The permanent may be seen as a monomial coefficient in the Taylor expansion of various functions [1,2]. For example, the permanent of an m × m matrix A = (a ij ) 1≤i,j≤m with rows repeated according to p ∈ N m and columns repeated according to q ∈ N m is given by the coefficient of z This may be thought of as the 'monomial version' of Glynn's formula in Eq. (42). Formally: for formal variables z = (z 1 , . . . , z m ).

Proof of 'Glynn's monomial formula'.
To prove this relation with quantum mechanical tools, we fix p ∈ N m , α ∈ C m , a unitary matrix U of size m, and we compute: where we used the Fock basis expansions of coherent states (6) and ofÛ (10) in the first line, the fact that Fock states form a basis in the second line, the action ofÛ on coherent states (13) in the third line, Eq. (6) again in the fourth line, and the fact that U is unitary in the last line. Once again, the relation for a generic nonzero matrix A is obtained by embedding 1 A A as a submatrix of a unitary matrix U .
Another generating function for the permanent is due to Jackson [46]: for formal variables x = (x 1 , . . . , x m ), y = (y 1 , . . . , y m ), Note that this relation implies Corollary 1, by expanding the Taylor series of the exponential and considering the x p y q coefficient when |p| = |q| = n ∈ N * (an alternative proof of this result is given in the next section): Proof of Jackson's formula. Jackson's formula may be derived using quantum mechanical tools in a similar way: for α, β ∈ C m and a unitary matrix U , where we used the Fock basis expansions of coherent states (6) and ofÛ (10) in the first line, the fact that Fock states form a basis in the second line, the action ofÛ on coherent states (13) in the third line, the overlap between coherent states (7) in the fourth line, and the fact that U is unitary in the last line. Once again, the relation for a generic nonzero matrix A is obtained by embedding 1 A A as a submatrix of a unitary matrix U .
We conclude this section with arguably one of the most remarkable permanent identities, the MacMahon master theorem [13], which relates the permanent and the determinant through a generating function (see section 2.3): where Z = Diag(z), with z = (z 1 , . . . , z m ) formal variables.

Proof of the MacMahon master theorem.
To prove this relation with quantum mechanical tools, we show that the MacMahon master theorem describes two different ways of computing an inner product between two Gaussian states.
For λ ∈ C and |λ| < 1, we make use of (unnormalized) two-mode squeezed states of the form |λ = p≥0 λ p |pp . We write |λ with λ = (λ 1 , . . . , λ m ) ∈ C m a tensor product of m two-mode squeezed states (note that we are associating mode 1 with mode m + 1, mode 2 with mode m + 2, and so on). For U a unitary matrix and λ, µ ∈ C m , with |λ k | < 1 and |µ k | < 1 for all k ∈ {1, . . . , m}, we compute, using the Fock basis expansion ofÛ (10): Using the Fock basis expansion of two-mode squeezed states (8), we have Moreover, a quick computation in Fock basis shows that Hence, Eq. (67) rewrites where in the last line we used λ * | p∈N m |pp pp| = λ * |, which can be checked in Fock basis.
We now compute the Gaussian inner product λ * |Î ⊗Û |µ . From the action of passive linear operations on creation operators (11), the unitary matrix associated to the passive linear operationÎ ⊗Û is I ⊕ U . With the Gaussian overlap derived in Eqs. (15)(16)(17), we thus obtain: where where for all w ∈ C m , We have  71) and (74), we obtain, λ, µ ∈ C m , with |λ k | < 1 and |µ k | < 1 for all k ∈ {1, . . . , m}, where we used the value at λ = µ = 0 and the fact that the left hand side is a continuous function of λ and µ to determine the sign of the square root. Replacing (λ 1 µ 1 , . . . , λ m µ m ) by the formal variables z = (z 1 , . . . , z m ) concludes the proof. The relation for a generic nonzero matrix A of size n is obtained by embedding 1 A A as a submatrix of a unitary matrix U of size 2n and taking z = (z 1 , . . . , z n , 0, . . . , 0).

New quantum-inspired permanent identities
In this section, we derive new quantum-inspired identities for the permanent and we give some combinatorial applications of these identities.

Generalizations of the MacMahon master theorem
In this section, we introduce new generalizations of the MacMahon master theorem (see section 2.3).
Proofs of Theorem 1, Corollary 1, and Theorem 2. In section 5.1, we have obtained a linear optical proof of Glynn's formula for the permanent using the fact that a cat state |cat α of small amplitude α approximates a single-photon Fock state |1 . Hereafter, we consider another approximation of Fock states using superpositions of two-mode squeezed states. Such superpositions were recently studied in [47] in the context of quantum sensing.
To prove the identity in Eq. (76), we introduce |λ − := 1 2λ (|λ − |−λ ), for |λ| < 1, where |λ = p≥0 λ p |pp is an unnormalized two-mode squeezed state. We have lim λ→0 |λ − = |11 in trace distance. Hence, for U a (2m) × (2m) unitary matrix we have with the Fock basis expansion ofÛ (10): For all λ ∈ R, with |λ| < 1, let us compute where we have defined for all λ, µ ∈ C m (note that we associate mode 1 with mode m + 1, mode 2 with mode m + 2, and so on) With the Gaussian overlap from Eq. (15), we have where where for all w ∈ C m , Moreover, so with Eq. (85) Setting λ = µ = λ1 for λ ∈ R and letting λ go to 0 in Eq. (83), we obtain that Per(U ) is given by where [λ 2m ]g U (λx, λy) is the λ 2m coefficient in the Taylor expansion (in λ) of This concludes the proof of Eq. (76). Once again, the relation for a generic nonzero matrix A is obtained by embedding 1 A A as a submatrix of a unitary matrix U . Now with Eq. (84), for all λ, µ ∈ C n , where we used the Fock basis expansion ofÛ (10) where for all w ∈ C n , This concludes the proof of Eq. (77). Once again, the relation for a generic nonzero matrix A is obtained by embedding 1 A A as a submatrix of a unitary matrix U . When U = A ⊕ B, with A and B two m × m matrices, we have U p⊕p,q⊕q = A p,q ⊕ B p,q and the permanent of a block-diagonal matrix is the product of the permanents of the blocks, so Eq. (93) gives p,q∈N m x p y q p!q! Per(A p,q )Per(B p,q ) = 1 Writing X = Diag(x 1 , . . . , x m ) and Y = Diag(y 1 , . . . , y m ), we have This theorem reduces to the MacMahon master theorem when A = I or B = I, since Per(I p,q ) = p!δ p,q . Another particular case of interest is when B = J m , where J m is the all-1 matrix of size m × m, which satisfies Per(J m ) = m!. In this case, we obtain, for all x, y ∈ C m , n∈N,p,q∈N m |p|=|q|=n n!x p y q p!q! Per(A p,q ) = 1 Det(I − AY J m X) where we used the fact that Ayx T is a rank-one matrix to compute the determinant. This implies Corollary 1: for all n ∈ N * and all p, q ∈ N m with |p| = |q| = n, where we used the Taylor series 1 1−z = +∞ k=0 z k . We note that Theorem 1-which is a generalization of the MacMahon master theorem to two matrices-may also be obtained by combining the MacMahon master theorem in Eq. (66) with the Cauchy-Binet theorem in Eq. (58), applied to the matrix AY B T . As it turns out, we may apply the same proof technique inductively in order to generalize the MacMahon master theorem to N matrices A (1) , . . . , A (N ) , for N ≥ 1 and obtain Theorem 2: assuming that Eq. (81) holds for some N ≥ 2, we have, for all m × m matrices B (1) , . . . , B (N ) , where z k = (z k1 , . . . , z km ) are formal variables and Z k = Diag(z k ) for all k ∈ {1, . . . , N }. Let A (1) , . . . , A (N +1) be m × m matrices, z N +1 = (z N +1,1 , . . . , z N +1,m ) be formal variables and Z N +1 = Diag(z N +1 ). Setting B (N ) = A (N ) Z N +1 A (N +1) and B (k) = A (k) for all k ∈ {1, . . . , N }, we obtain with Eq. (99): (100) On the other hand, with the Cauchy-Binet theorem from Eq. (58), where we used the fact that multiplying any single row of M by a variable z changes Per(M ) to zPer(M ). Combining Eqs. (100) and (101) completes the induction step and the proof of Theorem 2.
Applications. Recall that the MacMahon master theorem is a useful tool for deriving combinatorial identities (see section 2.3): it expresses the permanent of an m × m matrix A with rows and columns repeated in the same way as the coefficient where Z = Diag(z) and p ∈ N m , while the same permanent may also be expressed as the coefficient Our generalizations of the MacMahon master theorem may be used in a similar fashion to obtain simple proofs of combinatorial identities. Let us illustrate this with an example: for n ∈ N and a, b ∈ C, setting we aim to prove the relation which for a = b = 1 directly implies the well-known n k=0 n k 2 = 2n n . Proof. By Theorem 1 we have where X = Diag(x 1 , . . . , x m ) and Y = Diag(y 1 , . . . , y m ). Setting for a, b ∈ C, n ∈ N * , and p = q = (n, n), we obtain with Eq. (103): On the other hand, with Eq. (106) we have where we used the Taylor expansion of 1 1−z in the third line and the multinomial theorem in the last line. The indices in the above expression must satisfy n 11 = n 22 , n 12 = n 21 , which implies j = 2n 11 + 2n 12 , i.e. j is even. Moreover, n = k − j + n 11 + n 12 = k − j/2. Relabeling j = 2l and n 11 = p we have k = l + n and we obtain With Eq. (108) this proves

New generating functions
We have encountered several generating functions for the permanent of an m × m matrix A with differently repeated rows and columns. Jackson's formula gives [46]: for formal variables x = (x 1 , . . . , x m ), y = (y 1 , . . . , y m ). Moreover, we have obtained in Eq. (97), as a corollary of our generalization of the MacMahon master theorem: Furthermore, Corollary 1 gives: In this section we prove Theorem 3, which generalizes all three statements: for any series One may prove this statement by linearity, using Eq. (114). In what follows, we give a direct quantum-inspired proof which is similar to that of Jackson's formula from section 5.4.
where we used the Fock basis expansions of coherent states (6) and ofÛ (10) in the first line, the definition of the projector Π n = |p|=n |p p| in the second line, the fact that passive linear operations conserve the total number of photons (12) in the third line, and the action ofÛ on coherent states (13) in the fourth line. Now, where we used the Fock basis expansion of coherent states (6) in the second line, the fact that U is unitary in the third line, and the multinomial theorem in the last line. Combining Eqs. (116) and (117) completes the proof for unitary matrices. Once again, the relation for a generic nonzero matrix A is obtained by embedding 1 A A as a submatrix of a unitary matrix U .
As a result, when f n = 0 we have, for any m × m matrix A, all n ∈ N and all p, q ∈ N m such that |p| = |q| = n, where we have set x = (e iθ 1 , . . . , e iθm ) ∈ T m and y = (e iϕ 1 , . . . , e iϕm ) ∈ T m in the first line, and where we have used 2π 0 e i(k−l)θ dθ 2π = δ kl in the third line.
Applications. These generating functions may be used to derive remarkable identities for the permanent: for instance, for A and B two m × m matrices, taking f (z) = e z and equating the x p y q coefficients of e x T (A+B)y and e x T Ay e x T By yields the sum formula [1,2]: for all p, q ∈ N m . Similarly, for A an m × m matrix, taking f (z) = z p for p = k, l, k + l and equating the x p y q coefficients of (x T Ay) k+l and (x T Ay) k (x T Ay) l yields the Laplace expansion formula [1,2]: for all k, l ∈ N and all p, q ∈ N m with |p| = |q| = k + l. Finally, for A and B two m × m matrices, taking f (z) = − log(1 − z) and equating the yields Theorem 4: after a derivation which we detail below.
Proof of Theorem 4. Let us write Applying Theorem 3 for f (z) = − log(1 − z) with z = x T By, x T Ay, x T (A+B−Byx T A)y and considering the x p y q coefficient we obtain for all n ∈ N * and all p, q ∈ N with |p| = |q| = n: Setting |p| = n and replacing unnormalized cat states by normalized ones in Eq. (129), we obtain Lemma 1: where we used Eq. (42) in the second line and the Fock basis expansion ofÛ (10) in the last line.
Interestingly, Lemma 1 implies that, up to a global factor, n cat states of amplitude α reproduce exactly the Boson Sampling statistics of n single photons (however, detection events p with |p| > n can occur for input cat states, contrary to single-photons).
This result has two consequences, summarized by Theorem 5. Firstly, it implies that Boson Sampling with input cat states is hard to sample exactly for m ≥ 2n and for all choices of nonzero cat state amplitude unless the polynomial hierarchy collapses to its third level, since some of its outcome probabilities are #P-hard to estimate multiplicatively [6]. Alternative proofs of this statement based either on universality under post-selection or on rejection sampling can be found in [21]. Secondly, as we show hereafter, it implies that Boson Sampling with input cat states of small enough (nonzero) amplitudes is also hard to sample approximately in the same regime as Boson Sampling.
Proof of Theorem 5. Our proof uses arguments similar to the ones used in [21], extended to the case of approximate sampling.
At a high level, we show how to convert any classical algorithm for approximate sampling from the output probability distribution of an m-mode Boson Sampling computation with n input cat states to a classical algorithm for approximate sampling from the output probability distribution of an m-mode Boson Sampling computation with n input Fock states (Lemma 2). We do so by performing rejection sampling: we run the first algorithm and only keep the samples with total photon number equal to n. In particular, we show that if the fraction of samples with correct photon number n is large enough (namely, at least inverse polynomial in m), the rejection sampling subroutine is efficient and the final classical probability distribution is close to the ideal distribution of Boson Sampling with Fock state input. Then, we determine the regime of input cat state amplitude such that the fraction of samples with correct photon number n is at least inverse polynomial in m (Lemma 3).
Let P cat (p|n, α) := | p 1 . . . p m |Û (|cat α ⊗n ⊗ |0 ⊗m−n )| 2 be the probability of detecting p = (p 1 , . . . , p m ) output photons in a Boson Sampling experiment with interferometer U and input cat states |cat α ⊗n ⊗ |0 ⊗m−n , and let P BS (p|n) := | p 1 . . . p m |Û (|1 ⊗n ⊗ |0 ⊗m−n )| 2 be the probability of detecting p = (p 1 , . . . , p m ) output photons in a Boson Sampling experiment with interferometerÛ and input single photons |1 ⊗n ⊗ |0 ⊗m−n . We show the following reduction: as long as the fraction of samples with photon number n is large enough, then any efficient classical algorithm for approximate sampling from P cat can be converted to an efficient classical algorithm for approximate sampling from P BS .

Lemma 2.
Suppose there exists an efficient classical algorithm C for approximate sampling from P cat , i.e., that takes as input the description of the m-mode boson sampler with n input cat states with output probability distribution P cat and an error bound , and that samples from a distribution P C such that P C − P cat TV ≤ in poly (m, 1 ) time, where · TV is the total variation distance. Assume further that the fraction of samples from P cat with photon number equal to n is at least inverse polynomial in m, i.e. |p|=n P cat (p|n, α) ≥ 1 poly m . Then, there exists an efficient classical algorithmC for approximate sampling from P BS , i.e., a classical algorithm that takes as input the description of the m-mode boson sampler with n input Fock states with output probability distribution P BS and an error bound , and that samples from a distribution P rej C such that P rej C − P BS TV ≤ in poly (m, 1 ) time.
Proof of Lemma 2. Given a sampling algorithm from a probability distribution P over N m , we define the following rejection sampling subroutine: sample p from P and compute |p|; discard the sample if |p| = n; otherwise, output p. We denote by P rej the corresponding output probability distribution. For all p ∈ N m , we have P rej (p) = δ |p|,n P (p) where δ is the Kronecker symbol. Applying this subroutine to P cat gives a new probability distribution P rej cat (p|n, α) = δ |p|,n P cat (p|n, α) |q|=n P cat (q|n, α) .
This reproduces the argument for exact sampling hardness from [21].
To show approximate hardness, we define the classical approximate sampling algorithm C by running the algorithm C with input error bound 2 |p|=n P cat (p|n, α) and applying the rejection sampling subroutine. By assumption, running the algorithm C can be done in time poly m, 2 |p|=n P cat (p|n, α) since |p|=n P cat (p|n, α) ≥ 1 poly m , also by assumption. On the other hand, the rejection sampling subroutine induces a computational overhead scaling with the inverse of the fraction of kept samples |q|=n P C (q). We have |q|=n P cat (q|n, α) − |q|=n P C (q) ≤ |q|=n P C (q) − |q|=n P cat (q|n, α) where we used the triangle inequality in the second line, the definition of the total variation distance in the fourth line, and the definition of C in the last line. Hence, so that the computational overhead induced by the rejection sampling subroutine scales as poly(m, 1 ). Overall, the classical algorithmC outputs samples from the probability distribution P rej C in poly(m, 1 ) time (with probability exponentially close to 1). Moreover, where we used the definition of the total variation distance in the first and eighth lines, the fact that P rej C and P BS are supported on samples with total photon number n in the second line, Eq. (136) in the third line, Eq. (132) in the fourth line and the triangle inequality in the fifth line.
Hence,C is an efficient classical algorithm for approximate sampling from P BS .
We are left with identifying the regime of input cat state amplitude α ∈ C such that the assumption in Lemma 2 is satisfied, i.e. |p|=n P cat (p|n, α) ≥ 1 poly m . Combining Lemma 2 and Lemma 3, using any efficient classical algorithm for approximate sampling from the output probability distribution of a Boson Sampling instance with input cat states with 0 < |α| = O(n −1/4 log 1/4 n), one may also efficiently sample approximately from the output probability distribution of the corresponding Boson Sampling instance with input single photons efficiently, by keeping only the samples p satisfying |p| = n. This proves that Boson Sampling with input cat states with 0 < |α| = O(n −1/4 log 1/4 n) is hard to sample approximately in the same regime as Boson Sampling with input Fock states, assuming the same conjectures as in [6].