Local and scalable detection of genuine multipartite single-photon path entanglement

How can a multipartite single-photon path-entangled state be certified efficiently by means of local measurements? We address this question by constructing an entanglement witness based on local photon detections preceded by displacement operations to reveal genuine multipartite entanglement. Our witness is defined as a sum of three observables that can be measured locally and assessed with two measurement settings for any number of parties $N$. For any bipartition, the maximum mean value of the witness observable over biseparable states is bounded by the maximum eigenvalue of an $N\times N$ matrix, which can be computed efficiently. We demonstrate the applicability of our scheme by experimentally testing the witness for heralded 4- and 8-partite single-photon path-entangled states. Our implementation shows the scalability of our witness and opens the door for distributing photonic multipartite entanglement in quantum networks at high rates.


Introduction
The generation, distribution and certification of entanglement in multipartite quantum communication networks is of increasing importance as the size and complexity of networks grow beyond simple shortdistance point-to-point scenarios [1,2]. In general, multipartite entanglement enables applications such as enhanced sensing [3,4] or multi-user quantum communication protocols [5,6]. At the heart of the matter, the challenge is to find scalable solutions to realize these applications, which still remain experimentally feasible. On the one hand, the experimental limitations of probabilistic multi-photon sources, especially in terms of rates [7], represent fundamental obstacles for entangled state generation. On the other hand, for the state certification, the exponential scaling of measurements in tomography as the number of parties increases [8] makes it impractical already for a small number of parties.
In the case of generation and distribution of entanglement already for quantum repeaters, a shift away from photon-pair to heralded single-photon entanglement provided significant scaling benefits even for point-to-point communication schemes [9]. For example, as shown in Fig. 1(a), the distribution of entanglement between two remote parties through optical fiber can be realized efficiently by giving each party a source emitting signal-idler photon pairs, combining the idler modes into a beam splitter at a central station and placing two detectors at the output of the beam splitter. A photon detection by one of the two detectors heralds the sharing of a single photon between the signal modes -a single-photon pathentangled state [10,11]. Although the realization of such schemes faces the challenge of active stabilization of the phase between the two parties [12], work addressing this issue has been reported [13][14][15][16][17][18], even over longer distances in optical fiber [19,20].
More interestingly, this approach can be efficiently extended to the distribution of entanglement between multiple parties by simply replacing the twoport beam splitter by a multi-port beam splitter, see Fig. 1(b). This represents an efficient way of gen- (a) Entanglement is distributed between two parties, each having a signal-idler photon pair source (S). The idler modes are combined on a beam splitter and the detection of a single photon after this beam splitter projects the signal modes into a single-photon entangled state. (b) Generalization of the scheme to tripartite states while keeping the local losses low. The aim of this work is to clarify on how entanglement can be detected in this setting. (c) Conceptual schematic of the experiment. Entanglement is distributed among several parties by locally splitting the signal mode into multiple spatial output modes. erating a multipartite entangled state close to a W state [21]: |0, ..., 0, 1 i , 0, ..., 0 with a single photon detection heralding its successful distribution remotely. Here, |0 denotes the vacuum state, |1 the single-photon number state and N the number of parties. Applications of this class of distributed states include long-baseline telescopes that can take advantage not only of a distribution over long distances but also the multipartite setting [22][23][24]. The certification of entanglement in such multipartite quantum communication networks is extremely challenging. Beyond tomography, even typical entanglement witnesses require multiple settings per party [25,26], with the commensurate scaling that quickly ensures their infeasibility. Others have used certification techniques that require the recombination of optical modes [27,28], which are impractical in communication scenarios where local measurements are required. These problems are further exacerbated in a distributed setting, where noisy and lossy channels also need to be addressed.
Here we develop an entanglement witness tailored to the W state, given in Eq. (1), to reveal genuine multipartite entanglement (GME) without postselection. We assume that the measurement apparatus is well characterized and that each party i holds a single optical mode with an associated bosonic annihilation operator a i . There are no further assumptions. In particular, the photon number statistics are unknown and we do not assume that the same state is prepared in each run (i.i.d.). The witness is scalable as it only requires two different measurement settings, independently of the number of parties. Its applicability is demonstrated using an experimental setup in a configuration like in Fig. 1(c) where genuine 4-and 8-partite entangled states are heralded at high rates and successfully verified.

Theory
To certify the GME of N -partite single-photon pathentangled states, we build a witness using practical single-photon detectors, i.e. non-unit efficiency and non-photon number resolving. Such a detector can be modeled as a loss channel with transmission η (the detector efficiency) followed by a two-outcome measurement that perfectly distinguishes the vacuum |0 from all the other Fock states of the detected mode. In this model, the fixed loss can be interpreted as part of the state preparation degrading its entanglement. In the following, we therefore model the detector operating on party i with the positive operator valued measure (POVM) corresponding to the POVM element associated to a click (c) (no-click (0)) event and Π (i) 0 = |0 0| is the projection on the vacuum. With such a detector, we thus have access to the weight of the vacuum component for each party from the no-click events, i.e. from tr(E i 0 ρ) where ρ denotes the N -mode state produced in the actual experiment. By probing each mode with such a detector, we can access the probability that more than one mode contain photons n≥2 P n click , where P n click is the probability that n detectors click. Let us define the POVM elements associated with these probabilities as E n≥2 and E n . Note that p 0 = P 0 click = tr(|0 0| ⊗N ρ) is the probability that the state contains no photons. Furthermore, by probing the mode i with two such detectors after a 50/50 beam splitter, we can upper bound the probability that it contains more than one photon [26], that is tr(Π This allows us to upper bound the probability that the Nmode state ρ contains two or more photons, which we write as an operator inequality where Π n≥2 is the projector on all the combinations of Fock states containing at least two photons in total. We then denote . In order to implement measurements that are sensitive to the coherence between different products of Fock states, the party i can perform a phase-space displacement operation D(α i ) = e αia † i −α * i ai right before the detector [29]. As shown in Appendix A, the loss after the displacement operation, i.e. the detector inefficiency, can be permuted with the displacement by only adjusting the displacement amplitude α i . This allows us to keep the POVM above and define a parametric family of local observables for each party i ∈ {1, ..., N } [30] by attributing the value +1 to a no-click and −1 to a click event. In principle, asking each party to perform such a measurement and combining the results allows us to define a global observablê where α = (α 1 , . . . , α N ). To simplify the experimental realization, we consider the case where the local oscillators used by each party to perform the displacement operations are not phase-locked to the input state, so that the local displacements α i → α i e iϕ are only defined up to an arbitrary common phase ϕ, but the phase differences between the parties are well controlled and kept constant at zero. As a consequence, we measure the observablê (5) Because of the phase averaging, the operatorÔ α = ∞ n=0Ô (n) α acts orthogonally on different total photon number subspaces n = N i=1 a † i a i . For displacement amplitudes chosen in the appropriate range [30], each two-body correlator σ . This coherence is symptomatic of GME in the state |W N , making the observableÔ α a natural candidate to witness this entanglement. However,Ô α is also sensitive to all higher photon number contributions, which are difficult to characterize. A simple way to circumvent this problem is to subtract the observable N (N − 1)Π n≥2 fromÔ α . On the one hand, for our state tr(Π n≥2 ρ) ≈ 0 and the expected value is not affected much by the subtraction. On the other hand, by an operator supported on a subspace with at most one photon.
We can now define the entanglement witnesŝ where M n≤1 is an operator diagonal in the product Fock basis and acting on the subspace with one photon at most, specified in Appendix B, and µ is a positive real parameter that one can tune. To show that the witness can reveal GME, let us start by computing the biseparable bound.
i.e. the maximum value the witness takes on any biseparable state. A general biseparable state is a mixture of states that are product states for some bipartition (a partition of all modes into two groups). Formally, Here, the sum runs over all partitions G 1 |G 2 of the N parties where G 1 ∪ G 2 = {1, 2, . . . , N } and G 1 ∩ G 2 = ∅. The probabilities of different partitions sum up to one G1|G2 p(G 1 |G 2 ) = 1 and ρ G1|G2 is a separable state with respect to the partition G 1 |G 2 . Since the set of biseparable states is convex, the maximum value that an observable takes on any biseparable state bisep , including mixed states, is attained for a pure state |Ψ = |Ψ 1 G1 |Ψ 2 G2 on some partition. Now, using Ineq. (6) we obtain a relaxation (12) which simplifies the maximization problem enormously. Indeed, the operator W is block diagonal, with restriction to the sector with two or more photons −µE n≥2 that is negative. Thus, we can restrict the maximization to states |Ψ k G k which contain one photon at most and write For a fixed partition, the product states can be parametrized by normalized vectors v (1) and v (2) , that can be taken to be real without loss of generality. This leaves us with a N -parameter optimization problem. In Appendix B, we show that for our witness this maximization can be reduced to a single parameter optimization of the maximum eigenvalue of a N × N matrix M(λ, µ, α, a), which can be solved efficiently with standard numerical tools. Here, λ (like µ) is a positive real parameter of the witness that can be tuned. We solve the optimization for all bipartitions to obtain a relaxation of the biseparable bound for the witnessŴ It remains to explain how we estimate the violation of the witness Ŵ α − w bisep on the multimode state ρ prepared in the experiment. In reality, the amplitudes α of the displacements fluctuate within some range α ∈ A that we characterize. The operatorŴ α depends on these amplitudes both "physically" via the observableÔ α , but also "algebraically" via the definition of the operator M n≤1 (α) and the value for the biseparable bound w bisep (α). To remove the second dependence, we consider the worst-case scenario w max bisep = max α∈A w bisep (α) and M n≤1 = min α∈A M n≤1 (α). Then, defining the operator W α , where we replace M n≤1 byM n≤1 in the witness (Eq. (7)), implieŝ To prove GME it suffices to show that the average (over all rounds with fluctuating α) expected value of W α on ρ, which we call w max ρ , exceeds the constant w max bisep . As argued in Appendix C, W α can be  Figure 2: Experimental schematic: A heralded single photon, incident on a cascade of 50/50 fiber beam splitters (BS), is delocalized over spatial modes to generate an 8-partite path-entangled state. Weak coherent states in orthogonal polarization modes are co-propagated with the single-photon state to locally perform displacement-based measurements. See main text for details on the setup and notation. estimated by combining the average values of three different observables measured independently in different runs of the experiment. These are,Ô α measured with displacement operations, Z as defined in Eq. (49) and measured without displacing and one detector per mode, and Σ n≥2 = i Π (i) ni≥2 measured on a single mode with a 50/50 beam splitter and two detectors. Finally, in Appendix D we analyze the statistical significance of the observed violation of the witness. We use Hoeffding's theorem (1963) [31] to upper bound the p-value for the null-hypothesis that the state ρ is biseparable.

Experiment
The experimental setup is presented in Fig. 2. We use a heralded single photon source (HSPS) employing a periodically poled potassium titanyl phosphate (PP-KTP) nonlinear crystal as a type-II spontaneous parametric down-conversion (SPDC) source. The crystal is pumped by a Ti:Sapphire laser at λ p = 771.7 nm in the picosecond pulsed regime with a repetition rate of 76 MHz to create nondegenerate photon pairs at λ s = 1541.3 nm (signal) and λ i = 1546.1 nm (idler). The pair creation probability per pump pulse is kept at p pair ≈ 2.7 × 10 −3 in order to minimize the impact of double-pair emissions. Signal and idler modes are separated after their generation by a polarizing beam splitter (PBS) and coupled into single-mode fibers (SMF). By spectrally filtering the heralding idler photon using a dense wavelength division multiplexer (DWDM) we ensure high-purity heralded signal photons.
Detection of one heralding photon by an InGaAs single-photon avalanche diode (SPAD -ID Quantique ID210) in gated mode with a detection efficiency of around 20 % heralds the presence of a fiber-coupled signal photon with a heralding efficiency of around 75 %. The heralded signal photon first encounters a band-pass filter (BPF) with a passband between 1528 nm and 1565 nm in order to further remove residual pump light and is subsequently sent to a cascade of 50/50 fiber beam splitters (BS) where it is delocalized to generate the targeted multipartite path-entangled state. In this manner, we herald the entangled state at a rate of 11.5 kcps where 0.6 kcps are attributed to dark counts, which effectively adds loss to the state.
To generate the coherent state with the same spectral and temporal properties as the signal photon for the displacement-based measurement (see Appendix E), we stimulate a difference frequency generation (DFG) process in a type-II quasi phase-matched periodically poled lithium niobate (PPLN) nonlinear crystal [32]. To this end, the crystal is pumped by the same laser pulses as the HSPS and seeded with pulses at the same repetition rate originating from a distributed feedback (DFB) laser at λ i = 1546.1 nm. The seed laser is driven from well below to above the lasing threshold each cycle to phase randomize the coherent state and in order to reach the required displacement amplitude, we amplify the pulses with an Erbium doped fiber amplifier (EDFA). We then couple the coherent state into SMF and further filter out residual pump light with a BPF before adjusting the time delay between the coherent state and the signal photon with a motorized delay line. Moreover, to avoid saturation of the detectors (see below), we select the coherent state pulses by passing them through an electro-optic (amplitude) modulator (EOM) with an extinction ratio of ∼ 30 dB triggered by a 5 ns gate upon successful detection of a heralding photon. Residual seed laser light is filtered with a DWDM at λ s . The coherent state is then sent into the second  and calculated separable bounds w max bisep of the N -partite entanglement witness for states with vacuum contributions p0 and upper bounds p * on the probability of having more than one photon, see Ineq.
(2). The p-value for the null-hypothesis that the state ρ is biseparable is calculated according to Appendix D.
port of the first 50/50 BS where fiber polarization controllers (PC) are used to ensure orthogonal polarizations between coherent and single-photon states. The subsequent co-propagation passively guarantees stability of the relative optical phase between the coherent and single-photon states.
In order to perform the local displacement operations in an all-fiber configuration, the coherent and single-photon states are projected onto the same polarization mode using an in-line polarizer (ILP) preceded by a manual PC and a three-segment electronic polarization controller (EPC -Phoenix Photonics PSC). The first segment of the EPC allows for the control of the relative phase between the coherent and single-photon states (see Appendix E), whereas the second and third segments are used to control the displacement amplitude. In each spatial mode, the coherent state is set to have a mean photon number per pulse of roughly 13 before the polarizer to achieve a displacement amplitude of α ≈ 0.83. Drifts and fluctuations in the displacement amplitudes during the data acquisition are taken into account for the evaluation of the witness (see Appendix E). The photons are detected by eight in-house-developed MoSi superconducting nanowire single-photon detectors (SNSPD) with detection efficiencies between 75 −82 % [33]. Time-correlated single-photon counting (TCSPC) using two clock-synchronized programmable time-todigital converters (ID Quantique ID900) is then used to register detections conditioned on a successful heralding event.
After the alignment of the relative phases between the output modes, data for the witness is acquired in 20 sequences of 5 min. Each sequence measures the displacement amplitudes (1 min) and then ρ with and without displacement (2 min each). Shutters in the corresponding paths (see Fig. 2) are used to switch between the measurements. In order to estimate the probability of having more than one photon locally, we additionally perform a heralded autocorrelation measurement on one output mode by inserting a 50/50 BS before the detectors and acquire data for 6 h.

Results
The witness is measured for two different experimental configurations with N = 4 and 8 parties and the separable bound is violated in both cases, as shown in Tab. 1. In the case of N = 8 we have more loss on the state, mainly due to the insertion loss of another BS and the lower detection efficiencies of the four additional SNSPDs.
For N = 8 we further analyze the data by considering all N n=2 ( N n ) = 247 possible subsets of n ∈ {2, ..., N } out of N parties and calculating for each subset the expectation value and separable bound of the n-partite witness. The results are presented in Fig. 3. It is expected for our state that all subsets of parties show GME, however, these results suggest that also for a high probability of having vacuum for all parties p 0 , our witness is suitable to detect GME. We attribute the fact that the witness violation w exp ρ − w max bisep varies for different choices of the same number of parties n to the difference in transmission and detection efficiencies for different parties (see Appendix E).

Discussion
Let us discuss the scalability of our witness with the number of parties N . First, we emphasize that our witness only relies on three measured quantities. Two of them, the measurement of p 0 and p * , can be obtained with a single setting per party corresponding to no displacement. The last quantity is assessed with a second setting using displacement operations. This is in contrast with other methods where the overall number of settings grows polynomially with N , and in sharp contrast with techniques relying on state tomography where it grows exponentially with N . The as a function of the number of parties N for a state ρ = (|1 1| + p|2 2|)/(1 + p) input to a N -port beam splitter after undergoing a loss channel with transmission η = 0.3. The two-photon probability p = 5 × 10 −3 is a good approximation for the state generated in the experiment. In this calculation, we assume a perfectly balanced state and fix the displacement amplitude for the measurement to α = √ ln 2 ≈ 0.83 for each party, which is the most robust to fluctuations in α.
second aspect is the computational resources required to compute the biseparable bound. In our case, we only need to compute the maximum eigenvalue of an N × N matrix for each bipartition of the N parties in two groups. For any bipartition, the computational complexity of constructing the matrix and computing its maximum eigenvalue scales much better than the methods relying on a relaxation of a semi-definite optimization over the biseparable states of N qubits proposed earlier, e.g. [26]. Finally, the number of bipartitions to check grows exponentially 2 N −1 − 1 if all the amplitudes of displacement operations are different, however, this is reduced to N/2 if all the displacement amplitudes can be assumed to be equal α i = α j . In our experimental implementation, this could be achieved by actively stabilizing the power of the coherent state before inserting it into the first beamsplitter. Further, active control of the EPCs as part of the local measurement setup would suppress drifts in the displacement amplitudes.
In the experiment, the main limiting factor for the demonstration of GME in large systems using the presented witness are contributions to the state with more than one photon in total, which we upper bound by p * . In order to estimate the maximum number of parties N for which the witness still applies, we calculate the expected witness violation for a state ρ = (|1 1| + p|2 2|)/(1 + p) that undergoes a loss channel with transmission η = 0.3 and is then equally split into N modes, which is a good approximation of the state created in the experiment. As shown in Fig. 4, we see that the reduction of the probability of generating a two-photon state increases the number of parties for which our witness is able to detect GME. For a state similar to the one in the experiment, this would allow for the demonstration of GME for up to 23 parties. Furthermore, we show in Appendix F that our witness can be directly used in the presence of dark counts if one adds a term to the biseparable bound. We check that this does not affect the demonstration of GME in our experiment and further investigate the scalability under the presence of dark counts.

Conclusion
The developed witness is well suited for efficient certification of multipartite single-photon path entanglement in future quantum networks. Highly entangled multipartite states could be distributed at high rates in a scheme where each party holds a photon-pair source and one photon of each pair is sent to a central multi-port beam splitter that erases the which-path information. In this way, local losses can be kept low and the added distance between parties only reduces the heralding rate. In combination with quantum memories, such a scheme has potential for applications relying on distributed W states. The experimental challenge in such a scheme, however, remains the need for phase stability in long fiber links.

A Non-unit detection efficiency
We considered that the measurements are realized with non-photon-number-resolving detectors preceded by displacement operations in phase space. As explained in the main text, we assign the outcome +1 to a no-detection and −1 to a conclusive detection event. Given a state ρ in a single bosonic mode with associated annihilation operator a and creation operator a † , the probability P nc to get an outcome +1 using a displacement with argument α, is given by In the case where the detector has a finite efficiency, we can model the detector inefficiency with a beam splitter having a transmission η, that is −c  † a) for η = cos 2 (ϕ), and the auxiliary mode described by c and c † , being initially empty.
where the displacement on mode c has been traced out. This means that we can model the detection inefficiency as loss operating on the measured state if the amplitude of the displacement operation is changed accordingly. Hence, the fact that we consider detectors with unit efficiencies is still a valid description of our measurement apparatus, where we do not need any assumptions on our state nor on the efficiency of our detectors.

B Genuine multipartite entanglement witness
Here we show how the calculation of the biseparable bound of our witness can be reduced to an single parameter optimization of the maximum eigenvalue of a N × N -matrix. We start with the witness operator as presented in Eq. (7) where M n≤1 is an operator in the sector with not more than one photon and E n≥2 ≥ 0 is in the sector with more than two photons. As argued in the main text, all our observables are block diagonal with respect to the total number of photons. In particular, To find the biseparable bound consider biseparable states of the form defined over a bipartition given by two disjoint subsets G 1 ∪G 2 = {1, . . . , N } of the N modes. The value that the witness takes over these states reads As µ Ψ|E n≥2 |Ψ is positive, to maximize the righthand side one can restrict the consideration to states |Ψ 1 (2) with one photon at most. Hence, without loss of generality we take To compute the value of the witness on these states we need the explicit form of the operatorsÔ n≤1 α and M n≤1 . To computeÔ n≤1 α we use the restriction of σ α to the subspace with not more than one photon where we assumed a real α and denoted see e.g. [30] for a derivation. For the product observable restricted to the subspace of interest we obtain remark that here and in the following σ αj refers to the phase-averaged operator. Adding the identity on the remaining modes gives with |0 denoting the vacuum state of all involved modes and |1 k = a † k |0 denoting the state with one photon in mode k and vacuum elsewhere, and For M n≤1 we chose with a positive real parameter λ that one can tune, such that Finally for E n≥2 we take the operators corresponding to the probability to find photons in more than one mode. We rewrite v where |1 j stands for one photon in mode j and vacuum in all the other modes. We thus have by Eq. (23) where L = cas b w sac b v . We rewrite Eq. (35) in a matrix form By arranging the entries of the matrix according to the defined bipartition, i.e. the block spanning the first |G 2 | rows and columns describes the modes in G 2 , we can explicitly write down the matrix We finally can write = max eig (M(λ, µ, α, a)) , where in the last line "max eig" denotes the maximum eigenvalue of the Hermitian matrix M (λ, µ, α, a). The biseparable bound is thus given by the optimization max eig (M(λ, µ, α, a)) (39) for well-chosen λ and µ. One notes that the optimization can be restricted to a ∈ [0, π 2 ], as the transformation (c 2 a , s 2 a , c a s a ) → (c 2 a , s 2 a , −c a s a ) only changes the sign of the off-diagonal blocks of the matrix M → 1 −1 M 1 −1 which does not change its spectrum, as 1 −1 is an orthogonal matrix (basis change).

C Measuring the witness
In this section, we explain how we estimate the expected value of the witnesŝ on the state ρ prepared in the experiment. It can be estimated from two observables, the one with displacementÔ α and one without M n≤1 − N (N − 1)Π n≥2 − µE n≥2 . In our case, given the limited number of detectors we actually use three different observables, because we split Π n≥2 in two parts. To do so we note that the probability that any N -mode state ρ contains two or more photons satisfies P n≥2 (ρ) = tr(Π n≥2 ρ) ≤ p * , where Here, P n click is the joint probability that n detectors click and all the other detectors do not click when measured without displacement and p * i is an upper bound on the probability of having two or more photons in mode i, that we associate to an observable Π (i) n≥2 . The value for p * i can be obtained in practice by measuring the probability of coincidence after a 50/50 beam splitter in mode i. On the level of the operators, we can write with two parts that we measure independently. For the witness this implieŝ It is already clear how the observablesÔ α and n≥2 can be measured, so let us now focus on the remaining terms. First, we note that because the probability that there is no photon in the state or only one photon in some mode k = i, j, given by the POVM element |0 0 |+ k =i,j |1 k 1 k |, is lower than the probability that there is no photons in the modes i and j given by |00 00| ij . We thus obtain In the experiment we do not have full information on the value of α in a particular round, but rather a range of possible values α = (α 1 , . . . , α N ) ∈ A. Therefore, the following bound will be useful To summarize we have shown that In the experiment the values of the observablesÔ, Z and Σ n≥2 are measured independently. The observable Σ n≥2 is independent of the displacement amplitudes α whileÔ α is physically determined by α. The observable Z is computed using the knowledge of the range A of possible values α ∈ A for the coefficients F ij , while the underlying physical measurement is performed without displacements.
To analyze the experimental data, we assume that the state preparation is identical in each run of the experiment, such that is the same N -mode state ρ is prepared repeatedly. On the other hand, the displacement amplitudes are subject to controlled fluctuations α ∈ A, with the possible range A determined experimentally, see Appendix E. Nevertheless, each round k can be associated to some (unknown) value α k . To prove that the state is GME it is sufficient to show that 1 n with X ρ = tr(Xρ), since each α k yields a valid GME-witness. By defining the worst-case biseparable bound and using the bound (49), we can relax the GME condition to Before analyzing the statistical significance of our data, let us briefly sketch how GME can be guaranteed in the asymptotic limit n → ∞. Then the average values of the observables Z and Σ n≥2 converge to their expected values Z and Σ n≥2 . Similarly, for the random variables o k as the value ofÔ α k observed in the round k (where it is measured), the observed average converges to the average expected value by Hoeffding's theorem (1963), see Appendix D. Hence, Eq. (54) can be directly guaranteed from the data. Note that in practice to estimate Σ n≥2 we do not measure Π (i) n≥2 for each mode. Instead, we assume that it has the same expected value for each mode, such that and only estimate Π

D Finite statistics analysis
In the experiment, three different measurements are performed, to each of which we associate a random variable. Let o k be the random variable given the value ofÔ α k observed in the round k = 1, . . . , n (when it is measured). Analogously, define z k associated to Z (for k = 1, . . . , m) and s k associated to −N 2 (N − 1)Π (1) n≥2 (for k = 1, . . . , ). Note that all the variable are independent, furthermore the variables z k and s k are also identically distributed (for each k). From the definition of the corresponding observables (their spectrum) one directly sees that from which we define For each type of observables we define the average as To analyze the statistical significance of our data, we use the following theorem by Hoeffding (1963) [31]: For any collection of independent random variables To apply to our data consider the situation where the observable o k ∈ [a, a + ∆ o ] are measured in n rounds, z k ∈ [b, b + ∆ z ] are measured in m rounds, and s k ∈ [c, c + ∆ s ] are measured in rounds, the above theorem implies Now consider any state ρ bisep that is not GME. We have shown that such a state does not violate the relaxed witness of Eq. (54). Thus, it gives rise to a collection of random variables, described in the beginning of the section, with Then, by Eq. (65) the probability that the observed averages satisfy i.e. a fake violation exceeding t is observed due to statistical fluctuation, is upper bounded by Hence,   . (69). Note that in order to obtain a p-value of less than 10 −10 , in the case of N =4 it would suffice to evaluate the observables n = m = = 4.9 × 10 5 times, corresponding to a total integration time of less than 130 s, and for N =8 it would require n = m = = 7.8 × 10 6 evaluations which could be achieved in less than 2100 s. can be interpreted as the p-value associated to our GME test. That is, p is an upper bound on the probability that a state ρ bisep which is not GME produces a fake violation of o + z + s − w max bisep or higher. For the corresponding values in the experiment, see Tab. 2.

E Experimental methods and characterization
To ensure high-purity heralded signal photons, we spectrally filter the heralding idler photons emitted from the PPKTP crystal by using a dense wavelength division multiplexer (DWDM) with a 100 GHz passband at λ i = 1546.12 nm (ITU channel 39).
Besides high single-photon purity, the single-photon and coherent states need to have a good spectral and temporal overlap in order to perform the targeted displacement operation. The measurement of the spectral overlap is shown in Fig. 5. The expected Hong-Ou-Mandel visibility due to the finite overlap of the fitted Gaussians is 99.2 % [34].
The temporal alignment between the single-photon and coherent states is done in the following way. The seed laser is switched to continuous mode and, together with the pulsed pump laser, difference frequency generation in both nonlinear crystals, PPKTP    and PPLN, is used to generate coherent states at λ s . First-order interference is observed at one of the output modes and the interference visibility is maximized by adjusting the delay of the motorized delay line.
In order to set the relative phases between singlephoton and coherent states in each spatial mode, polarization is aligned with a polarization controller such that it enters the first segment of the electronic polarization controller on-axis which therefore allows for relative phase control locally, as shown in Fig. 6. The phase is then set in each mode such that its relative phase to the reference mode is zero.
After the phase alignment, data is acquired in intervals, as described in the main text. The displacement amplitudes are obtained by measuring the coherent state in the absence of the single-photon state and assuming Poissonian count statistics (see Fig. 7). For the calculation of the expectation value of the witness (see Eqs.  Fig. 7.
To determine the balance of the generated state, the counts in each mode in the measurement without displacement are normalized, as shown in Fig. 8. For the measurement where N = 4, the mode balance is 25.0 ± 1.5%, where for N = 8 a balance of 12.5 ± 0.8% is achieved.
The values for the probabilities P n click that n de-tectors click when measuring the state are given in Tab. 3. Note that the probability p 0 indicated in Tab. 1 are p 0 = P 0 click = 1 − n≥1 P n click .

F The effect of dark counts on the witness
Here we explain how the presence of detector dark counts can be in included in our GME witness. We will show that by adding a constant term 2N 2 (N − 1)p dc to the biseparable bound, the violation of this shifted witness (exactly as described in the main text, but with detectors subject to dark counts) allows one to conclude that the measured state is GME. Before we start, recall that a usual model of dark counts for non-photon-number-resolving (NPNR) detectors is a classical noise which changes the outcome "no-click" to "click" with probability p dc . Thus, "turning on" the dark counts on a detector modifies the click/noclick probabilities to The starting point is to consider an experiment where the expected value of the witness Ŵ dc is estimated as described in the main text, but with detectors subject to dark counts. We now introduce a simple physical model that reproduces (almost) all the statistics of this experiment, but involves detectors without dark counts. To this end, consider a singlemode quantum channel S dc which does nothing with probability 1 − p dc and replaces a single mode state with a very bright Fock state |M with probability p dc The state S dc [ ] is then measured with the measurement described in the main text. We now have to distinguish between different measurements that we treat separately. (1) The measurements of the witness and (2) the estimation of p * . More precisely, we need to distinguish measurements with one detector per mode and g (2) measurements with two detectors per mode.
In the case (1) all measurements involve a single lossy NPNR detector per mode, sometimes preceded by a displacement D(α). The probability distribution of the outcomes of this measurement is a mixture of two possibilities. With probability 1 − p dc the state was unchanged and the measurement is performed on the original state ρ leading to the click/no-click probabilities (p 0 , p c ). With probability p dc the measurement is performed on the state |M , where we can always choose M large enough such that in this case a "click" outcome is observed with certainty p c = 1.
The overall outcome probabilities are thus given by Hence, the N -mode stateρ = S ⊗N dc [ρ] measured with detectors without dark-counts reproduces the statistics of the N -mode state ρ observed with detectors subject to dark counts whenever only one detector is used per mode.
Let us now consider the estimation of p * , which is an upper bound on the probability of having two or more photons in the state. As defined in Eq. (41), p * is composed of two contributions. The first term N n=2 P n click is the probability to observe clicks on more than two modes gathered with a single detector per mode. Hence, for this term the above argumentation holds P n click+dc [ρ] = P n click [ρ]. The other term p * i is an upper bound on the probability of having two or more photons in a single mode. This is measured in a g (2) experiment -a single mode state is split on a 50/50 beam splitter, and each output is sent to a NPNR detector. The probability that the two detectors click p cc is bounded by the probability that contains two or more photons p cc ≤ p * i ≤ 2p cc . And it is precisely the estimated p cc , which is used to bound p * . Let us now analyze how this probability is affected by dark counts. For a state we have p dc cc = p cc + p dc (p 0c + p c0 ) + p 2 dc p 00 = p cc + p dc (p 0c + p c0 + p 00 ) − p dc p 00 + p 2 dc p 00 = p cc + p dc (1 − p cc ) − p 00 p dc (1 − p dc ). (73) Now let us analyze the effect of the channel S dc on this probability. For a state˜ = S dc [ ] one has Hence, for the stateρ we get an upper bound where p * i is the quantity estimated in the experiment with dark counts. Combining the above arguments and using Eq. (41) it follows that p * ≤ p dc * + 2N p dc (76) with p dc * the p * estimated in the experiment with dark counts, is a valid upper bound on two (and more) photon contributions in the stateρ = S ⊗N cd [ρ].
where we used Eq. (76). Here, w max+dc bisep = w bisep + p dc * N (N − 1) is the biseparable bound estimated in the real experiment (state ρ and dark counts). We can thus conclude that observing implies that the state S ⊗N dc [ρ] is GME by our main result. Since the channel S ⊗N dc describes noise acting locally on each mode and cannot create entanglement, the N -mode state ρ is also GME. This concludes our argument showing that if the detectors used in the experiment suffer from dark counts (that we did not include in their mode), the procedure described in the main text still allows to prove the GME of the measured state, but the observed violation Ŵ dc − w max+dc bisep has to exceed 2N 2 (N − 1)p dc . Note that since Ŵ dc − w max+dc bisep scales as N 2 in general, the penalty terms accounting for dark-count scales as 2N p dc .
For the analysis of the scalability of the presented witness including detector dark counts, we consider a heralded single photon generated by a SPDC source, which state after heralding can be well approximated by Since losses on the state commute with the beam splitter one can directly apply losses on ρ to account for finite efficiency and obtain ρ η , which transforms to ρ BS η after the N -mode beam splitters. The idea in the following is to only consider the reduced density matrix to two modes σ {1,2} = tr {3,...,N } (ρ BS η ), Since we consider a perfectly balanced beam splitter, we can evaluate almost all the terms in Eq. (77) only using S ⊗2 dc [σ 1,2 ], where the only assumptions we make is that the 3-click events are always negligible over the 2-click events, which holds true for our state.
In Fig. 9, we show the maximum number of parties N max , for which the value Ŵ dc −w max+dc bisep −2N 2 (N − 1)p dc is still positive, as a function of η for a state ρ BS η . In the calculation, the displacement amplitude for the measurement is set to α = √ ln 2 ≈ 0.83 for each party, which is experimentally the most robust to fluctuations in α. In the experiment, the probability of heralding a two-photon Fock state is p ≈ 4.9 × 10 −3 and the detector with the highest dark count rate has a dark count probability per heralding event of p dc = 1.16(29) × 10 −6 , which is approximated in the plot by the line (p, p dc ) = (5 × 10 −3 , 10 −6 ). We notice that in this case for η > 0.05, we are able to detect GME for more than N = 17 parties. We further note that for higher dark count probabilities N max decreases, and in the case of p = 0 the influence of dark counts can be substantial. The witness could be improved by using a different detector model including dark counts, which we leave for further work.