Classical models may be a better explanation of the Jiuzhang 1.0 Gaussian Boson Sampler than its targeted squeezed light model

configurations of Jiuzhang 2.0, the test indicates that the experimental samples have higher ground truth probability than the samples obtained from our alternative distribution; for Jiuzhang 1.0 the test is inconclusive. Our results provide a new hypothesis that should be considered in the validation of future GBS experiments, and shed light into the need to identify proper metrics to verify quantum advantage in the context of threshold GBS. Additionally, they indicate that a classical explanation of the Jiuzhang 1.0 ex-periment, lacking any quantum features, has not been ruled out.


Introduction
One of the most exciting frontiers within the field of quantum computing is the topic of quantum advantage, which aims to design experiments that demonstrate the ability of current quantum devices to significantly outperform classical computers at a well defined computational task [3,4]. The theoretical design and real-world implementation of such experiments has been the focus of significant effort in the superconducting circuit [5,6] and quantum photonics [1,2,7] communities in the form of Random Circuit Sampling (RCS) [8,9] and Gaussian Boson Sampling (GBS) [10,11,12,13], respectively.
GBS consists in sending a set of input squeezed states into an interferometer and measuring its output using photon-number [10,11] or threshold [14] detectors. The former directly sample the photon number distribution while the later sample binary patterns of clicks that indicate whether light has been detected or not. It has been shown that sampling from the theoretical probability distribution of the resulting detection patterns, i.e., the ground truth distribution of the ideal experiment, is a computationally hard task [10,12,13]. These results have important caveats, for example, the known proofs of hardness for GBS require the photon number density (the average number of photons per mode) to be small so that the probability of two or more photons being measured in the same detector is very small. On the algorithmic side, the best known methods to classically simulate these quantum sampling problems scale exponentially in the number of detected photons or counted clicks [15,16,17,18,19,20].
Verifying that quantum samplers are operating correctly remains an active area of research (cf. the re-view paper by Hangleiter and Eisert [4] and references therein). For RCS it was identified early on that an estimate of the cross entropy between the samples generated by the physical device and the probability distribution associated with the ideal computation served as a witness of quantum advantage [8,9]. For GBS the situation is more complex, as the development of a proper figure of merit that allows to readily verify a claim of quantum computational advantage is still an open challenge [4,12]. On this account, the validation of GBS usually relies on a series of tests that rule out possible classical hypotheses or that compare the quality of the samples generated by the quantum machine against samples generated by classically efficient methods.
Two recent landmark threshold GBS experiments by Zhong et al. using 100-mode [1] and 144-mode [2] interferometers (named Jiuzhang 1.0 and 2.0, respectively) claim to achieve quantum computational advantage. The authors use three different validation tests: the comparison of truncated first to fourth order correlation functions (i.e., the first to fourth order click cumulants of the probability distribution), a Bayesian test and the Heavy Output Generation (HOG) test. The first examines how well the correlations in the observed data match the correlations predicted by the squeezed state hypothesis. The second test compares how good the squeezed state hypothesis is at explaining the observed data relative to other hypotheses such as thermal states, coherent states, distinguishable squeezed states and uniform probability distributions. The third test looks at how well the samples generated by the experiment have "heavy outputs" (i.e., correspond to events with high probability) in the ideal distribution relative to samples generated by classically efficient methods. These classically efficient methods can be physically motivated as considered by Zhong et al. [1,2] but need not be [21].
These validation tests intend to build up confidence in the correct functioning of the boson sampler. However, they have several limitations and by no means represent a complete measure to readily tell if a device achieves quantum advantage. For instance, the Bayesian and HOG tests rely on the computation of probabilities of a great number of detection patterns, a task that increases its complexity exponentially with the number of detected photons (clicks) present in the patterns. Consequently, they can only be used in a small region of the photon (click) number distribution.
In this work, we propose an alternative classical hypothesis for the validation of the experiments of Zhong et al. Our hypothesis is based on the probability distribution that is obtained from using classical mixtures of coherent states (resulting in Gaussian states that we term squashed states [22,23]) as inputs of the inter-ferometers in GBS setups. These states are classical, possessing a non-negative Glauber-Sudarshan P function [24,25,26,27] and upon interacting on an interferometer generate fully separable (i.e. having zero entanglement) multimode states (by virtue of having a positive multimode Glauber-Sudarshan P function).
To compare the squeezed states and squashed states hypotheses, we investigate two of the tests (correlation functions and Bayesian test) that the authors of the experiment employed. We find that for configurations in the high photon number density regime (which correspond to the Jiuzhang 1.0 experiment and the two brightest configurations of Jiuzhang 2.0), on which most of the quantum computational advantage claim relies, the truncated correlation functions predicted by the squashed states distributions are as consistent with the experimental results as those predicted by the ground truth of the experiment given by pure squeezed states. On the other hand, the Bayesian test shows that the ground truth of the experiment is more likely to describe the experimental samples of Jiuzhang 2.0 than the squashed states hypothesis. For Jiuzhang 1.0, the test indicates that the squashed states hypothesis is more likely. This behaviour of the Bayesian test is consistent with the improvement of the light source of the Jiuzhang 2.0 experiment with respect to Jiuzhang 1.0.
These results demonstrate that, contrary to what the authors of the experiment suggested, the comparison of correlation functions is not sufficiently reliable for the verification of quantum advantage claims. Additionally, they indicate that the squashed states hypothesis presents itself as viable classical hypothesis for the validation of current threshold GBS experiments that should be weighted in future experiments.
After considering hypothesis testing, we generate samples from our squashed states hypothesis (which can be done in polynomial time and space in the number of modes/clicks [26,22,28]) and compare against the experimental samples from the Jiuzhang experiments, to see if, like the Boltzmann machine and greedy methods from Villalonga et al. [21], we can spoof the HOG test. This test compares the probabilities of the experimental samples and the samples from the squashed states with respect to the ground truth. Surprisingly, for the Jiuzhang 1.0 experiment, the test is inconclusive. For the Jiuzhang 2.0 experiment, on the other hand, we find that the squashed states samples are unable to spoof the HOG test.
Besides our analysis of the experiments of Zhong et al., there have been other works that investigate classical ways to simulate some properties of the experimental data of the Jiuzhang experiments, or that directly aim to produce classical samples that perform better than the experimental patterns at different statistical tests.
The first of them, by Drummond et al. [29], proposes a method to classically simulate grouped click probability distributions of general GBS setups. The authors use this method to simulate different grouped click distributions of the Jiuzhang 1.0 experimental data. Moreover, they propose a model that combines input squeezed thermal states and a modification of the intensity transmissivity of the interferometer, in order to take into account several experimental sources of decoherence, such as phase noise or the presence of longitudinal modes with mismatches in time or frequency. The authors find that their decoherence model has a better fit to the experimental results than the ground truth of the experiment. The second of these works, by Villalonga et al. [21], proposes a classical algorithm, based on marginal distributions of the ground truth, to generate samples that have better total variation distance and Kullback-Leibler divergence with respect to the ground truth than the experimental samples of the Jiuzhang experiments. Together with our results, these works suggest that a classical explanation of the experiments of Zhong

Ground truth and squashed states distributions
A threshold GBS experiment has three stages: preparation of K (single-mode) squeezed states, evolution in an interferometer with M ≥ K output modes, and sampling of the output statistics using threshold detectors. These detectors do not resolve the incoming number of photons, they can only indicate if light has arrived to the detector or not. Therefore, the outcomes of the measurement can be expressed as M -bit strings, patterns consisting only of zeros and ones, where one indicates that a detector has been triggered (the detector has 'clicked') and zero indicates that no light has been detected. Jiuzhang 1.0 (2.0) uses 25 two-mode squeezed states (TMSS), corresponding to 50 squeezed states, as inputs to a 100-mode (144-mode) interferometer for the first (second) setup. According to Zhong et al. [2], the light source of Jiuzhang 2.0 is greatly improved with respect to that of Jiuzhang 1.0, in the sense that its quantum state is closer to the theoretical model (an affirmation that is supported by recent theoretical studies [30]), and allows the implementation of seven different configurations by varying its power and focus waist.
The specification of the different experimental configurations, along with the theoretical photon number density, ν =N /M (the ratio between the mean photon numberN and the number of output modes), mean number of clicks,C, and standard deviation of the number of clicks, σ(C), of the Jiuzhang 1.0 and Jiuzhang 2.0 experiments are shown in Table 1. The values ofC and σ(C) were computed using the methods of Refs. [13,28].
The ground truth of the Jiuzhang 1.0 and Jiuzhang 2.0 experiments is described as 25 input two-mode squeezed states that interfere in a linear lossy interferometer. The output state is then measured using ideal threshold detectors. The interferometer in this model contains the information about the losses in the real experiment (collection efficiency, propagation loss, and finite detector efficiency).
Two-mode squeezed states are mathematically defined as where |0 k , 0 l ⟩ is the vacuum state of modes k and l, a k , a l and a † k , a † l are the annihilation and creation operators of these modes, and ζ = re iϕ is a complex squeezing parameter. The annihilation and creation operators of the modes satisfy the usual bosonic canonical commutation relations [a i , a j ] = [a † i , a † j ] = 0 and [a i , a † j ] = δ i,j . The state |ζ kl ⟩ can also be expressed in the form [31,32] where U BS = exp[ π 4 (a † l a k − a † k a l )] is a unitary transformation representing the action of a 50 : 50 real beamsplitter 1 , and S k (ζ) = exp[− 1 2 (ζa †2 k − ζ * a 2 k )] is the single-mode squeezing operator of mode k. This expression indicates that a TMSS can be physically generated by interfering two single-mode squeezed states (SMSS) compressed along orthogonal directions into a 50 : 50 real beamsplitter.
For Jiuzhang 1.0 and Jiuzhang 2.0, the squeezing parameters of the input states ζ i = r i , i = 1, · · · , 25, are considered to be real and positive. The information about the phases ϕ i is absorbed into the rectangular sub-unitary matrix T that describes the action of the lossy interferometer. As shown in Eqs. (8) and (9), this matrix allows the calculation of the output state for Gaussian input states. The size of T is M × 50, where M is the number of output modes. Data about the squeezing parameters and the matrix T is available in [33] for Jiuzhang 1.0, and in [34] for Jiuzhang 2.0. 1 We use the term real beamsplitter to indicate that the coefficients relating a l (a † l ) and a k (a † k ) in U BS are all real.     In order to verify the results of the Jiuzhang experiments, we need to determine the theoretical probability distribution of the experimental samples, which is commonly known as the ground truth distribution of the experiment. The specification of this distribution will also allow us to motivate the definition of the probability distribution of the squashed states.
The ground truth distributions of the Jiuzhang exper-iments are completely characterized by the output state of the interferometer. Since the input states are Gaussian, and the interferometer is linear, the output state is also Gaussian. Gaussian states of K modes are characterized by their vector of first moments and their covariance matrix. If we write the annihilation operators as a k = (x k +ip k )/ √ 2ℏ, in terms of the hermitian in-(out-) phase quadratures x k (p k ), we can define the vector r T = (x 1 , x 2 , · · · , x K , p 1 , p 2 , · · · , p K ). Then, assuming a vanishing vector of first moments r = Tr(ρ G r) = 0, (as is the case for the Jiuzhang experiments) the covariance matrix can be written as σ = 1 2 Tr ρ G {r, r T } . Here, ρ G denotes the density matrix of the Gaussian state and {A, B} = AB + BA stands for the anticommutator of operators A and B. On this account, the ground truth distribution of the Jiuzhang experiments will be completely specified by the covariance matrix of the output states. In what follows, we will explain how to compute this matrix.
We first specify the covariance matrix of the input states. To do this, we follow the procedure indicated by Eq. 2: we generate 25 TMSS by interfering 50 pairs of SMSS with squeezing parameters {−r 1 , r 1 , . . . , −r 25 , r 25 } into 25 beamsplitters that act on consecutive modes. The covariance matrix of one SMSS with squeezing parameter r is Note that in this covariance matrix the variances of the two quadratures are inversely related, thus saturating the uncertainty relation [31].
We can now construct the 100×100 covariance matrix of the 50 SMSS (in the ordering indicated by r T ) as where diag(v 1 , . . . , v m ) forms a diagonal matrix of size m with entries v 1 , . . . , v m . A 50 : 50 real beamsplitter interfering two adjacent modes (see Fig. 1a) is represented by the matrix The corresponding matrix for the 25 beamsplitters has the form The covariance matrix of the input TMSS can then be computed as Given a K-mode input Gaussian state with covariance matrix σ IN , an interferometer channel with transmission matrix T maps it to an M -mode Gaussian state with covariance matrix σ OUT given by [31] where The complex transmission matrix in the last equation is generally rectangular and of dimensions M × K. With this notation in mind, we can then simply write the covariance matrix of the squeezed ground truth hypothesis as σ (SQUE) = L T (σ TMSS ). Having the covariance matrix σ of a given Gaussian state we can calculate click-probabilities as [14] Pr Here, Tor(A) is the Torontonian of matrix A (see Appendix A for a definition). The operation A (s) is explained as follows: suppose that the detection pattern s contains C clicks (C ones) observed in the modes {j 1 , . . . , j C }, then matrix A (s) is obtained by keeping only the rows and columns To obtain probabilities associated with the ground truth we simply let σ → σ (SQUE) in the last equation. Note that the results presented here can be extended to Gaussian states with displacements [35] using loop Torontonians [36].
We now turn to the definition of the squashed states distribution. When losses are incorporated in the input states, squeezed states become squeezed thermal states [22]. These states still have a quadrature with noise lower than that of the vacuum state. We define the squashed states as squeezed thermal states with vacuum fluctuations in one quadrature and larger fluctuations in the other (in Appendix C we list some properties of these states and make a comparison of their covariance matrix with that of squeezed thermal states). The output states obtained by interfering squashed states are classical, which means that we can efficiently sample from their probability distribution using classical computers. As can be seen in Fig. 1b, the noise ellipse of the squashed states suggests that they are better approximations to the squeezed states than the classical states that Zhong et al. used for the validation of the experimental results, which have circular noise ellipses. Indeed, squashed states are the classical Gaussian states with the highest fidelity to squeezed thermal states [22].
The definition of the probability distribution associated with the squashed states is now straightforward: we need to apply the interferometer channel with transmission T to 25 two-mode squashed states. In turn, the input covariance matrix to the interferometer is obtained by replacing σ TMSS for the input 25 two-mode squashed states covariance matrix, σ ′ 0 . This matrix is obtained in the same way we constructed σ TMSS : we interfere 50 single-mode squashed states into 25 real 50 : 50 beamsplitters. The covariance matrix of a single-mode squashed state with mean photon number n = sinh 2 r is This covariance matrix has no squeezing as one of the quadratures is at the vacuum level while the other has excess (classical) noise proportional ton, thus they can be described as classical mixtures of coherent states as shown in Ref. [23]. Notice that the matrix above can be obtained by replacing e −2r and e 2r in Eq. (3) by 1 and 1 + 4n, respectively. This same procedure can be used to construct the covariance matrix of the 50 singlemode squashed states, σ 0 : we replace the e −2r k and e 2r k terms in Eq. (4) by 1 and 1 + 4n k , withn k = sinh 2 r k , respectively. Then, σ ′ 0 can be written as We can write the covariance matrix associated with the squashed states hypothesis as σ (SQUA) = L T (σ ′ 0 ) . The covariance matrix of this state satisfies σ (SQUA) ≥ ℏ 2 I 2M and thus the Gaussian state associated with it can be written as a mixture of products of single-mode coherent states [23,27,26], implying that this Gaussian state is separable across any partition of its modes. Finally, having the covariance matrix of the state, we can obtain probabilities associated with it by letting σ → σ (SQUA) in Eq. (10).
It is worth mentioning that, as was proposed in Ref. [29], the transmission matrix of the interferometer, as well as the mean photon numbers of the input squashed states, can be modified in order to obtain a model that takes into account other sources of experimental decoherence (not only losses in the input squeezed states). However, we choose the squashed states to have the same mean photon numbers as the squeezed states used in the definition of the covariance matrix in Eq. (7) so as to have a photon number distribution with lower total variation distance to the distribution predicted by the ground truth. Moreover, we maintain an unmodified transmission matrix in order to propose a classical model with the closest resemblance to the ground truth hypothesis presented by the authors of the experiment.

Validation tests
The results of the Jiuzhang experiments have been validated against a number of hypotheses and adversaries. These validations generally made use of three different tests: a Bayesian test, the Heavy Output Generation (HOG) test, and the comparison of the click cumulants of the different possible distributions with those of the experimental samples.
In this section, we will compare the squashed states hypothesis and the squeezed ground truth hypothesis and we will determine which of them is better supported by the experimental data available (in the context of statistical hypothesis testing, this would tell us which hypothesis is the better explanation of the experiment). Moreover, we will use samples generated from the squashed states hypothesis to perform the HOG test 2 .

Click cumulants of the distribution
The comparison of click cumulants was first used in the validation of the Jiuzhang 2.0 experiment [2]. The authors used this method to investigate how robust the experimental samples are against classical simulation schemes based on marginal distributions. They claim that the presence of "non-trivial genuine high-order correlation in the GBS samples are evidence of robustness against possible classical simulation schemes" [2]. These correlation functions are given by cumulants [37,38,39] (also called Ursell [40] functions, truncated correlation functions [41] or cluster functions [41]) defined in terms of moments of a multidimensional random variable X = (X 1 , X 2 , . . . , X M ) as where π runs through the list of all partitions of {1, ..., n}, B runs through the list of all blocks of the partition π, and |π| is the number of parts in the partition. Note that the first order cumulants are simply the means κ(X i ) = ⟨X i ⟩ and that the second order cumulants are the covariances κ( Since the probability distribution associated with a threshold detector experiment has binary outcomes 0, 1 in each mode, it is straightforward to see that moments of the distribution correspond to marginal probabilities The latter in turn can be computed by constructing the marginal covariance matrix of the modes i 1 , . . . , i n and using Eq. (10). Following Ref. [2], we compute the cumulants up to fourth order of the squashed states distribution of the different setups of the Jiuzhang experiment and compare them with those of the experimental samples. We also make the corresponding comparison with the ground truth distributions. For these computations, we used 10 7 of the ∼ 5 × 10 7 samples available in Ref. [33]. Note that the number of cumulants increases sharply with the order; for a system with M modes there are M !/(ℓ!(M − ℓ)!) possible combinations of ℓ different modes (without repetitions), which is precisely the number of cumulants of order ℓ. Considering this fact, we randomly select 10 5 sets of three and four different modes for the computation of the third and fourth order cumulants, respectively. We compute all the possible cumulants of first and second order. Fig. 2 shows the results for Jiuzhang 1.0 (ν = 0.786) and for two configurations of Jiuzhang 2.0 (ν = 0.044 and ν = 0.975). Fig. 3 shows the Pearson and Spearman correlation coefficients between experimental and theoretical cumulants as functions of ν (for all the configurations of Jiuzhang 1.0/2.0). The corresponding pvalues, computed using the function cor.test from the R stats package [42], are lower than machine_epsilon= 2.2 × 10 −16 for every configuration, indicating a negligible probability of obtaining these correlation coefficients using uncorrelated data sets (i.e. a negligible probability of obtaining these correlation coefficients by chance). The same computation using the pearsonr and spearmanr functions from the SciPy stats (scipy.stats) module [43], results in p-values that are effectively zero for all ν.
It is clear that for low densities, as exemplified by the ν = 0.044 configuration in Fig. 2, the cumulants predicted by the squashed states distribution are not consistent with the experimental results. In this case we can readily admit the ground truth of the experiment as the better hypothesis. This observation also holds for configurations with photon number densities between ν = 0.055 and ν = 0.218 as seen by looking at the Pearson and Spearman correlation coefficients in Fig. 3. However, we note that by increasing ν, the cumulants of the squashed states distribution become progressively consistent with the experimental samples. Indeed, for setups with ν = 0.786 and ν = 0.975 (also for ν = 0.442), the experimental results are as consistent with the squashed states distribution as they are with the ground truth distribution of the experiment.
In these cases, the comparison of cumulants does not allow to directly determine which of these two distributions better describes the experiment. This indicates that, for configurations with high photon number density, the presence of cumulants (up to fourth order) in the experimental samples can be efficiently reproduced by a classical hypothesis, which implies that the idea of having these correlations in the data does not necessarily shield the experiment from classical simulations. Moreover, this observation suggests that the comparison of click correlations between modes is not a completely reliable metric for the validation of GBS experiments.
This result is of particular importance even if it does not hold for all ν. From Table 1 (and also from Fig. 6 in Appendix B) it can be inferred that configurations with low ν have click number distributions that are mostly located below 40 clicks, that is, these configurations mostly generate samples with less than 40 clicks. This upper limit in the number of clicks makes the simulation of these setups a feasible task for a classical computer [14,44,15]. It is the configurations of the Jiuzhang experiments with high photon number density that become increasingly difficult to simulate. The comparison of click cumulants shown here indicates that a simulation scheme using the squashed states hypothesis efficiently reproduces the results of the experiment in this regime.
Although, in principle, the comparison between theoretical and experimental cumulants could be carried out up to an arbitrary order, there is an important limitation that does not allow the direct comparison of correlations of fifth order or higher. As the authors of the experiment already noted (see Supplemental Material of Ref. [2]), experimental correlations of order equal or higher than five become increasingly difficult to distinguish from statistical noise. This is mostly due to computing cumulants with an insufficient number of experimental samples. However, the number of samples required to distinguish correlations from noise sharply increases with the order, which in turn, implies an increase of the time and memory needed for the computation of cumulants.
On account of this limitation, the authors of the experiment do not directly compare theoretical and experimental correlations of higher order. Instead, they use a statistical analysis to estimate the p-values of the Pearson correlation coefficients (between ground truth and experimental cumulants) up to 7th-order. Then, they extrapolate the estimated seed curve to compute p-values for higher orders. This allows the authors to claim that the p-values have statistical significance (i.e. p < 0.05) up to order (19 ± 1), which implies the presence of 19th-order correlations in the experimental data  (details of this statistical analysis can be found in the Supplemental Material of Ref. [2]). We are not convinced that this procedure allows to truly verify if theoretical and experimental correlations are compatible for higher orders. Consequently, we do not carry out the same statistical analysis using the squashed states distribution. We explain the compatibility between the squashed states and experimental cumulants for high photonnumber density ν by noticing that the Jiuzhang experiments use input states with different squeezing parameters and threshold detectors. For a general GBS setup, it can be shown [13] that the expected mean number of clicks,C, when using threshold detectors follows the relation whereN is the expected mean number of photons and M is the number of output modes. In terms of the photon number density ν =N /M we havē As can be seen from the equation above,N ∼C for ν ≪ 1. This suggests that in this regime we can interpret each click detected as corresponding to approximately one single photon. On the other hand, for increasing ν,N >C, implying that a single click corresponds to more than one photon (there is an increasing number of collisions). These observations suggest that in the low ν regime we can interpret a measurement using threshold detectors as an approximate photon number resolving (PNR) measurement. For increasing ν we can no longer state that the use of threshold detectors is similar to the use of PNR detectors. When the squeezing parameters of the input states are all the same, a PNR measurement can readily distinguish between squeezed and squashed states (this can be checked by comparing their second order cumulants, as has been done in a recent experiment by Madsen et al. [7]). This does not necessarily hold when the squeezing parameters are different, ν is high or threshold detectors are used.

Bayesian test
The Bayesian test that was used in the validation of the Jiuzhang 1.0/2.0 experiments compares two differ-  ent hypotheses regarding the true probability distribution of the experimental samples [45]. This test gives the degree of confidence of one hypothesis over another. In this case, we compute the degree of confidence of the ground truth of the experiment over the squashed states hypothesis.
As was mentioned in Sec. 1, the Bayesian test relies on the computation of probabilities of individual samples, which is a computationally hard task for patterns with a high number of clicks. This means that the test cannot be used to validate the entirety of the experimental data and, consequently, does not represent a complete measure to readily verify if a GBS experiment achieves quantum computational advantage. Rather, the Bayesian test is used to build up confidence in the correct functioning of the GBS setup by ruling out possible classical hypotheses explaining the samples.
We will compute the test for samples with fixed click numbers (following Refs. [1,2,21]) between 5 and 26. This upper limit in the number of clicks takes into account that the computation of the corresponding probabilities requires quadruple precision of complex type numbers, thus increasing the computational cost. Indeed, obtaining the probability for a single pattern of 25 clicks takes ∼ 3.5 hours using a 64-core CPU with two AMD Rome 7532 processors with 2.4 GHz clock speed, using a custom implementation of the Torontonian function (see Ref. [46]) which uses Quadrupleprecision (128 bits for each real number) provided by the library DoubleFloats.jl [47] in the Julia Programming language [48].
Consider a set S = {s 1 , . . . , s L } of L experimental samples, each of them containing C clicks. The proba-bility of obtaining one of these samples, given that it has C clicks, under the hypothesis HYP ∈ {SQUA, SQUE} is given by We define the Bayesian ratio, r B (C), which can be interpreted as the probability assigned to the ground truth hypothesis for a given number of clicks, as where χ B (C) = Pr (SQUA) (S|C)/Pr (SQUE) (S|C). The Bayesian test consists in checking the convergence of r B (C) when the number of samples is increased: if r B (C) → 1 for any C, we conclude that the ground truth hypothesis is more likely to describe the experimental samples. Conversely, if r B (C) → 0 for any C, the squashed states hypothesis becomes more likely. An alternative way to express this test is obtained by writing χ B = exp(L∆H(C)), where ln Pr (SQUE) (s k |C) The quantities H (SQUE) and H (SQUA) are estimators of the cross-entropy, for a given number of counts, of the ground truth and squashed states distributions relative to the real probability distribution of the experimental samples. In terms of the cross-entropy difference ∆H(C), r B (C) = [1 + exp(L∆H(C))] −1 and, for a increasing number of samples, the condition r B (C) → 1 is equivalent to ∆H(C) < 0, while r B (C) → 0 is equivalent to ∆H(C) > 0.
An important step for the computation of the Bayesian test is the determination of the grouped click probability distributions Pr (SQUE) (C) and Pr (SQUA) (C). To do this, we use the simulation method introduced in Ref. [29]. The definition of this method, as well as the resulting click probability distributions and the parameters used in the simulation, are shown in Appendix B. It is worth mentioning that this method allows the determination of Pr (SQUE) (C) and Pr (SQUA) (C) for all click numbers C, not only those for which the probabilities of individual samples are easily computed. Fig. 4 shows the results of the Bayesian test for the Jiuzhang 1.0 and Jiuzhang 2.0 experiments in terms of cross-entropy differences. For configurations with ν ≤ 0.218, we considered experimental samples with click numbers between 5 and 20. In these cases, we used 4000 samples for each C. For the ν = 0.786 and ν = 0.442 setups, we considered click numbers between 21 and 26 due to the lack of sufficient samples (less than 1000 for each click number) with C lower than 20. For the different configurations of the Jiuzhang 2.0 experiment, including the second brightest configuration (ν = 0.442), the Bayesian test consistently indicates that the ground truth is more likely to explain the experimental samples than the squashed states hypothesis. However, as was the case with the click cumulants comparison, by increasing ν, the values of ∆H(C) in the range of the click numbers considered are lower than zero by a progressively smaller margin, indicating that the degree of confidence in the squashed states hypothesis increases. For the ν = 0.786 configuration of Jiuzhang 1.0, the Bayesian test indicates that the squashed states hypothesis is more likely than the ground truth of the experiment.
The discrepancy between the results for the configurations ν = 0.442 and ν = 0.786, both belonging to the high photon number density regime, may be explained by the refinement of the light source of the Jiuzhang 2.0 experiment with respect to Jiuzhang 1.0. Indeed, a better preparation of the input two-mode squeezed states directly challenges the idea that the input states of the interferometers are better described by squashed states, which, as was mentioned earlier, are approximations to squeezed states with losses in their preparation.

HOG test
The Heavy Output Generation (HOG) test was first introduced in Ref. [1] for the validation of the Jiuzhang 1.0 experiment. This test compares the probabilities with respect to the ground truth distribution of two sets of samples: one corresponding to experimental samples, and the other corresponding to samples from an alternative distribution, which in our case corresponds to the squashed states distribution. This test verifies if a sampler using the alternative distribution can generate samples with higher ground truth probability than the experimental samples.
By relying on the computation of probabilities of individual samples, the HOG test has the same computational limitations as the Bayesian test. Namely, it can only be calculated for patterns with a small number of clicks. This means that the HOG test cannot be used to completely verify a GBS quantum advantage claim; it can only be used to rule out possible classical samplers and increase the confidence in the correct functioning of the experiment.
The definition of the HOG test is made in a similar fashion to that of the Bayesian test. We define the HOG ratio, r HOG (C), as where χ HOG (C) = Pr (SQUE) (S ′ |C)/Pr (SQUE) (S|C) and S ′ = {s ′ 1 , . . . , s ′ L } is a set of L samples obtained from the squashed states distribution (each one with  C clicks). Pr (SQUE) (S|C) is computed according to Eq. (19). We emphasize that in the last equation all the probabilities are taken with respect to the squeezing distribution (SQUE). The test consists in checking the convergence of r HOG (C) for an increasing number of samples: r HOG (C) → 1 if we generally find that the experimental samples have higher ground truth probability for any C, while r HOG (C) → 0 when the samples from the squashed states distribution have higher ground truth probability for any C.
In Fig. 5 we show the results for the HOG test, in terms of ∆E(C), for the Jiuzhang 1.0 and Jiuzhang 2.0 experiments. The parameters used in the calculation of the grouped click probabilities were the same as those used in the computation of the Bayesian test (see Appendix B). The ranges of click numbers considered here, as well as the corresponding L, were also the same as in the case of the Bayesian test. The results for the ν = 0.975 configuration of Jiuzhang 2.0 are not shown due to the lack of experimental samples in the range of click numbers considered.
As can be seen in Fig. 5, for the Jiuzhang 1.0 experiment, we get inconclusive results, as ∆E(C) > 0 for C = 22 but has a negative sign for the remaining click numbers (for the test to be conclusive ∆E(C) must have the same sign for all the C considered). For all the configurations of Jiuzhang 2.0, we consistently find that the experimental samples have higher ground truth probability than the squashed states samples, even for setups in the high photon number density regime, in agreement with the results of the Bayesian test. Additionally, contrary to the case of the Bayesian test, the cross-entropy difference ∆E(C) does not differ from zero by a progressively smaller margin with increasing ν.
It is important to mention that although the present model cannot generate samples that outperform the experimental results at the HOG test, a modified version of the squashed states hypothesis, which includes corrections to the input mean photon numbers and the transmission matrix of the interferometer (as proposed in Ref. [29]), might be able to generate classical samples with higher ground truth probability than the experimental samples. Additionally, this modified model might also lead to more accurate results for the Bayesian test and the comparison of click cumulants.

Discussion
In this work we proposed an alternative hypothesis for the validation of the Jiuzhang GBS experiments. This hypothesis is based on the probability distribution of classical mixtures of coherent states that we call squashed states. For the validation of the experimental results against this alternative hypothesis, we used the same methods as the authors of the experiments.
The results for the click cumulants comparison show that the theoretical cumulants predicted by the squashed states distribution are not consistent with the experimental results for setups with low photon number density, ν. However, it is noticeable that with increasing ν the squashed states cumulants become progressively consistent with those obtained from the experimental samples. This trend is confirmed by the results for the configurations with high photon number density, for which the theoretical cumulants predicted by the squashed states distribution are as compatible with the experimental results as those predicted by the ground truth of the experiment. These results suggest that the presence of cumulants (up to fourth order) in the experimental samples can be efficiently reproduced by classical hypotheses lacking any quantum correlation.
The results of the Bayesian test show differences with respect to those of the click cumulants comparison. By analyzing the cross-entropy difference ∆H(C) as a function of the number of clicks, we consistently find that, for the different configurations of the Jiuzhang 2.0 experiment, ∆H(C) < 0, indicating that the ground truth of the experiment is more likely to describe the experimental samples than the squashed states hypothesis. However, we note the tendency of ∆H(C) to approach zero with increasing photon number density, suggesting an increasing degree of confidence in squashed states distribution with respect to the ground truth. On the other hand, for the Jiuzhang 1.0 experiment, we find that ∆H(C) > 0, which indicates that the squashed states hypothesis is a better explanation of the experimental samples than the ground truth of the experiment. The contrast between the results of the Bayesian test for the configurations with ν = 0.442 and ν = 0.786, both being configurations with high photon number density, may be explained by the improvement of the light source of the Jiuzhang 2.0 relative to Jiuzhang 1.0.
We generated samples from the squashed states distribution and compared them against the experimental samples at the task of heavy output generation (HOG) from the squeezed states distribution. In this case we found that, except for a subset of the data in Jiuzhang 1.0, the experiments perform better at the HOG task than the samples from the squashed states hypothesis.
Our results provide a more nuanced picture of the different tools employed in arguing about quantum advantage in threshold GBS experiments. On one hand, we found that the presence of correlations up to fourth order in the experimental samples from Jiuzhang 1.0 and 2.0 in the high photon number density regime can be efficiently reproduced by states lacking any quantum features. On the other hand, we observe that, for some of the setups in this high ν regime, the squashed states hypothesis is not the better hypothesis for the explanation of the experimental samples. This result indicates that the comparison of correlation functions is not entirely reliable for the validation of quantum advantage claims in threshold GBS experiments.
Moreover, the samples generated using the squashed states hypothesis cannot spoof the HOG test when compared to the samples of the experiment, even for the configuration for which the Bayesian test indicates that the squashed states hypothesis is more likely to explain the experimental data (namely, for the Jiuzhang 1.0 experiment). Note, however, that other efficient classical methods have already outperformed the Jiuzhang 1.0 and 2.0 at this task [21]. Nevertheless, in absence of rigorous results about the significance of the HOG test for GBS (as opposed to RCS), the inconclusive result for Jiuzhang 1.0 must be taken with a grain of salt, especially when alternative hypotheses are more probable than the ground truth.
Our work provides a new adversary that should be considered against future GBS experiments and, perhaps more importantly, further motivates the need to identify proper metrics and optimal classical adversaries for quantum advantage in the context of threshold GBS.

A The Torontonian
As discussed in Sec. 2, the ground truth and squashed states distributions depend on the Torontonian [14] of a matrix constructed from the corresponding covariance matrices of the output states of the interferometer. For a 2N × 2N matrix A, the Torontonian, Tor(A), is computed as (Z) . (24) Here, Ω([N ]) is the power set, i.e., the set of all subsets, of [N ] = {1, 2, . . . , N }; |Z| stands for the cardinality of set Z, that is, the number of elements in Z; and I is the 2M × 2M identity matrix. If we write Z = {z 1 , . . . , z n }, with n = |Z|, then A (Z) is the matrix obtained from removing the rows and columns {z 1 , . . . , z n } and {z 1 + N, . . . , z n + N } from matrix A.

B Simulation of grouped click probabilities
In this appendix we describe the positive P -distribution simulation method [29] used in Secs. 3.2 and 3.3 for the computation of the ground truth and squashed states grouped click probability distributions.
Suppose that the input state of the interferometer in the Jiuzhang experiments is represented by the density operator ρ in . This state can be expanded in terms of its positive P -representation [25], P (α, β), which is a phase space representation over a subspace of the complex plane. The expansion reads where |α⟩ = |α 1 , . . . , α K ⟩ and |β⟩ = |β 1 , . . . , β K ⟩ are Kmode coherent states, with K being the number of input states of the interferometer. dµ(α, β) corresponds to an integration measure over the 2K-dimensional complex space of the amplitudes α = (α 1 , . . . , α K ) and β = (β 1 , . . . , β K ). The state of the system after the interferometer, ρ out , has a similar representation in terms of the same Pdistribution, we need only modify the multimode coherent states in Eq. (25): Here,ᾱ = T α andβ = T * β, where T is the M × K transfer matrix representing the action of the interferometer, where M is the number of output modes. Notice that |ᾱ⟩ and |β⟩ become now M -mode coherent states.
With this representation of the output state, we can compute the probability of obtaining a detection pattern s = (s 1 , . . . , s M ), with s i = 0, 1, (when the system is measured using threshold detectors) as Π s is the operator representing the measurement of pattern s, which can be expressed as where |0 k ⟩ is the vacuum state in the Hilbert state of mode k and I (k) the corresponding identity operator. The set C(s) is composed by the modes where a click is detected when the result of the measurement is pattern s. Conversely, the elements of set V(s) are the modes where no click was detected. After a direct computation of the trace, it can be shown that where Now, suppose that D(C) is the set of all detection patterns s with C clicks. Then, the probability of obtaining C clicks, Pr(C) = s∈D(C) Pr(s), takes the form By defining the function where θ l = 2πl/(M + 1), we can write and thus, the probability of obtaining C clicks can be simply written as The previous expression does not depend on the number elements of D(C), which is generally large for a large number of modes. This fact makes the computation of Pr(C) a feasible task for a classical computer. Notice that obtaining Pr(C) is equivalent to the computation of a sort of expected value of the func-tionG(ᾱ,β; C) with respect to distribution P (α, β). Thus, we can estimate Pr(C) numerically by generating N pairs of random vectors α k = (α as P (α, β); then, computing the correspondingG(ᾱ k ,β k ; C) functions; and finally, taking the real part of the average 1 N N k=1G (ᾱ k ,β k ; C). The random variables α k and β k will depend only on the initial state in consideration.
For an input state of K single-mode squeezed states (SMSS), we can generate the samples α k and β k in the following way [29]. First, we calculate the mean photon numbersn j = ⟨a † j a j ⟩ = sinh 2 (r j ) and coherences m j = ⟨a 2 j ⟩ = 1 2 sinh(2r j ) of the K input states. Here a † j and a j stand for the creation and annihilation operators of mode j and r j is the squeezing parameter corresponding to that same mode. We then generate N pairs of real Gaussian random vectors u k = (u . The components of these vectors must satisfy the relations E(u indicates the expectation value with respect to the Gaussian distribution. Finally, with this definition of u k and v k , we can write the components of α k and β k as To take into account that the input states of the Jiuzhang experiments are two-mode squeezed states (TMSS), we modify the transformation matrix that represents the action of the interferometer. As was discussed in Sec. 2, the input TMSS can be generated by sending SMSS into 50 : 50 beamsplitters, whose action is represented by a K×K matrix B. Following this idea, we apply the transformation induced by the beamsplitters before applying the transformation of the interferometer over the amplitudes α k and β k of the SMSS. Then, we can write the transformed amplitudesᾱ k and This procedure can be readily modified for the case of input squashed states, we need only notice that for these statesn k =m k = sinh 2 (r k ).
To assign an uncertainty to the estimation of Pr(C), we can sort the N pairs of vectors α k and β k in G groups, each one with M = N /G elements. We then compute Pr G (C) = 1 M M k=1G (ᾱ k ,β k ; C) for each group. Notice that the average of the Pr G (C) over all groups is equal to Pr(C). The uncertainty of the estimation of Pr(C) corresponds to the standard deviation of the Pr G (C) over all groups.
For the estimation of the ground truth and squashed states grouped click probability distributions, Pr (SQUE) (C) and Pr (SQUA) (C), we used N = 10 8 samples α k and β k . For the computation of the uncertainty, we sorted these samples in G = 100 groups. In Fig. 6 we show the obtained distributions for the different setups of the Jiuzhang experiments. In Table 2 we show the estimated mean number of clicks,C, and standard deviation σ(C) of these distributions using the grouped probabilities. They are in excellent agreement with the quantities reported in Table 1 obtained from analytical results.
The functions used for the computation of the grouped click probabilities are included in the library thewalrus [28] from version 0.20.0 onward.

C Some properties of squashed states
In this appendix we list some of the most important properties of the squashed states.
As was mentioned in Sec. 2, when losses are incorporated in the input states, squeezed states become squeezed thermal states. The covariance matrix of a single mode squeezed thermal state can be written in the form [22] σ (SQTH) = ℏ 2 1 + η(e −2r − 1) 0 0 1 + η(e 2r + 1) , (36) where r is the squeezing parameter, and η > 0 is a transmission rate parameter representing the losses. Notice that 1+η(e −2r −1) < 1 for all r, while 1+η(e 2r −1) > 1. This indicates that squeezed thermal states still have a quadrature with lower noise than the vacuum. Moreover, it implies that σ (SQTH) ≱ ℏ 2 I 2 , so these Gaussian states cannot be written as a classical mixture of coherence states (i.e. they are quantum Gaussian states).
Squashed states, on the other hand, are defined to have the variance of one of their quadratures equal to those of the vacuum, while the remaining quadrature has excess noise proportional to the mean number of photonsn of the state. The covariance matrix of a single mode squashed state can be written as Notice that σ (SQUA) ≥ ℏ 2 I 2 , making these states classical mixtures of coherent states (i.e. they are classical     Gaussian states). It is possible to show [22] that the squashed states are the classical Gaussian states with the closest fidelity to squeezed thermal states, making them great candidates to model the effect of losses in the input states of GBS experiments.
Using the methods of Refs. [49,50], it can be shown that the photon number distribution of a Gaussian state with covariance matrix σ (SQUA) is given by